How schema markup fits into AI search — without the hype
The Evolution of Search: From Keywords to Entities

For over two decades, search engine optimization was largely a game of keywords, backlink profiles, and technical site performance. However, the rise of Large Language Models (LLMs) and generative AI has fundamentally altered the landscape. We are moving away from a world of “blue links” and toward a world of “entities.”

Search is shifting from surfacing a SERP (Search Engine Results Page) with simple links to AI Overviews, generative answers, and chat-style summaries. These systems do more than just find a page that contains a keyword; they collate content, summarize information, and provide direct answers. To get your content to appear in this new model, your site must be understood as a collection of entities (singular, unique things or concepts, such as a person, place, or event) and the specific relationships between them.

Schema markup, or structured data, is one of the few tools SEO professionals have to make those entities and relationships explicit. It serves as a bridge between the messy, unstructured prose of a human-readable webpage and the rigid, data-driven needs of an AI system.

But does schema markup really benefit AI search optimization? Some claim it can triple your citations or dramatically boost visibility. In reality, the evidence is more nuanced. Let’s separate what is known from what is assumed and look at how schema actually fits into a modern AI search strategy.

How Schema Fits Into AI Search Now

In the era of generative AI, systems like Google’s Gemini and Microsoft’s Copilot do not just “read” your website like a human would. They process data to build a knowledge graph. For an AI to accurately represent your brand or answer a query using your data, three elements matter the most:

1. Entity Definition

An AI needs to know exactly what is on a page. Is the page about a specific product, a professional service, a person, or a news event? Schema allows you to define these entities clearly.
By using specific types like Product, Service, or Organization, you remove much of the guesswork for the LLM. It no longer has to infer the subject matter; you have explicitly declared it.

2. Attribute Clarity

Once the entity is identified, the AI needs to know its properties. For a product, this includes the price, currency, availability, and user ratings. For an author, it includes their job title and area of expertise. Schema markup provides a standardized format for these attributes, so that a system extracting a price or a rating can read the declared value instead of inferring it from the surrounding prose.

3. Entity Relationships

This is perhaps the most critical component for AI search. Entities do not exist in a vacuum. An offer for a product is offeredBy an organization; an article has an author who is a person; a person worksFor a company. Using schema properties like sameAs also helps connect your site’s entities to established external sources like Wikipedia, LinkedIn, or official databases. This builds a web of trust and context that AI systems can follow.

When schema is implemented with stable identifiers (@id) and a logical structure (@graph), it starts to behave like a small internal knowledge graph. AI systems have less need to guess who you are or how your content fits together. Instead, they can follow explicit connections between your brand, your authors, and your topics.

How AI Search Platforms Use Schema

While the broader SEO community often speculates on how AI uses data, we have concrete confirmation from the two biggest players in the space. For these platforms, schema is confirmed infrastructure, not a theoretical advantage.

Google AI Overviews

In April 2025, the Google Search team explicitly stated that structured data remains essential in the AI search era. They confirmed that structured data gives an advantage in how content is interpreted and surfaced within AI Overviews.
Because Google has spent years building its Knowledge Graph, it relies heavily on schema to verify the facts it presents in its generative summaries.

Microsoft Bing Copilot

Microsoft has been equally transparent. Fabrice Canel, a principal product manager at Microsoft Bing, confirmed in March 2025 that schema markup directly helps Microsoft’s LLMs understand content for Copilot. By providing structured data, you are essentially “pre-processing” your content for Bing’s AI, making it easier for the model to cite you as a source of truth.

The “Black Box” of ChatGPT and Perplexity

The situation is different for platforms like ChatGPT and Perplexity. While these tools are rapidly becoming search engines in their own right, they haven’t publicly confirmed exactly how they use schema. We don’t yet know if they preserve schema during their web crawling process or if they use it for data extraction.

While LLMs are technically capable of reading JSON-LD (the format used for schema), it remains unclear if their search indices prioritize it. For now, optimizing for these platforms requires a focus on clear, authoritative prose, with schema serving as a secondary supporting layer.

Analyzing Research on Schema and AI

To understand the true impact of schema, we have to look at the data. Recent studies provide a reality check against the hype, showing that while schema is powerful, it is not a “magic button” for rankings.

The Citation Gap

A study conducted in December 2024 by Search/Atlas looked at the correlation between schema markup and citation rates in AI search results. Surprisingly, the study found no direct correlation. Sites with comprehensive, “perfect” schema did not consistently outperform sites with minimal or no schema.

This finding is vital for SEOs to understand: schema alone does not drive citations. LLM systems prioritize relevance, topical authority, and semantic clarity above all else.
If your content is poorly written or irrelevant to the query, great schema won’t save it. Schema is an amplifier, not a replacement for quality.

The Extraction Accuracy Advantage

While schema might not guarantee a citation, it can improve the accuracy of the information that is extracted. A February 2024 study published in Nature Communications found that LLMs perform significantly better when given structured prompts with defined fields compared to unstructured instructions.
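
To make the extraction point concrete, here is a minimal sketch in Python (standard library only). Everything in it is an assumption made for illustration: the page, the product, the prices, the example.com identifiers, and the helper names are all hypothetical, and nothing here describes how any AI search platform is known to process pages. It contrasts guessing a price from prose with reading the declared Offer fields from a JSON-LD @graph, and it follows an @id reference to find the organization behind the offer.

```python
import json
import re
from html.parser import HTMLParser

# Hypothetical page: the prose mentions two prices, the JSON-LD declares one.
PAGE = """
<html><body>
<p>The Trail Mug used to cost $59.00 but is now $49.00 while stocks last.</p>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://example.com/#org",
      "name": "Example Outfitters",
      "sameAs": ["https://www.linkedin.com/company/example-outfitters"]
    },
    {
      "@type": "Product",
      "@id": "https://example.com/trail-mug#product",
      "name": "Trail Mug",
      "offers": {
        "@type": "Offer",
        "price": "49.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
        "offeredBy": {"@id": "https://example.com/#org"}
      }
    }
  ]
}
</script>
</body></html>
"""


class JsonLdCollector(HTMLParser):
    """Collects the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def price_from_prose(html):
    """Naive approach: take the first dollar amount that appears in the text."""
    match = re.search(r"\$(\d+\.\d{2})", html)
    return match.group(1) if match else None


def offer_from_jsonld(html):
    """Structured approach: read the declared Offer and resolve its seller by @id."""
    collector = JsonLdCollector()
    collector.feed(html)
    nodes = []
    for block in collector.blocks:
        # A block is either a single node or a wrapper holding an @graph list.
        nodes.extend(block.get("@graph", [block]))
    by_id = {node["@id"]: node for node in nodes if "@id" in node}
    for node in nodes:
        if node.get("@type") == "Product":
            offer = node["offers"]
            seller = by_id[offer["offeredBy"]["@id"]]
            return offer["price"], offer["priceCurrency"], seller["name"]
    return None


print(price_from_prose(PAGE))   # picks up the old price mentioned first in the prose
print(offer_from_jsonld(PAGE))  # declared price, currency, and seller name
```

The regular expression is deliberately crude, and a capable language model would likely read the sentence correctly. The narrower point is that the structured path involves no interpretation at all: the price, currency, and seller are named fields, and the @id lookup is a dictionary access rather than an inference.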