The Evolution of Search: From Keywords to Entities
For over two decades, search engine optimization was largely a game of keywords, backlink profiles, and technical site performance. However, the rise of Large Language Models (LLMs) and generative AI has fundamentally altered the landscape. We are moving away from a world of “blue links” and toward a world of “entities.”
Search is shifting from surfacing a SERP (Search Engine Results Page) with simple links to AI Overviews, generative answers, and chat-style summaries. These systems do more than just find a page that contains a keyword; they collate content, summarize information, and provide direct answers. To get your content to appear in this new model, your site must be understood as a collection of entities—singular, unique things or concepts, such as a person, place, or event—and the specific relationships between them.
Schema markup, or structured data, is one of the few tools SEO professionals have to make those entities and relationships explicit. It serves as a bridge between the messy, unstructured prose of a human-readable webpage and the rigid, data-driven needs of an AI system. But does schema markup really benefit AI search optimization? Some claim it can triple your citations or dramatically boost visibility. In reality, the evidence is more nuanced. Let’s separate what is known from what is assumed and look at how schema actually fits into a modern AI search strategy.
How Schema Fits Into AI Search Now
In the era of generative AI, systems like Google’s Gemini and Microsoft’s Copilot do not just “read” your website like a human would. They process data to build a knowledge graph. For an AI to accurately represent your brand or answer a query using your data, three elements matter the most:
1. Entity Definition
An AI needs to know exactly what is on a page. Is the page about a specific product, a professional service, a person, or a news event? Schema allows you to define these entities clearly. By using specific types like Product, Service, or Organization, you remove the guesswork for the LLM. It no longer has to infer the subject matter; you have explicitly declared it.
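As a minimal sketch of what declaring an entity looks like in practice, the snippet below builds an Organization node and wraps it in the script tag that carries JSON-LD. The company name, the URL, and the helper name to_jsonld_script are placeholders chosen for illustration, not values from any real site.

```python
import json

def to_jsonld_script(node: dict) -> str:
    """Serialize a schema.org node as a JSON-LD <script> tag for the page <head>."""
    payload = json.dumps(node, indent=2, ensure_ascii=False)
    # Escape "</" so a "</script>" inside a string value cannot close the tag early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'

# Hypothetical entity: the name and URL are placeholders.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Corp",
    "url": "https://www.example.com/",
}

print(to_jsonld_script(organization))
```

The @type value is the explicit declaration: a parser that reads this block does not need to infer from the prose that the page describes an organization.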
2. Attribute Clarity
Once the entity is identified, the AI needs to know its properties. For a product, this includes the price, currency, availability, and user ratings. For an author, it includes their job title and area of expertise. Schema markup provides a standardized format for these attributes, so when an AI Overview extracts a price or a rating, it can read the value from a labeled field instead of inferring it from the surrounding prose.
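To make the attribute idea concrete, here is a rough sketch of a Product node with the fields mentioned above, plus a small completeness check. Every value is a placeholder, and the list of paths being checked is this example's own choice rather than an official list of required fields.

```python
# Hypothetical product: every value below is a placeholder for illustration.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Trail Shoe",
    "offers": {
        "@type": "Offer",
        "price": "89.99",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.6",
        "reviewCount": "212",
    },
}

# Attribute paths this sketch chooses to check; not an official required-field list.
EXPECTED_PATHS = [
    ("offers", "price"),
    ("offers", "priceCurrency"),
    ("offers", "availability"),
    ("aggregateRating", "ratingValue"),
]

def missing_attributes(node: dict) -> list[str]:
    """Return dotted paths from EXPECTED_PATHS that are absent or empty in node."""
    missing = []
    for path in EXPECTED_PATHS:
        value = node
        for key in path:
            value = value.get(key) if isinstance(value, dict) else None
        if value in (None, ""):
            missing.append(".".join(path))
    return missing

print(missing_attributes(product))
```

A check like this could run in a build step so that a product page never ships with a price but no currency.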
3. Entity Relationships
This is perhaps the most critical component for AI search. Entities do not exist in a vacuum. In schema.org terms, an Offer for a product is offeredBy an Organization, an Article names a Person as its author, and a Person worksFor an Organization. The sameAs property also helps connect your site’s entities to established external sources like Wikipedia, LinkedIn, or official databases. This builds a web of trust and context that AI systems can follow.
When schema is implemented with stable identifiers (@id) and a logical structure (@graph), it starts to behave like a small internal knowledge graph. AI systems that read the markup won’t have to guess who you are or how your content fits together. Instead, they can follow explicit connections between your brand, your authors, and your topics.
How AI Search Platforms Use Schema
While the broader SEO community often speculates on how AI uses data, we have concrete confirmation from the two biggest players in the space. For these platforms, schema is confirmed infrastructure, not a theoretical advantage.
Google AI Overviews
In April 2025, the Google Search team explicitly stated that structured data remains essential in the AI search era. They confirmed that structured data gives an advantage in how content is interpreted and surfaced within AI Overviews. Because Google has spent years building its Knowledge Graph, it relies heavily on schema to verify the facts it presents in its generative summaries.
Microsoft Bing Copilot
Microsoft has been equally transparent. Fabrice Canel, a principal product manager at Microsoft Bing, confirmed in March 2025 that schema markup directly helps Microsoft’s LLMs understand content for Copilot. By providing structured data, you are essentially “pre-processing” your content for Bing’s AI, making it easier for the model to cite you as a source of truth.
The “Black Box” of ChatGPT and Perplexity
The situation is different for platforms like ChatGPT and Perplexity. While these tools are rapidly becoming search engines in their own right, they haven’t publicly confirmed exactly how they use schema. We don’t yet know if they preserve schema during their web crawling process or if they use it for data extraction. While LLMs are technically capable of reading JSON-LD (the format used for schema), it remains unclear if their search indices prioritize it. For now, optimizing for these platforms requires a focus on clear, authoritative prose, with schema serving as a secondary supporting layer.
Analyzing Research on Schema and AI
To understand the true impact of schema, we have to look at the data. Recent studies provide a reality check against the hype, showing that while schema is powerful, it is not a “magic button” for rankings.
The Citation Gap
A study conducted in December 2024 by Search/Atlas looked at the correlation between schema markup and citation rates in AI search results. Surprisingly, the study found no direct correlation. Sites with comprehensive, “perfect” schema did not consistently outperform sites with minimal or no schema.
This finding is vital for SEOs to understand: schema alone does not drive citations. LLM systems prioritize relevance, topical authority, and semantic clarity above all else. If your content is poorly written or irrelevant to the query, great schema won’t save it. Schema is an amplifier, not a replacement for quality.
The Extraction Accuracy Advantage
While schema might not guarantee a citation, it significantly improves the accuracy of the information extracted. A February 2024 study published in Nature Communications found that LLMs perform significantly better when given structured prompts with defined fields compared to unstructured instructions.
In the context of the web, schema markup is the equivalent of that structured form. It gives the AI a set of predefined fields—Brand, Price, Author, Topic—to map to. This reduces the risk of “hallucinations” or errors. When an AI extracts data from a page with clear schema, it is much more likely to get the facts right, which is essential for maintaining brand integrity in AI-generated answers.
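The difference between reading labeled fields and inferring facts from prose can be sketched with Python's standard-library HTML parser. The page below is a made-up example, and the class name JsonLdExtractor is hypothetical; nothing here describes how any particular AI platform actually crawls or parses pages.

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every <script type="application/ld+json"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.nodes: list[dict] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # Skip malformed blocks rather than guessing at their content.
            # A block may hold one node or a list of nodes.
            self.nodes.extend(parsed if isinstance(parsed, list) else [parsed])

# Made-up page: the prose never states the exact price, but the markup does.
page = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Trail Shoe",
 "brand": {"@type": "Brand", "name": "Example Corp"},
 "offers": {"@type": "Offer", "price": "89.99", "priceCurrency": "USD"}}
</script>
</head><body><p>Our best shoe yet, now under ninety dollars.</p></body></html>
"""

extractor = JsonLdExtractor()
extractor.feed(page)
product = extractor.nodes[0]
print(product["brand"]["name"], product["offers"]["price"], product["offers"]["priceCurrency"])
```

The visible copy only says "under ninety dollars", while the structured block yields the exact price and currency with a plain key lookup.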
Building a Robust Entity Graph with Schema
Most traditional SEO implementations of schema are “flat.” A site might have Article markup on a blog post and Organization markup on the homepage, but these two things aren’t connected. For AI search, this is a missed opportunity. To truly optimize for AI, you must connect these nodes into a coherent graph using the @id keyword.
An entity graph approach involves:
- A Stable Organization Node: Creating a permanent @id (usually your homepage URL followed by #organization) that represents your brand across the entire site.
- Connected Author Nodes: Defining your authors as Person entities with their own stable IDs, showing that they worksFor your organization.
- Interlinked Articles: Using schema to show that an article’s author is your specific Person node and its publisher is your Organization node.
This creates a reusable, machine-readable map. Regardless of how the visual layout of your page changes, the underlying data remains a unified source of truth. If an AI system crawls your site and preserves the JSON-LD, it can immediately see the hierarchical relationships and the authority behind the content.
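One way to sketch this entity graph is a small builder that reuses the same identifiers on every article page. The domain, the /authors/ URL pattern, and the function names are assumptions made for the example. The property names author, publisher, and worksFor are the ones schema.org defines for these relationships.

```python
import json

SITE = "https://www.example.com"  # Hypothetical domain.
ORG_ID = f"{SITE}/#organization"  # Homepage URL followed by #organization.

def person_id(slug: str) -> str:
    """Stable author identifier; the /authors/<slug> URL pattern is an assumption."""
    return f"{SITE}/authors/{slug}#person"

def article_graph(url: str, headline: str, author_slug: str, author_name: str) -> dict:
    """Build one page's @graph: Organization, Person, and Article linked by @id."""
    return {
        "@context": "https://schema.org",
        "@graph": [
            {"@type": "Organization", "@id": ORG_ID, "name": "Example Corp", "url": f"{SITE}/"},
            {
                "@type": "Person",
                "@id": person_id(author_slug),
                "name": author_name,
                "worksFor": {"@id": ORG_ID},
            },
            {
                "@type": "Article",
                "@id": f"{url}#article",
                "headline": headline,
                "author": {"@id": person_id(author_slug)},
                "publisher": {"@id": ORG_ID},
            },
        ],
    }

graph = article_graph(f"{SITE}/blog/schema-for-ai", "Schema for AI Search", "jane-doe", "Jane Doe")
print(json.dumps(graph, indent=2))
```

Because the identifiers are derived from fixed URL patterns rather than typed by hand, every article published by the same author points at the same Person node.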
Traditional Schema vs. Entity Graph Schema
To visualize the difference, consider how these two approaches compare:
Traditional SEO Schema: Usually consists of a single @type object per page. It is often anonymous (no @id) and focuses on gaining rich snippets in Google’s search results. While it helps with Click-Through Rate (CTR), it offers minimal benefit for complex AI disambiguation.
Entity Graph Schema: Uses an @graph array of interconnected nodes. It utilizes stable @id URLs that can be referenced across the entire website. The primary benefit here is disambiguation—it tells the AI exactly which “John Smith” wrote the article and exactly which “Example Corp” published it. This significantly boosts extraction accuracy for AI systems.
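The disambiguation benefit can be illustrated with a short sketch of how a consumer might follow @id references. The graph below is invented: it contains two different people named John Smith, and only the identifier says which one wrote the article. The helper names index_by_id and resolve are hypothetical.

```python
def index_by_id(graph: dict) -> dict:
    """Map each node's @id to the node so references can be looked up."""
    return {node["@id"]: node for node in graph.get("@graph", []) if "@id" in node}

def resolve(value, index: dict):
    """Replace a bare {"@id": ...} reference with the node it points to, if known."""
    if isinstance(value, dict) and set(value) == {"@id"}:
        return index.get(value["@id"], value)
    return value

# Hypothetical graph: two different people share the name "John Smith".
graph = {
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "Person", "@id": "https://www.example.com/authors/john-smith#person",
         "name": "John Smith", "jobTitle": "Staff Engineer"},
        {"@type": "Person", "@id": "https://www.example.com/authors/john-smith-2#person",
         "name": "John Smith", "jobTitle": "Legal Counsel"},
        {"@type": "Article", "@id": "https://www.example.com/blog/post#article",
         "headline": "Example Post",
         "author": {"@id": "https://www.example.com/authors/john-smith-2#person"}},
    ],
}

index = index_by_id(graph)
article = index["https://www.example.com/blog/post#article"]
author = resolve(article["author"], index)
print(author["name"], "-", author["jobTitle"])
```

With an anonymous, flat author object containing only a name, the two people would be indistinguishable. Here the reference resolves to exactly one node.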
Strategic Recommendations for Implementation
How should you prioritize your schema work in an AI-first world? The goal is to make your content as “parsable” as possible. Here is a roadmap for implementing schema that actually moves the needle.
Focus on Priority Schema Types
Not all schema types are created equal. Based on platform guidance from Google and Microsoft, you should prioritize the following:
- Organization: This is your brand’s digital birth certificate. It establishes who you are.
- Article / BlogPosting: Essential for content attribution. This tells the AI who is responsible for the information provided.
- Person: Crucial for E-E-A-T (Experience, Expertise, Authoritativeness, and Trustworthiness). Use this to link authors to their social profiles and professional history.
- Product / Service: If you are in e-commerce or lead generation, this provides the hard data (price, availability) that AI models need for comparison queries.
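- FAQPage: Question-and-answer pairs map closely onto the way users phrase queries. FAQ schema labels each question and its accepted answer, which makes it easier for a system that reads the markup to pull a direct answer from your page.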
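Of the types above, FAQPage has the most repetitive structure, so it is a natural candidate for generation from existing content. The following is a minimal sketch, assuming your questions and answers are already available as pairs of strings; the function name faq_page and the sample text are placeholders.

```python
import json

def faq_page(pairs: list[tuple[str, str]]) -> dict:
    """Build a FAQPage node from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }

# Placeholder content for illustration.
faq = faq_page([
    ("Does schema markup guarantee AI citations?", "No. It helps systems interpret content."),
    ("Which format should I use?", "JSON-LD is the format discussed in this article."),
])
print(json.dumps(faq, indent=2))
```

The questions and answers in the markup should match what is visible on the page, in line with the over-optimization pitfall discussed later.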
Reduce Ambiguity
AI models struggle with ambiguity. If you have two authors with the same name, or if your brand name is also a common noun, you need to use the sameAs property. Linking your schema nodes to authoritative third-party sources (like a LinkedIn company page, a Crunchbase profile, or an official Wikipedia entry) provides “external verification” for the AI.
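A sketch of attaching sameAs links to an existing node follows. The profile URLs are placeholders for a fictional company, and the rule of keeping only absolute https URLs is a choice made for this example, not a schema.org requirement.

```python
from urllib.parse import urlparse

def with_same_as(node: dict, profile_urls: list[str]) -> dict:
    """Return a copy of node with a de-duplicated sameAs list of absolute https URLs."""
    cleaned: list[str] = []
    for raw in profile_urls:
        url = raw.strip()
        parts = urlparse(url)
        # Example policy: keep only absolute https URLs, and keep each one once.
        if parts.scheme == "https" and parts.netloc and url not in cleaned:
            cleaned.append(url)
    return {**node, "sameAs": cleaned}

# Hypothetical organization and placeholder profile URLs.
org = {
    "@type": "Organization",
    "@id": "https://www.example.com/#organization",
    "name": "Example Corp",
}
linked = with_same_as(org, [
    "https://www.linkedin.com/company/example-corp",
    "https://www.linkedin.com/company/example-corp",  # Duplicate, dropped.
    "/about",  # Relative URL, dropped.
])
print(linked["sameAs"])
```

Only list profiles that genuinely represent the same entity. A sameAs link to the wrong company adds ambiguity instead of removing it.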
Complement, Don’t Replace
It is a mistake to think that schema can compensate for weak content. Schema is metadata; it is information about information. If your actual article content is thin or lacks authority, the AI will likely ignore the schema. Your strategy should be: write the best possible content for humans, then use schema to explain that content to the machines.
Common Pitfalls to Avoid
As you expand your structured data, avoid these common mistakes that can hinder your AI search performance:
Disconnected Nodes: Adding schema to every page but failing to use @id to link them together. This forces the AI to re-learn who you are on every single page crawl.
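A disconnected graph can often be caught before it is published. The check below is a rough sketch with a hypothetical function name. It only inspects top-level properties within a single page's @graph, so a reference to a node defined on another page would be flagged even though that pattern is legitimate.

```python
def find_graph_problems(graph: dict) -> list[str]:
    """Report nodes without an @id and references that point at no node in the graph."""
    nodes = graph.get("@graph", [])
    known_ids = {node["@id"] for node in nodes if "@id" in node}
    problems = []
    for node in nodes:
        label = node.get("@id") or node.get("@type", "unknown")
        if "@id" not in node:
            problems.append(f"{label}: node has no @id")
        for key, value in node.items():
            # Only bare {"@id": ...} references are checked in this sketch.
            if isinstance(value, dict) and set(value) == {"@id"} and value["@id"] not in known_ids:
                problems.append(f"{label}: {key} points at unknown {value['@id']}")
    return problems

# Hypothetical page graph: the Organization node was added without an @id.
broken = {
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "Organization", "name": "Example Corp"},
        {"@type": "Article", "@id": "https://www.example.com/blog/post#article",
         "publisher": {"@id": "https://www.example.com/#organization"}},
    ],
}

for problem in find_graph_problems(broken):
    print(problem)
```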
Over-Optimization: Adding schema for things that don’t exist on the page. Markup that doesn’t match the visible content can trigger a structured data manual action from Google and can confuse AI models, leading to a loss of trust.
Ignoring Updates: Search engines and AI platforms frequently update their supported schema types. For example, the recent introduction of ProductGroup schema for variants is something many sites have yet to adopt.
Neglecting the “Entity Home”: Every brand needs an “Entity Home”—a single page (usually the About page or Homepage) that serves as the definitive source of truth for the brand’s identity. This page should have the most comprehensive Organization schema on the entire site.
The Future: Schema as Infrastructure
AI search is still in its infancy. With ChatGPT’s search functionality only launching in late 2024, the industry is still learning how these models index and retrieve information. Measurement is currently difficult because AI responses are “non-deterministic”—meaning they can change slightly every time a question is asked. This makes traditional rank tracking a challenge.
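One possible way to work around non-determinism is to ask the same question several times and report how often a domain is cited, rather than recording a single position. The sketch below assumes you have already collected the cited domains for each run by some means of your own. The sample data is invented purely to show the arithmetic, and the function name citation_rates is hypothetical.

```python
from collections import Counter

def citation_rates(runs: list[list[str]]) -> dict[str, float]:
    """Share of runs in which each domain was cited at least once."""
    if not runs:
        return {}
    # set(run) counts a domain once per run even if it is cited several times.
    counts = Counter(domain for run in runs for domain in set(run))
    return {domain: count / len(runs) for domain, count in counts.items()}

# Made-up sample: cited domains from four runs of the same prompt.
runs = [
    ["example.com", "example.org"],
    ["example.org"],
    ["example.com", "example.net"],
    ["example.com"],
]
rates = citation_rates(runs)
print(rates["example.com"])
```

In this invented sample, example.com appears in three of four runs, so its citation rate is 0.75. Tracking a rate like this over time is closer to what these systems actually do than a single rank number.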
However, the direction of travel is clear. Search engines are becoming sophisticated reasoning engines. They no longer want to just give you a list of sites; they want to give you an answer. By providing structured, interconnected, and verified data via schema markup, you are essentially helping these engines reason about your brand more effectively.
Schema markup is not a magic bullet, but it is essential digital infrastructure. It won’t necessarily get you cited more often in every platform today, but it is one of the few levers you can pull that platforms like Bing and Google explicitly use to understand the world. The real opportunity lies in the combination of technical structured data, topical authority, and clear brand signals. When you align these three things, you position your brand to win in the AI search era—without the hype.
Conclusion: Implementing Schema for the Long Term
To succeed in this evolving landscape, view schema as a long-term investment in your site’s “machine-readability.” As LLMs become more integrated into our daily search habits, the sites that have clearly defined their entities and relationships will be the ones that these models trust and cite. Focus on building a clean entity graph, use stable identifiers, and always prioritize accuracy over volume. By doing so, you ensure that your content isn’t just “found”—it’s understood.