Google’s Jeff Dean: AI Search relies on classic ranking and retrieval

In the rapidly evolving landscape of artificial intelligence, there is a common misconception that the advent of Large Language Models (LLMs) has completely rewritten the rules of information retrieval. Many observers assume that Google’s transition toward AI-driven results, such as AI Overviews, represents a total abandonment of the “old” search algorithms that have governed the web for decades. However, according to Jeff Dean, Google’s Chief AI Scientist, the reality is far more grounded in tradition than many realize.

In a detailed interview on the Latent Space: The AI Engineer Podcast, Dean pulled back the curtain on the architecture powering Google’s modern AI search experiences. His insights reveal a critical truth for developers, SEO professionals, and tech enthusiasts: AI search is not a replacement for classic search infrastructure. Instead, it is a sophisticated layer that sits on top of a foundational system built on decades of ranking, retrieval, and indexing expertise.

The Architecture: Filter First, Reason Last

The core of Jeff Dean’s explanation centers on a concept that might surprise those who view AI as an all-knowing entity that “reads” the entire internet in real-time. He clarified that Google’s AI systems do not process the whole web simultaneously for every query. Instead, they follow a rigorous, multi-stage pipeline designed for efficiency and accuracy. Dean describes this as a “staged pipeline” that prioritizes filtering before any generative reasoning occurs.

Visibility in an AI-generated search result still depends entirely on a document’s ability to clear traditional ranking thresholds. If a piece of content does not make it into the broad candidate pool of search results through standard SEO and ranking signals, it has zero chance of being used by an LLM to synthesize an answer. In essence, the AI doesn’t find the content; the search engine finds the content, and the AI merely explains it.

The Candidate Pool: From Trillions to Thousands

To understand how this works at scale, we must look at the numbers Dean provided. The web consists of trillions of tokens—the fragments of text that language models process. When a user enters a query, it would be computationally infeasible, and wildly inefficient, for a high-reasoning LLM to scan those trillions of tokens to find an answer.

Instead, Google uses “lightweight methods”—the classic retrieval systems—to narrow the field. This first pass identifies a subset of roughly 30,000 documents that are potentially relevant to the user’s intent. This initial culling is done in milliseconds using traditional signals. Dean explained that this process is about “down-ranking” the noise to find a manageable set of “interesting tokens.”
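The shape of that first lightweight pass can be sketched with a toy inverted index—the classic retrieval data structure. This is purely illustrative: the corpus, the whitespace tokenization, and the one-term-overlap rule are all stand-in assumptions, not Google's actual signals.

```python
from collections import defaultdict

# Toy corpus standing in for a web-scale index (illustrative only).
DOCS = {
    1: "best coffee shop near downtown",
    2: "how to brew espresso at home",
    3: "history of the roman empire",
    4: "top rated cafe and coffee bar",
}

def build_inverted_index(docs):
    """Map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index

def first_pass_candidates(query, index):
    """Cheap lexical pass: keep any document sharing at least one
    query term. Real systems use far richer signals; this mimics
    only the funnel's first, inexpensive stage."""
    candidates = set()
    for term in query.split():
        candidates |= index.get(term, set())
    return candidates

index = build_inverted_index(DOCS)
print(first_pass_candidates("coffee shop", index))  # {1, 4}
```

The key property is cost: set unions over a precomputed index run in milliseconds, which is why this stage can face the whole corpus while the expensive models never do.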

Reranking and Refining

Once the system has identified the top 30,000 candidates, it doesn’t stop there. Google applies increasingly sophisticated algorithms and signals to refine that list further. This is a tiered process where the cost of computation increases as the number of documents decreases. The system filters the 30,000 documents down to a few hundred, and eventually down to the final set—often around 10 to 100 documents—that are truly relevant to the specific task.

Dean refers to the user experience of AI search as an “illusion” of attending to the entire web. While it feels like the AI is searching the whole internet for you, it is actually only “paying attention” to the very small subset of data that the traditional ranking engine has already verified as high-quality and relevant. “You’re going to want to identify what are the 30,000-ish documents… and then how do you go from that into what are the 117 documents I really should be paying attention to?” Dean noted.
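The tiered funnel Dean describes can be sketched as a sequence of scorers, each more expensive than the last and each applied to a smaller pool. The scorers and document counts below are invented placeholders; only the funnel shape reflects the interview.

```python
def tiered_rerank(candidates, stages):
    """Apply increasingly expensive scorers, keeping fewer documents
    at each stage, so total compute stays bounded."""
    pool = list(candidates)
    for scorer, keep in stages:
        pool = sorted(pool, key=scorer, reverse=True)[:keep]
    return pool

# Stand-in scorers: cheap first (e.g. link stats), costly last
# (pretend this is a deep model). All three are hypothetical.
cheap  = lambda d: d["link_score"]
medium = lambda d: d["link_score"] + d["topic_score"]
costly = lambda d: d["topic_score"] * 2 + d["link_score"]

docs = [{"id": i, "link_score": i % 7, "topic_score": i % 5}
        for i in range(30000)]

# 30,000 candidates -> a few hundred -> the final handful.
final = tiered_rerank(docs, [(cheap, 300), (medium, 30), (costly, 10)])
print(len(final))  # 10
```

The design point is that the costly scorer only ever sees 30 documents, so its per-document price is irrelevant to overall latency.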

Matching Intent: Moving from Keywords to Meaning

One of the most significant shifts in search over the last several years has been the move from lexical matching (finding exact words) to semantic matching (understanding the meaning behind words). While LLMs have accelerated this trend, Dean pointed out that this evolution is not entirely new; it is a continuation of a journey Google started long ago.

In the early days of search, if a user typed “blue suede shoes,” the engine looked for pages that contained those exact three words. If a page used the phrase “azure leather footwear,” it might not show up, even though it meant exactly the same thing. Today, thanks to LLM-based representations of text, Google can move beyond “hard” word overlap.

The Power of Topic Overlap

Dean explained that LLMs allow Google to evaluate whether a page—or even a specific paragraph within a page—is topically relevant to a query, even if the wording differs entirely. This shift places a premium on topical authority and comprehensive coverage. For content creators, this means that repeating a keyword five times is far less effective than explaining a concept so clearly that the system understands the subject matter’s intent.

This “softening” of the definition of a query allows Google to bridge the gap between how people think and how they type. By using LLM representations, the search engine can map the “meaning” of a query to the “meaning” of a document, creating a much more fluid and intuitive discovery process.
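Mapping meaning to meaning typically works by embedding queries and documents as vectors and comparing them with cosine similarity. The three-dimensional vectors below are fabricated for illustration; a real encoder would be an LLM-derived model producing hundreds of dimensions.

```python
import math

# Toy embeddings; real systems use a learned text encoder.
EMBED = {
    "blue suede shoes":       [0.90, 0.80, 0.10],
    "azure leather footwear": [0.85, 0.75, 0.15],
    "roman empire history":   [0.05, 0.10, 0.95],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction (same meaning,
    in embedding terms), near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

query = EMBED["blue suede shoes"]
for text, vec in EMBED.items():
    print(f"{text}: {cosine(query, vec):.3f}")
```

Note that “azure leather footwear” scores close to 1.0 against “blue suede shoes” despite sharing zero words—exactly the gap lexical matching could not bridge.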

The 2001 Milestone: Why Query Expansion Changed Everything

To provide context for today’s AI advancements, Jeff Dean took a trip down memory lane to 2001. This was a pivotal year for Google, marking the moment when the company moved its entire index from physical disks into RAM (memory) across a massive fleet of machines.

Before 2001, adding extra terms to a user’s query was expensive. Every time Google wanted to look for a synonym, it required a “disk seek,” which added latency and slowed down the search for the user. Consequently, the engine had to be very selective about the terms it searched for.

Query Expansion in the Pre-LLM Era

Once the index was in memory, the technical constraints vanished. Google could suddenly take a three-word query from a user and “expand” it into 50 terms behind the scenes. If a user searched for “cafe,” the system could simultaneously look for “restaurant,” “bistro,” “coffee shop,” and “diner” without any performance penalty.
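The mechanics of that expansion can be sketched as a simple synonym lookup. The table below is a hypothetical stand-in; in practice the expansion terms come from learned models and usage data, but the cost argument is the same: once the index is in RAM, each extra term is nearly free.

```python
# Hypothetical synonym table; real expansions are model-derived.
SYNONYMS = {
    "cafe": ["restaurant", "bistro", "coffee shop", "diner"],
}

def expand_query(query):
    """Expand each term into itself plus its synonyms—cheap once
    the index lives in memory (no disk seek per extra term)."""
    terms = []
    for term in query.split():
        terms.append(term)
        terms.extend(SYNONYMS.get(term, []))
    return terms

print(expand_query("cafe nearby"))
# ['cafe', 'restaurant', 'bistro', 'coffee shop', 'diner', 'nearby']
```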

Dean emphasized that this was the true birth of semantic search, occurring more than two decades before the current LLM boom. “It was really about softening the strict definition of what the user typed in order to get at the meaning,” he said. The AI search we see today is the logical conclusion of this 23-year-old architectural shift. The underlying goal has always been the same: to understand intent rather than just matching strings of characters.

The Role of Freshness and Crawl Prioritization

Another “classic” element that remains vital in the AI era is index freshness. No matter how smart an LLM is, it is only as good as the data it has access to. Dean highlighted that one of the biggest transformations in Google’s history was increasing the “update rate” of the index.

In the early years, Google’s index might only refresh once a month. In the modern era, Google has built infrastructure capable of updating the index in under a minute. This is particularly crucial for news and trending topics. As Dean put it, “If you’ve got last month’s news index, it’s not actually that useful.”

Balancing Crawl Budget and Importance

Google uses complex systems to determine how often to crawl a specific page. This isn’t just about how often the page changes; it’s about the value of that change. Dean explained that even if a page changes infrequently, Google might still crawl it often if it is deemed highly important. The likelihood of change might be low, but the “value of having it updated” is high.
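Dean's point amounts to ranking pages by the expected value of a recrawl, not by change frequency alone. A minimal sketch, assuming two made-up signals (change probability and importance score):

```python
def crawl_priority(change_probability, importance):
    """Expected value of recrawling: how likely the page has changed
    times how much an up-to-date copy is worth. Both inputs are
    hypothetical signals for illustration."""
    return change_probability * importance

pages = [
    ("breaking-news-homepage",  0.90, 100),
    ("static-faq-page",         0.01,   5),
    ("popular-reference-page",  0.05,  90),  # rarely changes, but valuable
]

ranked = sorted(pages, key=lambda p: crawl_priority(p[1], p[2]),
                reverse=True)
print([name for name, *_ in ranked])
```

Even with only a 5% chance of change, the high-importance reference page outranks the static FAQ by a wide margin, matching Dean's observation that importance can dominate change likelihood.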

This explains why high-authority websites often see their new content indexed within minutes, while smaller, less authoritative sites may wait days or weeks. For AI Search, this freshness is the lifeblood of accuracy. If the retrieval engine hasn’t crawled the latest update to a story, the AI model on top of it will provide outdated or incorrect information.

Why the “Classic” System Still Matters for SEO and Tech

The biggest takeaway from Jeff Dean’s interview is that the fundamentals of search have not been rendered obsolete by AI. In fact, they have become more important because they act as the gatekeepers to the AI’s “attention.”

If you are a webmaster or a marketer, your strategy should not change simply because Google is using LLMs to generate answers. The “AI Overviews” and other generative features are essentially the “Top 10” results rewritten for clarity. To be the source that the AI cites, you must still master the basics:

  • Relevance: Does your content directly answer the user’s intent?
  • Authority: Is your site trusted enough to survive the initial filtering from 30,000 documents down to the final 100?
  • Freshness: Is your information up to date, ensuring that when Google’s crawlers look for the latest data, they find your site?
  • Structure: Is your content organized in a way that “lightweight” retrieval methods can easily identify its value?

The Symbiosis of Retrieval and Generation

The relationship between traditional search and AI is symbiotic. The retrieval engine provides the facts, the context, and the sources, while the LLM provides the synthesis, the tone, and the user-friendly interface. Without the classic ranking system, an LLM would be prone to even more hallucinations, as it would lack a grounded set of verified documents to reference.
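This division of labor is the retrieval-augmented generation pattern in miniature: the classic engine picks the sources, and only then does the LLM see them. A hedged sketch, where the `retrieve` function and prompt shape are illustrative assumptions rather than Google's actual interface:

```python
def build_grounded_prompt(query, retrieve, top_k=3):
    """Compose an LLM prompt grounded in retrieved documents.
    `retrieve` stands in for the whole classic ranking pipeline;
    the prompt wording is a hypothetical example."""
    docs = retrieve(query)[:top_k]
    context = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(docs))
    return (
        f"Answer using ONLY the sources below.\n\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

fake_retrieve = lambda q: ["Doc about espresso brewing",
                           "Doc about coffee beans"]
prompt = build_grounded_prompt("how do I brew espresso?", fake_retrieve)
print(prompt)
```

If `retrieve` returns poor documents, no amount of model quality fixes the answer—which is the symbiosis the article describes.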

Dean’s insights serve as a reminder that Google is, at its heart, an information retrieval company. While the “front end” of search is changing to become more conversational and generative, the “back end” remains a colossal exercise in sharding, indexing, and ranking. The “AI Search” we interact with today is the result of decades of engineering, from the 2001 move to in-memory indexing to the modern deployment of Gemini models.

Conclusion: The Future of Search is Iterative, Not Disruptive

While the tech world often looks for “disruptive” shifts that change everything overnight, Jeff Dean’s perspective suggests that Google’s path is one of iteration and integration. AI doesn’t bypass the ranking system—it sits at the very end of it, acting as the final polisher of a process that begins with trillions of tokens and ends with a helpful answer.

For those looking to stay ahead in the age of AI search, the message is clear: do not ignore the foundations. Technical SEO, high-quality content, and topical relevance remain the keys to the kingdom. The AI might be the one speaking to the user, but the classic search engine is the one deciding whose voice the AI uses.

By understanding this “filter first, reason last” architecture, we can better navigate the future of the web. As AI continues to evolve, it will only become more dependent on the quality of the data it retrieves. The competition to be the “source of truth” for the internet’s largest AI models is just the latest chapter in the long history of search—a history that Jeff Dean and Google are still writing using many of the same tools they perfected twenty years ago.
