What 107,000 pages reveal about Core Web Vitals and AI search

The Evolving Relationship Between User Experience and Algorithmic Trust

As the digital landscape undergoes a dramatic transformation fueled by generative artificial intelligence, the rules governing search visibility are rapidly changing. Google’s integration of AI-led features, such as AI Overviews and AI Mode, has shifted how users discover information, raising critical questions about how search engines and AI systems select the sources they trust and cite.

For years, the SEO community has relied heavily on Core Web Vitals (CWV) as the clearest public proxy for measuring user experience (UX). The logic seems irrefutable: faster pages lead to better engagement signals, and AI systems, which prioritize quality and trustworthiness, should naturally favor content originating from websites with superior CWV scores. This underlying assumption—that technical perfection translates directly into a visibility boost—is what many SEO strategies are currently built upon.

However, logic must always yield to empirical evidence. To properly test this widely held hypothesis, a massive analytical effort was undertaken, spanning the performance metrics of 107,352 unique webpages that have demonstrated prominence within Google’s AI-driven search results. The goal was not simply to confirm whether CWV “matters,” but to dissect precisely *how* it influences AI visibility and whether it functions as a primary competitive differentiator.

The findings offer a nuanced conclusion that challenges prevailing wisdom: Core Web Vitals are crucial, but their role in the age of AI search is not what most technical SEO teams currently assume. They act less as a growth lever and more as a gatekeeper.

The Scope of the Investigation: 107,000 AI-Visible Pages

To accurately assess the correlation between page experience and AI performance, the analysis focused exclusively on content already demonstrating a high degree of AI visibility. This dataset of 107,352 webpages comprised documents that were frequently cited, summarized, or otherwise surfaced in Google’s AI Overviews and dedicated AI Mode search environments.

By focusing on pages that have successfully passed the initial quality filters of AI systems, the research aimed to determine if subtle or significant differences in page speed and stability—measured by Largest Contentful Paint (LCP) and Cumulative Layout Shift (CLS)—could predict variations in AI performance rankings.

This approach moves beyond generalized site audits. It treats the problem at the page level, which is critical because AI models do not evaluate a website’s mean performance; they evaluate the quality and experience delivered by the specific document they are considering for retrieval or summarization.

Understanding Core Web Vitals in the AI Context

Before diving into the correlations, it is essential to recall what the primary CWV metrics represent:

  • Largest Contentful Paint (LCP): Measures perceived loading speed. It marks the point at which the largest content element in the viewport (typically a hero image or a large block of text) has been rendered and is visible to the user.
  • Cumulative Layout Shift (CLS): Measures visual stability. It quantifies unexpected shifts in the page layout as content loads, which can significantly degrade user experience. (Google’s ‘Good’ and ‘Poor’ thresholds for both metrics are sketched in the example after this list.)
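For readers who want to check their own pages against these definitions, the short sketch below applies Google’s published thresholds (LCP: ‘Good’ at 2.5 seconds or less, ‘Poor’ above 4.0 seconds; CLS: ‘Good’ at 0.1 or less, ‘Poor’ above 0.25). The function names and sample values are illustrative, not part of the study’s tooling.

```python
# Classify a page against Google's published Core Web Vitals thresholds.
# Function names and the sample values below are hypothetical illustrations.

def classify_lcp(lcp_seconds: float) -> str:
    """Largest Contentful Paint: 'good' <= 2.5 s, 'poor' > 4.0 s."""
    if lcp_seconds <= 2.5:
        return "good"
    if lcp_seconds <= 4.0:
        return "needs improvement"
    return "poor"


def classify_cls(cls_score: float) -> str:
    """Cumulative Layout Shift: 'good' <= 0.1, 'poor' > 0.25."""
    if cls_score <= 0.1:
        return "good"
    if cls_score <= 0.25:
        return "needs improvement"
    return "poor"


print(classify_lcp(2.2), classify_cls(0.04))  # good good
print(classify_lcp(6.8), classify_cls(0.31))  # poor poor
```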

In the traditional SEO environment, achieving ‘Good’ status across these metrics was associated with ranking boosts (or penalty avoidance). The hypothesis being tested here is whether that association holds true when the search results are mediated by advanced language models.

Why Distributions Matter More Than Scores

A fundamental challenge in CWV analysis is the tendency to rely on averages and simple pass/fail thresholds. Most SEO reporting tools consolidate thousands of URL metrics into a single, summary mean. However, this approach severely masks the reality of user experience across a large site.

The first crucial step in analyzing the 107,000 pages was to visualize the performance metrics as a distribution rather than a mean. This immediately exposed the limitations of averaged reporting.

The Skewed Reality of Largest Contentful Paint (LCP)

When LCP values for the dataset were plotted, the distribution revealed a heavy right skew. The majority of pages clustered comfortably within an acceptable performance range, often around or slightly above the recommended ‘Good’ threshold of 2.5 seconds. The median performance was broadly satisfactory.

However, the “long tail” of the distribution extended dramatically to the right, showing a small but significant proportion of extreme outliers. These were pages with horrendously slow load times, sometimes exceeding 5 or 10 seconds. While these pages represented a minority of the total population, their extremely poor performance exerted a disproportionate influence, pulling the overall site average (the mean) toward an undesirable score.

For an SEO strategist, this distinction is vital. A poor site average may suggest a systemic problem when, in reality, it may be caused by a small number of broken templates or highly complex, unoptimized pages. The vast majority of users visiting the median-performing pages are having an adequate experience.
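A small, hypothetical numeric example makes the distortion concrete. Suppose a site has eight pages loading in roughly 2.0 to 2.5 seconds and two broken templates loading in over 10 seconds:

```python
# Hypothetical page-level LCP samples (seconds): eight healthy pages, two broken templates.
from statistics import mean, median

lcp = [2.1, 2.3, 2.2, 2.4, 2.0, 2.5, 2.3, 2.2, 10.8, 14.5]

print(f"mean LCP:   {mean(lcp):.2f} s")    # 4.33 s -> looks like a site-wide problem
print(f"median LCP: {median(lcp):.2f} s")  # 2.30 s -> the typical page is fine
```

The two outliers nearly double the mean relative to the median, which mirrors the pattern described above for the full dataset.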

Cumulative Layout Shift (CLS) Reflects Similar Extremes

Cumulative Layout Shift exhibited a related pattern. The overwhelming majority of pages recorded CLS scores near zero, indicating high visual stability. This suggests that for most content, major layout shifts are not an issue.

Yet, similar to LCP, a small minority of pages displayed severe instability, producing high CLS scores. This minority pulls the mean up, creating the false impression of a site-wide instability issue. Again, the mean failed to reflect the lived experience of the majority of users.

This distributional analysis clarifies a crucial point for AI systems: AI does not reason over these aggregated means. It processes individual documents. Before even discussing correlation, it’s clear that Core Web Vitals are not a single, monolithic signal; they are a varied distribution of behaviors across a mixed population of documents.

Analyzing the Correlation: Rank vs. Linear Relationships

Because the CWV data was heavily skewed rather than normally distributed, traditional statistical measures like the Pearson correlation coefficient were a poor fit. Pearson captures only linear relationships and is highly sensitive to extreme outliers, so it would have misrepresented the findings.

Instead, the analysis utilized the Spearman rank correlation. This method is used to determine if there is a monotonic relationship between the variables—that is, whether pages that rank higher on CWV performance also tend to rank higher or lower on AI visibility, regardless of whether that relationship is perfectly linear.
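For readers who want to run the same kind of test on their own page-level data, the general shape of a Spearman analysis is simple. In the sketch below, the input file and column names (`lcp_seconds`, `ai_visibility_score`) are placeholders; the study does not disclose its exact visibility metric or tooling.

```python
# Sketch: Spearman rank correlation between page-level LCP and an AI visibility metric.
# The CSV file and column names are assumptions for illustration.
import pandas as pd
from scipy.stats import spearmanr

df = pd.read_csv("pages.csv")  # one row per URL

rho, p_value = spearmanr(df["lcp_seconds"], df["ai_visibility_score"])
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3g})")
```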

If a strong link existed between technical performance and AI visibility, we would expect to see a clear correlation indicating that faster, more stable pages consistently appear more often or higher in AI Overviews and citations.

The Weak Negative Finding

The analysis revealed a real relationship, but one that was notably weak. The correlations were negative, meaning that higher (worse) LCP and CLS values were associated with lower AI visibility. In practice, this suggests that slow or unstable pages are slightly more likely to be suppressed, not that fast pages are meaningfully more likely to be rewarded.

  • Largest Contentful Paint (LCP): The correlation ranged from -0.12 to -0.18, depending on the specific measurement of AI visibility used.
  • Cumulative Layout Shift (CLS): The correlation was even weaker, typically falling between -0.05 and -0.09.

In practical terms, these numbers indicate a minimal relationship. While detectable across a dataset of this size, these correlations are not strong enough to suggest that technical perfection is a primary driver of AI success. They do not support the idea that shaving milliseconds off an already fast page will meaningfully boost its chances of being cited by an AI system.
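One rough way to read those magnitudes is to square the rank correlation, which approximates the share of rank variance accounted for. Under that heuristic (an approximation, not a figure reported in the study), even the strongest observed correlation accounts for only a few percent of the variation in AI visibility:

```python
# Rough effect-size heuristic: squared rank correlation ~ share of rank variance explained.
# This interpretation is an approximation, not a result reported in the study.
for label, rho in [("LCP, strongest", -0.18), ("LCP, weakest", -0.12),
                   ("CLS, strongest", -0.09), ("CLS, weakest", -0.05)]:
    print(f"{label}: rho = {rho:+.2f}, rho^2 = {rho ** 2:.1%}")
```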

The Absence of Upside and the Presence of Downside

The most important insight derived from the analysis is the shape of the relationship between CWV and AI performance. The data decisively rejects the notion that improving Core Web Vitals beyond recommended thresholds provides a significant, positive competitive advantage in AI-led search.

Pages with excellent CWV scores (e.g., LCP < 1.5 seconds) did not reliably outperform their peers with merely "acceptable" scores (e.g., LCP 2.4 seconds) when it came to AI inclusion or citation frequency. **Good performance does not create an upside advantage.**

However, the weak negative correlation is highly instructive and points to where CWV truly impacts AI visibility.

The Suppression Effect of Extreme Failure

The negative correlation highlights the fact that pages sitting in the extreme tail of poor CWV performance—the horribly slow or highly unstable pages—were significantly less likely to perform well in AI contexts. This is the crucial finding: **Severe failure creates disadvantage.**

Why do these extreme technical failures matter? It’s because poor CWV metrics at the page level are highly predictive of detrimental behavioral signals. Pages that load slowly or violently shift content often result in:

  1. Higher Abandonment: Users quickly leave frustrating pages (high bounce rates).
  2. Lower Engagement: Users spend less time on the page and interact less with the content.
  3. Weakened Trust Signals: Poor UX undermines the perception of professionalism and authority.

These second-order behavioral signals—the actual user responses to the page experience—are precisely the kinds of reinforcement mechanisms that AI systems, directly or indirectly, rely upon to infer quality, trustworthiness, and utility. A technically broken page may contain excellent content, but the user experience fails to validate that quality, leading to suppression.

Core Web Vitals, therefore, do not act as a boost; they act as a constraint. They ensure that technical friction does not override content quality in the AI selection process.

Why ‘Passing CWV’ Is Now Table Stakes, Not a Differentiator

The expectation among many practitioners that CWV should be a strong positive differentiator fails to account for the current state of the web. In this large dataset of prominent content, the majority of pages already met or nearly met the recommended CWV thresholds, especially for stability (CLS).

When most of the competition clears a specific bar, clearing that bar no longer distinguishes a competitor. It simply means they are still in contention. Technical hygiene has become the baseline expectation for any professional publisher hoping to compete for visibility.

AI systems are tasked with answering complex queries and synthesizing information. Their core selection criteria revolve around semantic alignment, factual accuracy, authoritativeness, and alignment with established topic clusters. They are not choosing between sources because one loaded in 1.8 seconds versus 2.2 seconds. They are choosing based on which page offers the clearest explanation, validates its information with trusted sources, and ultimately satisfies the intent of the query.

CWV ensures that the page experience doesn’t actively undermine those higher-level content qualities. It’s an indispensable foundation, but it is not a substitute for content strategy.

Reframing Core Web Vitals as Risk Management for AI Strategy

The implication of analyzing 107,352 AI-visible pages is not that Core Web Vitals have lost their importance; rather, their strategic role must be completely redefined within the context of AI search.

In an environment increasingly dominated by generative AI, CWV should be treated as a strategic risk-management tool, not a competitive optimization target.

Strategic Reprioritization of Engineering Effort

The analysis strongly suggests that allocating significant engineering resources to chase incremental CWV gains on pages that already score “Good” or “Acceptable” is a low-return investment when the goal is increased AI visibility. Improving LCP from 2.2 seconds to 1.9 seconds is unlikely to alter the fundamental selection logic of an AI model.

Instead, resources should be aggressively targeted toward eliminating the technical debt represented by the “extreme tail.” Publishers should focus on identifying and fixing the small percentage of pages with catastrophically bad performance, as these are the ones generating the intensely negative behavioral signals that suppress trust and lead to poor AI outcomes.
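In practice, that means querying page-level data for outright failures rather than mandating incremental site-wide gains. A minimal sketch, assuming a per-URL export with hypothetical `url`, `lcp_seconds`, and `cls` columns, might look like this:

```python
# Surface the extreme tail: pages failing Google's 'Poor' thresholds outright.
# The input file and column names are assumptions for illustration.
import pandas as pd

df = pd.read_csv("page_level_cwv.csv")

extreme_tail = df[(df["lcp_seconds"] > 4.0) | (df["cls"] > 0.25)]

share = len(extreme_tail) / len(df)
print(f"{len(extreme_tail)} of {len(df)} pages ({share:.1%}) fall in the extreme tail")

# Hand engineering a short, worst-first fix list instead of a site-wide mandate.
worst_first = extreme_tail.sort_values("lcp_seconds", ascending=False)
print(worst_first[["url", "lcp_seconds", "cls"]].head(20))
```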

This tactical shift moves technical optimization from a constant, site-wide pursuit of perfection to a surgical, data-driven campaign aimed at removing barriers to entry.

Protecting High-Value Content

In the AI era, where citation and retrieval are paramount, the focus must be on protecting the content that is most crucial for AI ingestion. Publishers must ensure that key transactional pages, definitive guides, and high-authority articles are not compromised by avoidable technical failures. CWV optimization, therefore, becomes an insurance policy guaranteeing that high-quality content is not eliminated from contention simply because of poor technical execution.

The objective is pragmatic optimization: achieving functional excellence to stay within the competitive arena, thereby allowing the true differentiators—content quality, E-E-A-T (Experience, Expertise, Authoritativeness, and Trustworthiness), and superior content structure—to take precedence in the AI evaluation process.

Core Web Vitals: A Gatekeeper, Not a Differentiator

Based on the quantitative analysis of performance across more than 107,000 webpages prominent in Google’s AI search features, the relationship between technical performance and AI visibility is real, but strictly limited.

There is no evidence of a strong positive correlation where excellent CWV performance reliably yields a competitive ranking advantage in AI Overviews or citations. Publishers cannot expect to outrank competitors in AI search purely through superior page speed.

However, a measurable negative relationship exists at the extremes. Severe technical failures—pages that are extremely slow or unstable—are strongly associated with poorer AI outcomes, likely mediated by the negative user behavior they generate. These poor signals suppress the page’s overall perceived quality and trustworthiness.

In the rapidly evolving landscape of AI-led discovery, Core Web Vitals must be understood as a critical gatekeeper. Technical compliance is essential for mitigating risk and ensuring content remains eligible for AI retrieval. Once this gate is cleared, the competitive battle shifts entirely to the value, clarity, and authority of the content itself.

For modern SEO strategy, this clarity is paramount: master the fundamentals of page experience to avoid algorithmic disadvantage, and then redirect focus and investment toward the high-quality content signals that truly differentiate success in the age of artificial intelligence.
