What Google Says vs. What It Does

Google published its first official AEO/GEO guide on May 15. The headline message: there is no such thing as Generative Engine Optimization or Answer Engine Optimization. It’s all still SEO. Don’t make markdown files for LLMs. Don’t bother with an llms.txt. Don’t overthink content “chunking.” Don’t over-focus on structured data.

Of course I agree with some of Google’s advice, especially the section on content quality. In Old Spam in New Clothes, I argued that many things being sold as “GEO services” are warmed-over content shortcuts that will eventually land on the wrong side of an algorithm update. That part of Google’s message is accurate enough.

The other half is harder to take at face value. Google has a long record of publicly denying mechanics that turn out to be central to how its ranking system actually works. Anyone making decisions in 2026 based on the new guide should remember that record. Here are four things Google denied, three places where the new guide already disagrees with Google’s own behavior, and one thing Google just plain gets wrong.

Four things Google denied that turned out to exist

A sitewide authority score

For years Google’s public position was that there is no “domain authority” inside Google, but the May 2024 Content Warehouse API leak surfaced an attribute called siteAuthority sitting inside the CompressedQualitySignals module, used during preliminary scoring alongside pandaDemotion and navDemotion.

It is the exact thing Google had been telling SEOs didn’t exist. Hobo Web’s annotated breakdown is the cleanest reference on the overall leak, and Vizion Interactive catalogs the gap between past statements and what the docs show.

Click data as a ranking input

Google reps spent a decade minimizing click signals. The leak named the system: NavBoost, plus a buffer called Glue, with explicit fields for goodClicks, badClicks, and lastLongestClicks, all run through a squashing function so high-volume queries can’t dominate. The DOJ antitrust trial got the same admission under oath. Pandu Nayak, then VP of Search Quality, called NavBoost “one of the important signals.” Eric Lehman, a Google engineer, testified that “clicks are the main signal used by Navboost.”

A trial exhibit went further: learning from user feedback is “perhaps the central way that web ranking has improved for 15 years.” According to the trial record, the click data in Google’s 13-month rolling window is equal to roughly 17 years of Bing data. For more on this subject, check out Hobo Web’s DOJ summary, SerpClix’s click-fields breakdown, or Danny Goodwin’s breakdown at Search Engine Land.
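To make the "squashing function" idea concrete, here is a minimal sketch in Python. The leak names the concept but, as far as I know, not the formula, so the function below and its scale parameter are my own assumptions, chosen only to show how diminishing returns keep high-volume queries from dominating. It is not Google's implementation.

```python
def squash(clicks: float, scale: float = 100.0) -> float:
    """Map a raw click count into [0, 1) with diminishing returns.

    Hypothetical formula: clicks / (clicks + scale). The real function
    used by NavBoost is not public; this only illustrates the idea.
    """
    return clicks / (clicks + scale)


# Illustrative, made-up click counts for three kinds of query.
raw_counts = {"niche query": 50, "popular query": 5_000, "head query": 500_000}

for label, clicks in raw_counts.items():
    print(f"{label}: raw={clicks}, squashed={squash(clicks):.3f}")
```

The head query has 10,000 times the clicks of the niche query, but after squashing its score is only about three times larger. That is the whole point of such a function: volume still counts, just not linearly.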

The sandbox

John Mueller and other Search Liaison voices spent a decade telling people there is no “sandbox” holding new domains back. The 2024 leak revealed an attribute called hostAge, which the documentation describes as being used “to sandbox fresh spam in serving time.” That is a sandbox.

They picked a different word, but shipped the same mechanism. For more on this see Reuben Yau’s contradictions list or Alisa Thorley’s leak summary.

Brand favoritism

Google’s line for years was that there is no big-brand boost, but the 2024 leak surfaced topic-specific whitelists for travel, elections, and COVID, plus an attribute called smallPersonalSite.

Independent academic auditing reached the same conclusion looking at results alone—no leaked factors were needed. A 2024 paper analyzing 221,863 search results across Brazil, the UK, and the US found Google’s news algorithm preferentially favors a small set of national outlets. Columbia Journalism Review covered the same dynamic.

The leak gave the mechanism a name, but audits had already shown the output.

You can quibble with any of the points above, but the pattern is the point. Google’s public messaging is the great and powerful Oz: a booming, confident voice telling SEOs what isn’t real. Behind the curtain is the actual machine—and one of the levers being pulled is “perhaps the central way that web ranking has improved for 15 years.”

That record shows the company’s posture toward anyone trying to understand how the algorithm actually works.

Three places the new AEO guide already disagrees with Google

The new guide is two weeks old. Three contradictions are already on the table.

Google says GEO doesn’t exist, but Google’s recruiting team disagrees. A month before the guide came out, Google’s Large Customer Sales team posted a job opening for a “GEO Partner Manager, Performance Solutions” whose stated responsibility was to manage “Google’s engagement model from Generative Engine Optimization (GEO) discovery to formal ecosystem advocacy.” The listing was first surfaced by Search Engine Roundtable and covered again by Search Engine Journal. After the guide published, the listing was pulled. Either GEO is a real internal category at Google or someone in HR was wrong. The new guide insists on the second reading.
Google says don’t publish markdown for AI, but Google’s documentation team publishes markdown for AI. The guide tells publishers there’s no need to produce “machine-readable files, AI text files, markup, or Markdown” to be cited by Google’s AI features. Two weeks later, Google quietly added .md.txt versions of every Search Central documentation page with a dropdown to grab the markdown. John Mueller’s explanation was that the markdown is meant for AI coding assistants reading the docs, not for Google Search. That distinction may matter to Google, but it’s a distinction without a difference for a publisher trying to figure out whether AI systems consume their content better in markdown.
Google says skip llms.txt, but Chrome audits sites for it. The guide lists llms.txt among the tactics publishers don’t need. Then Chrome shipped Lighthouse 13.3 with an “Agentic Browsing” audit category that checks whether a site provides an llms.txt file and flags errors when retrieving it. Search Engine Journal, Search Engine Land, and The Query Post all called out this messaging gap. Google’s defense is that Search and agentic browsing are different surfaces. They are, and a publisher needs to be visible on both.

Chunking isn’t a tactic—it’s what RAG does to your content

There’s one place where the new guide isn’t just contradicted by Google’s own behavior, but is wrong on its face. The guide tells publishers not to “chunk” their content for AI systems. The problem, as Mike King argued at Search Engine Land, is that “chunking is what RAG systems do to your content, whether you optimize for it or not.” The question isn’t whether your content gets chunked. The question is whether the chunks are intelligible.

As King puts it: “a passage that focuses on one idea will, in nearly every measurable case, retrieve better than a passage that tries to cover three.” Google’s own published research on MUVERA, passage indexing, and pairwise passage selection assumes the same thing. Bing’s public AI guidance tells webmasters that “chunking/transformations must preserve meaning and claims used in the answer.” Only Google’s surface guidance pretends the unit of retrieval is still the whole page.

All that said, the practical takeaway here isn’t revolutionary. Keep paragraphs tight, make one claim per passage, and write subheads that name the claim, not just the topic. Entity tagging and schema markup help the chunker decide what matters. None of this requires buying “chunking consulting,” but it’s also foolish to ignore the mechanisms at play here.
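For readers who have never seen a chunker, here is a minimal sketch of the kind of splitting a RAG pipeline might do. It is an assumption-laden toy, not any vendor's real system: the function name chunk_by_paragraph and the max_words budget are hypothetical, and production chunkers typically count tokens rather than words.

```python
import re


def chunk_by_paragraph(text: str, max_words: int = 120) -> list[str]:
    """Split text on blank lines, then pack whole paragraphs into chunks.

    A chunk is closed as soon as adding the next paragraph would exceed
    max_words. Paragraphs are never split, so an oversized paragraph
    simply becomes its own oversized chunk.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for para in paragraphs:
        words = len(para.split())
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks


article = "First claim, stated once.\n\nSecond claim, with its evidence.\n\nThird claim."
for i, chunk in enumerate(chunk_by_paragraph(article, max_words=8)):
    print(i, repr(chunk))
```

The boundaries fall wherever the paragraphs end. If each paragraph carries one complete claim, each chunk can stand alone when it is retrieved. If a claim is smeared across three paragraphs, the retriever may only ever see a third of it.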

What to do with this

Snake oil is still being sold. Bulk AI rewrites, prompt-injected pages, and scaled comparison content are old shortcuts in new packaging. Google’s guide is right to call out most of that, and Lily Ray’s vicious-cycle framing still applies.

At the same time, Google’s guide shouldn’t be used as a ceiling for your AI-surface work, especially for low-effort items that Google itself is pursuing.

Why not ship an llms.txt if it’s a matter of ticking a single checkbox? Chrome is auditing for it, and it’s potentially the path of least resistance for Anthropic, Perplexity, and the open-source agent ecosystem.
Make your content reachable in clean markdown when an AI assistant fetches it; the docs team at Google is doing that for its own pages for a reason. Cloudflare’s Markdown for Agents serves a clean .md version of every page automatically.
Continue investing in the things that worked before AI surfaces existed: entity tagging, Knowledge Graph connections, and schema that ties your archive to the real-world people and concepts you cover. Those compound across Search, News, and Discover as well as AI Overviews, AI Mode, and ChatGPT citations.
Be aware that tight, clear arguments not only resonate with human readers, but also make more sense to their machine counterparts. Clarity was already a good idea; now it’s vital to both comprehensibility and visibility.
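To show how small the llms.txt lift is, here is a minimal sketch that assembles one. It assumes the layout described in the public llms.txt proposal as I understand it (a title line, a one-line summary in a blockquote, then sections of annotated links). The helper name build_llms_txt, the site name, and the example.com URLs are all hypothetical placeholders.

```python
def build_llms_txt(
    title: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Return llms.txt content: title, blockquote summary, link sections.

    sections maps a heading to (link name, URL, short note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


content = build_llms_txt(
    "Example News",
    "Independent local reporting, updated daily.",
    {
        "Start here": [
            ("About", "https://example.com/about.md", "Who we are and how we report"),
            ("Archive", "https://example.com/archive.md", "Every story by topic"),
        ]
    },
)
print(content)
```

Save the result as /llms.txt at the site root and the checkbox is ticked. Whether any given crawler reads it is a separate question, but the cost of finding out is close to zero.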

Mike King put it this way: Google’s guidance on AI Search is “one opinion, and it’s the opinion of the company with the most to lose.”

In other words, Google’s guide is marketing dressed as documentation. The mechanics it downplays are already shipping in their other products. The tactics it tells you to skip happen to help its AI competitors.

Moreover, by saying “just do good SEO,” Google is hoping that SEO continues to be Google-centric. If the industry accepted that some new category existed outside traditional SEO, publishers might start investing in optimizing their content for platforms other than Google.

Google has always asked us to “pay no attention to that man behind the curtain.” Now it also asks us to ignore the competition knocking at the door.

The Great and Powerful Google Has Spoken!