Google’s new Lighthouse “Agentic Browsing” audits now check for the presence of an llms.txt file. The new experimental Lighthouse documentation frames llms.txt as a discoverability and efficiency signal for AI agents, not a traditional crawling directive.
- The audits are part of Chrome’s emerging “Agentic Browsing” category, which evaluates whether sites are structured for machine interaction.
- This document comes less than a week after Google published new guidance on optimizing for AI search features like AI Overviews and AI Mode. A mythbusting section of that guide says you don’t need llms.txt files.
What Lighthouse now checks. Lighthouse’s Agentic Browsing category evaluates “how well your website is built for machine interaction” using deterministic audits, according to Google’s documentation. Among the checks:
- WebMCP integration.
- Accessibility tree integrity.
- Layout stability via CLS.
- Presence of an llms.txt file.
Lighthouse checks for “the presence of a machine-readable summary at the domain root.” Google also explained why the file matters for agents:
“Without llms.txt, agents may spend more time crawling the site to understand its high-level structure and primary content.”
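To make the “domain root” requirement concrete, here is a minimal Python sketch of a presence check. It is not Lighthouse’s implementation, which the documentation quoted here does not describe. The helper names (`llms_txt_url`, `fetch_status`, `has_llms_txt`) are hypothetical, and treating an HTTP 200 response as “present” is an assumption.

```python
from urllib.error import HTTPError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request, urlopen


def llms_txt_url(page_url: str) -> str:
    """Return the llms.txt location at the domain root for any page URL."""
    parts = urlsplit(page_url)
    # Drop the page path, query string, and fragment; keep scheme and host.
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Hypothetical helper: HTTP status for url, or 0 if the request fails."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # e.g. 404 when the file does not exist
    except OSError:
        return 0  # DNS failure, timeout, refused connection, and so on


def has_llms_txt(page_url: str, fetch=fetch_status) -> bool:
    """Assumption: a 200 response at the domain root counts as 'present'."""
    return fetch(llms_txt_url(page_url)) == 200
```

Calling `has_llms_txt("https://example.com/products/shoe")` would request `https://example.com/llms.txt`. The `fetch` parameter is injectable so the logic can be exercised without network access.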
The audit category doesn’t produce a traditional Lighthouse score (0-100). Instead, Google surfaces a fractional pass ratio along with pass/fail checks tied to agentic readiness signals.
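As a rough illustration of how a fractional pass ratio differs from a 0-100 score, the sketch below formats a set of pass/fail results as “passed/total”. The check identifiers and the pass/fail values are invented for this example, and the exact format Lighthouse displays is an assumption.

```python
def pass_ratio(results: dict[str, bool]) -> str:
    """Format pass/fail results as an unreduced 'passed/total' fraction."""
    if not results:
        raise ValueError("at least one audit result is required")
    passed = sum(1 for ok in results.values() if ok)
    return f"{passed}/{len(results)}"


# Illustrative only: the four checks are the ones named in the article,
# but these keys and outcomes are hypothetical.
example_results = {
    "webmcp-integration": False,
    "accessibility-tree-integrity": True,
    "layout-stability-cls": True,
    "llms-txt-present": True,
}

print(pass_ratio(example_results))  # prints 3/4
```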
The tension. The new Lighthouse documentation doesn’t directly conflict with Google’s advice on optimizing your website for generative AI features because these audits focus on AI agents and browser tools, not Google Search rankings. Still, seeing llms.txt mentioned in Chrome’s own readiness checks may cause some SEOs to reconsider earlier doubts about the file.
Agentic engine optimization. The Lighthouse audits also align with ideas Google Cloud AI engineering director Addy Osmani outlined in April around Agentic Engine Optimization. Osmani said AI agents with limited context windows may cut off long pages or miss important information buried too deep in content. Among his recommendations:
- Cleaner semantic structure.
- Token-efficient content.
- Markdown delivery.
- llms.txt discovery layers.
- Capability signaling files like AGENTS.md.
SEO vs. llms.txt. Here’s exactly what Google recommends in “Mythbusting generative AI search: what you don’t need to do”:
- LLMS.txt files and other “special” markup: You don’t need to create new machine readable files, AI text files, markup, or Markdown to appear in generative AI search. Note that Google may discover, crawl, and index many kinds of files in addition to HTML on a website: this doesn’t mean that the file is treated in a special way.
Here’s what Google’s John Mueller said about Google using llms.txt, in response to Lily Ray asking him on Bluesky “Hey @johnmu.com – if you can answer, many of us are pointing out the irony that Google uses LLMs.txt files, plus markdown pages, despite also saying these things aren’t needed for performance in search. Could you share why Google might publish these files, if not to make crawling these pages/sites easier for agents? (I’m sure I’ll be getting this question a ton soon!)”:
The short answer is that it’s not done for search. There’s more to websites than just SEO :-).
The longer & nuanced version is that it’s worth separating “discovery” (finding the website or pages with a global search engine) vs “functionality” (there’s probably a more accurate term for this, but basically: once someone has found the page, helping them to best do the task they want to do).
Perhaps that’s similar to CTA’s on traditional pages? You don’t “do them” for SEO (to be found), but if you’re responsible for the website overall, ensuring a high “discovery rate” (SEO) together with a high conversion rate is useful to justify your work.
To get back to the developers.google.com site, AI coding has gotten very popular, and these coding systems can be (I think) efficient and accurate with the code they produce if they can easily read / parse reference material, such as developer documentation.
In those cases, it can help to give them a way to understand the context of the documentation they’re [looking at], as well as a simplified version of the reference page (eg, in markdown). OF COURSE they can read HTML just fine, so this is imo more of a temporary crutch, perhaps to save tokens.
For non-developer sites, I don’t think this makes much sense, even with more agentic traffic in the future (and if you check your logs, you’re not getting a lot of that at the moment). Making a markdown version of a shoe’s specs is not going to get you more sales (competitors appreciate it tho).
And (I know, nobody reads this far), if you think this is important to prepare for when agents are everywhere: your site (all sites) have much more important things to do for SEO than to prepare for a potential future situation that may or may not come. Prioritize needs before wants.
What Google says agents rely on. Beyond llms.txt, Google’s new Lighthouse category strongly emphasizes accessibility and interface stability. The documentation says agents rely on the accessibility tree as their “primary data model.” Lighthouse specifically evaluates:
- Programmatic labels for interactive elements.
- Valid accessibility tree structure.
- Whether interactive content is hidden from assistive systems.
- Layout stability via CLS.
Google also warns that dynamically registered WebMCP tools and large DOM changes can affect audit results.
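For a sense of what a “programmatic labels” check involves, the following Python sketch flags interactive elements in static HTML that expose no obvious accessible name. It is a deliberately simplified assumption of the idea, not Lighthouse’s audit: real accessible-name computation runs against the browser’s accessibility tree and covers many more cases. The class and function names (`LabelAudit`, `find_unlabeled`) are hypothetical.

```python
from html.parser import HTMLParser


class LabelAudit(HTMLParser):
    """Collect interactive elements with no accessible name (simplified)."""

    NAMED_BY_TEXT = {"button", "a"}  # name may come from inner text
    NAMED_BY_LABEL = {"input", "select", "textarea"}  # name may come from <label for>

    def __init__(self):
        super().__init__()
        self.label_targets = set()  # ids referenced by <label for="...">
        self.pending_fields = []    # (tag, id) of fields without ARIA names
        self.unlabeled = []         # tags of elements with no name found
        self._open = []             # stack of [tag, has_name] for button/a

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        has_aria = bool(attrs.get("aria-label") or attrs.get("aria-labelledby"))
        if tag == "label" and attrs.get("for"):
            self.label_targets.add(attrs["for"])
        elif tag in self.NAMED_BY_LABEL:
            # Hidden inputs are not interactive, so they need no label.
            if not has_aria and attrs.get("type") != "hidden":
                self.pending_fields.append((tag, attrs.get("id")))
        elif tag in self.NAMED_BY_TEXT:
            self._open.append([tag, has_aria])
        elif tag == "img" and attrs.get("alt") and self._open:
            self._open[-1][1] = True  # alt text names the enclosing control

    def handle_data(self, data):
        if self._open and data.strip():
            self._open[-1][1] = True  # visible text names the control

    def handle_endtag(self, tag):
        if self._open and self._open[-1][0] == tag:
            _, has_name = self._open.pop()
            if not has_name:
                self.unlabeled.append(tag)


def find_unlabeled(html: str) -> list[str]:
    """Return tag names of interactive elements lacking a programmatic label."""
    audit = LabelAudit()
    audit.feed(html)
    audit.close()
    # Form fields are resolved last because a <label for> may follow the field.
    for tag, field_id in audit.pending_fields:
        if field_id not in audit.label_targets:
            audit.unlabeled.append(tag)
    return audit.unlabeled
```

An icon-only `<button><svg></svg></button>` would be flagged, while the same button with `aria-label="Close"` would pass. A field such as `<input type="search">` with no `id`, `<label>`, or ARIA attribute would also be flagged.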
Why we care. Google says you don’t need llms.txt for Search, but Chrome is now checking whether the file exists. At the same time, Google’s agentic tools appear to favor sites that are easier for machines to read and use, especially sites with strong accessibility, stable layouts, and clear agent access.
Google’s help doc. Lighthouse agentic browsing scoring
Search Engine Land is owned by Semrush. We remain committed to providing high-quality coverage of marketing topics. Unless otherwise noted, this page’s content was written by either an employee or a paid contractor of Semrush Inc.
