
What agents actually reach

ora research · Jun 16, 2026 · 5 min read

We watched agents work. Not in theory, not in a benchmark built to flatter the formats we ship for them - just real episodes of agents landing on websites and trying to get things done.

One question: what do they actually reach, in what order, and what do they do with it? The answer is a little uncomfortable for anyone who’s spent the last year shipping llms.txt files.

Agents lean hard on the human web. They reach for homepages and docs - the same pages a person would, in the same order. The purpose-built, agent-first files get used far less. But where those files exist and get reached, they convert better than anything else.

The bottleneck isn’t usefulness. It’s discovery. Here’s the evidence.

finding 01

Everything that isn’t docs falls off a cliff

How often each resource got used, once it existed on a site:

Usage rate when present, across all sites:

  docs pages                 88%
  homepage                   84%
  --- the cliff ---
  llms.txt                   46%
  .well-known/*              34%
  openapi.json               17%
  robots.txt / sitemap.xml    9%
  agents.md                   8%
  llms-full.txt               5%

Docs and homepages live in one tier. Everything we built specifically for machines lives in another, far below. The drop from documentation to the rest is a cliff, not a slope.
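One way to read "usage rate when present": among runs on sites where a resource exists, the share of runs in which the agent used it. The sketch below assumes a hypothetical per-run log with `present` and `used` sets. The `Run` schema and the sample data are illustrative, not the study's actual pipeline or numbers.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Run:
    """One agent episode on one site (hypothetical schema)."""
    present: set[str] = field(default_factory=set)  # resources the site serves
    used: set[str] = field(default_factory=set)     # resources the agent used


def usage_rate_when_present(runs: list[Run]) -> dict[str, float]:
    """Share of runs that used each resource, counting only runs where it existed."""
    present_counts = Counter()
    used_counts = Counter()
    for run in runs:
        present_counts.update(run.present)
        # Ignore "usage" of anything the site did not actually serve.
        used_counts.update(run.used & run.present)
    return {name: used_counts[name] / n for name, n in present_counts.items()}


# Illustrative data only.
runs = [
    Run(present={"homepage", "docs", "llms.txt"}, used={"homepage", "docs"}),
    Run(present={"homepage", "docs"}, used={"homepage", "docs"}),
    Run(present={"homepage", "llms.txt"}, used={"homepage", "llms.txt"}),
]
rates = usage_rate_when_present(runs)
```

Dividing by presence rather than by total runs is what keeps a rarely shipped file from looking unused simply because few sites have it.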

finding 02

But agent-first formats win when they get a shot

Here’s the twist. This isn’t agents preferring prose. On the one site in the set that shipped the full stack - and linked it properly - the numbers invert:

The inversion: all sites → full-stack site

  openapi.json     17% → 72%   (+55 pts when reached)
  .well-known/*    34% → 63%   (+29 pts when reached)
  agents.md         8% → 42%   (+34 pts when reached)

Usage rate when reached: across all sites, versus the one site that ships the full stack and links it.

Same files. openapi.json alone went from ~17% to ~72%. The format was never the problem.

Reachability was.
TRACE: ora.ai, a developer asks to integrate and find setup docs. ora.ai ships the full stack and links it - so the agent walks home, openapi and llms.txt, then fans out into .well-known/mcp, .well-known/agent-card, agents.md, llms-full.txt and docs in a single clean run. 17 steps, 1 search, all reached.

finding 03

Agents search to find their way, not the answer

Most homepage and docs visits came from prior brand knowledge - the agent already knew where it was going.

Web search showed up in ~38% of runs. But when it did, it mostly handed the agent a URL, not a fact. The task still got completed from on-site content.

Search is the map. The site is the destination.
TRACE: attio.com, evaluating the product and finding pricing: 38 steps, 6 searches, interleaved between docs and help-reference pages. Search keeps pointing the agent at the next URL - docs and help pages - while the answer gets assembled from what’s on-site.

finding 04

AI-native files are reached late, and only by following links

Order of arrival, by turn:

  turn 1.0   homepage
  turn 1.7   docs
  turn 3.0   llms.txt / openapi.json
  turn 4.6   agents.md
  turn 6.7   llms-full.txt

Familiar pages first. Structured files many turns later - and only if something linked to them.

A file nothing points to gets reached late, or never. An llms.txt no page references isn’t a fallback. It’s a dead end.
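A quick way to spot dead ends is to check whether a page agents already hit actually links to each agent-first file. This is a minimal sketch using only the Python standard library. The function name, the file list and the sample HTML are assumptions for illustration, and it only inspects static `<a>` and `<link>` tags on a single page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects href values from <a> and <link> tags in static HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("a", "link"):
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def unlinked_files(html: str, base_url: str, files: list[str]) -> list[str]:
    """Return the entries of `files` that no same-site link on the page points to."""
    collector = LinkCollector()
    collector.feed(html)
    base_host = urlparse(base_url).netloc
    linked = set()
    for href in collector.hrefs:
        target = urlparse(urljoin(base_url, href))  # resolve relative links
        if target.netloc == base_host:
            linked.add(target.path)
    return [path for path in files if path not in linked]


# Illustrative page, not taken from any site in the study.
page = """
<nav><a href="/docs">Docs</a> <a href="/llms.txt">llms.txt</a></nav>
<link rel="alternate" href="https://example.com/openapi.json">
"""
missing = unlinked_files(
    page, "https://example.com/", ["/llms.txt", "/openapi.json", "/agents.md"]
)
```

Here `missing` holds `/agents.md`, the one file nothing on the page references. Running the same check over the homepage and the docs index covers the two pages agents reach first.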

finding 05

The homepage is the gateway, and the easiest place to break a run

The homepage was reached in ~84% of runs, almost always (~95% of the time) on turn one. It’s the front door for nearly every session.

Which makes it the most dangerous place to fail. Hide the navigation behind JavaScript and the agent goes blind from step one. Everything downstream depends on clearing this gate.
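To approximate what an agent sees before any script runs, one option is to list the link targets present in the raw HTML response. The sketch below assumes the agent does not execute JavaScript, which will not hold for every agent. The class and function names and both sample pages are hypothetical.

```python
from html.parser import HTMLParser


class StaticAnchors(HTMLParser):
    """Collects <a href> targets visible in raw HTML, before any script runs."""

    def __init__(self) -> None:
        super().__init__()
        self.targets: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Skip anchors that only do something once JavaScript is running.
        if href and not href.startswith(("#", "javascript:")):
            self.targets.append(href)


def static_nav_targets(html: str) -> list[str]:
    """Return the link targets a non-JavaScript client could follow."""
    parser = StaticAnchors()
    parser.feed(html)
    return parser.targets


# Two illustrative homepages: one server-rendered, one built entirely by a script bundle.
server_rendered = '<nav><a href="/docs">Docs</a><a href="/pricing">Pricing</a></nav>'
script_rendered = '<div id="root"></div><script src="/app.js"></script>'

visible = static_nav_targets(server_rendered)
hidden = static_nav_targets(script_rendered)
```

An empty result for the script-rendered page is the "blind from step one" case: the response contains nothing to follow.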

finding 06

Agents guess standard paths, so meet them where they look

Agents probe conventional URLs from habit. /pricing and /integrations get hit early, and they work when they exist.

But habit runs ahead of reality:

/api - guessed in ~11% of runs, found in ~2%.

Agents expect a convention most sites don’t serve. The fix is the cheapest one on this list: serve the paths agents already reach for.
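Checking which conventional paths a site serves can be scripted. This is a rough sketch, assuming a 2xx response means "served". The path list comes from the findings above, while `missing_paths`, `http_status` and the offline stand-in site are illustrative names. Some servers reject HEAD requests, so a real check may need GET.

```python
from typing import Callable
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import Request, urlopen

# Paths named in the findings above; extend as needed.
CONVENTIONAL_PATHS = ["/api", "/pricing", "/integrations"]


def http_status(url: str) -> int:
    """Return the HTTP status for a HEAD request, or 0 if the request failed."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=10) as response:
            return response.status
    except HTTPError as err:
        return err.code  # 4xx and 5xx responses arrive as exceptions
    except OSError:
        return 0  # DNS failure, refused connection, timeout


def missing_paths(base_url: str, paths: list[str],
                  fetch_status: Callable[[str], int] = http_status) -> list[str]:
    """Return the paths that do not answer with a 2xx status."""
    missing = []
    for path in paths:
        status = fetch_status(urljoin(base_url, path))
        if not 200 <= status < 300:
            missing.append(path)
    return missing


# Offline stand-in for a real site, so the example runs without network access.
fake_site = {
    "https://example.com/pricing": 200,
    "https://example.com/integrations": 200,
}
gaps = missing_paths("https://example.com", CONVENTIONAL_PATHS,
                     fetch_status=lambda url: fake_site.get(url, 404))
```

With the stand-in site, `gaps` contains only `/api`, mirroring the guessed-but-not-found pattern described above. Dropping the `fetch_status` argument runs the same check against a live site.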

TRACE: telnyx.com, a developer finding API docs and auth. The agent reaches docs, then llms.txt, and branches into .well-known, oauth and llms development pages. The run succeeds via the llms.txt branch - but the agent also probes .well-known/api-catalog on spec; when the convention is missing, that guess is a wasted reach.

finding 07

Whatever an agent reaches, it uses, so wrong content is worse than none

Once a file is fetched, it shapes the answer almost every time. Agents build from the first believable page they land on. That cuts both ways:

  + Good content gets used, readily.

  ! A reachable page with stale or wrong info hurts more than no page at all, because the agent will trust it and run.

Accuracy isn’t hygiene. It’s load-bearing.
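One cheap proxy for staleness is to flag agent-facing files that were last updated well before the pages they describe. The sketch below assumes you already have a last-modified date per path, from a CMS, git history or response headers. The 30-day threshold, the function name and the dates are arbitrary illustrations, and a recent date does not prove the content is correct.

```python
from datetime import date, timedelta


def stale_files(last_modified: dict[str, date], reference: str,
                max_lag: timedelta = timedelta(days=30)) -> list[str]:
    """Return files last updated more than `max_lag` before the reference page."""
    cutoff = last_modified[reference] - max_lag
    return sorted(path for path, updated in last_modified.items()
                  if path != reference and updated < cutoff)


# Illustrative dates only.
updated = {
    "/docs": date(2026, 6, 1),
    "/llms.txt": date(2026, 5, 20),
    "/openapi.json": date(2025, 11, 3),
}
lagging = stale_files(updated, reference="/docs")
```

In this example only `/openapi.json` is flagged, since it trails the docs by months while `/llms.txt` is within the window.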

the one takeaway

Discovery, not merit

Agents behave like well-informed visitors: they arrive at the front door from memory, fan out to docs, and follow links from there. The files we build for them are genuinely effective. They just lose on discovery, not on merit.

So the work isn’t only better machine-readable formats. It’s making them reachable:

  1. Link to them from the pages agents already hit.
  2. Serve the conventional paths agents already guess - /api, /pricing, /integrations.
  3. Keep the homepage navigable without JavaScript.
  4. Treat every reachable page as gospel - because to an agent, it is.

Meet agents where they already look, and the formats built for them finally get their shot.
