TL;DR
robots.txt is the access-control layer: it tells crawlers what they may and may not fetch and, via Content-Signal, how AI systems may use what they fetch. Compliance is voluntary, so it only binds cooperative crawlers. llms.txt is a guide: a Markdown file that describes your site and points agents to key content. One restricts; the other describes. robots.txt is a long-standing standard; llms.txt is an optional, emerging convention.
| Aspect | robots.txt | llms.txt |
|---|---|---|
| Purpose | Control crawler access | Describe & link content for agents |
| Format | Directives (User-agent, Allow, Disallow) | Markdown |
| Status | Established standard | Emerging, optional |
| Required by Google? | Honored for crawl control | No, not required |
| Location | /robots.txt | /llms.txt |
They are complementary, not alternatives. You can and often should have both.
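Both files live at the site root, as the Location row shows. A minimal Python sketch of deriving the two URLs from any page on a site follows; the function name `root_file_urls` and the example.com address are invented for illustration.

```python
from urllib.parse import urljoin


def root_file_urls(page_url):
    """Return the root-level robots.txt and llms.txt URLs for the site of page_url."""
    # urljoin with an absolute path replaces the page's path and drops its query string.
    return {
        "robots.txt": urljoin(page_url, "/robots.txt"),
        "llms.txt": urljoin(page_url, "/llms.txt"),
    }


urls = root_file_urls("https://example.com/blog/post?x=1")
```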
robots.txt
The decades-old file that governs crawling. Some sites now add a newer, not yet universally supported Content-Signal line to state separate permissions for search indexing, AI retrieval and AI training. This is your access-control layer.
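To make the access-control layer concrete, here is a minimal sketch in Python. The robots.txt content, the example.com URLs and the crawler names ExampleBot and SomeCrawler are invented for illustration. The standard-library `urllib.robotparser` handles User-agent and Disallow rules but skips lines it does not recognise, so the Content-Signal values are read by a small hypothetical helper, `parse_content_signals`. That helper assumes a comma-separated `key=value` syntax and, as a simplification, ignores which User-agent group the line belongs to.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt; the paths, bot name and signal values are assumptions.
ROBOTS_TXT = """\
User-agent: *
Content-Signal: search=yes, ai-input=yes, ai-train=no
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""


def parse_content_signals(text):
    """Collect key=value pairs from Content-Signal lines (robotparser skips them)."""
    signals = {}
    for line in text.splitlines():
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-signal":
            for pair in value.split(","):
                key, _, val = pair.partition("=")
                signals[key.strip()] = val.strip()
    return signals


parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A cooperative crawler asks before fetching each URL.
can_fetch_public = parser.can_fetch("SomeCrawler", "https://example.com/blog/post")
can_fetch_private = parser.can_fetch("SomeCrawler", "https://example.com/private/report")
can_fetch_blocked = parser.can_fetch("ExampleBot", "https://example.com/blog/post")
signals = parse_content_signals(ROBOTS_TXT)
```

In this sketch SomeCrawler may fetch the blog post but not the /private/ path, and ExampleBot is disallowed everywhere. Nothing enforces these answers: a crawler has to run a check like this voluntarily.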
llms.txt
A newer, optional Markdown file that hands AI agents a curated map of your most important content. It does not control access; it aids navigation. Google has said you don't need it to appear in generative search, so treat it as polish.
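As a rough illustration of that "curated map", the sketch below embeds a small llms.txt and groups its links by section. The file content, site name and URLs are hypothetical, and the layout assumed here (an H1 title, a blockquote summary, then H2 sections containing Markdown link lists) follows the commonly described llms.txt proposal rather than anything this page specifies. `parse_llms_txt` is an invented helper, not part of any library.

```python
import re

# Hypothetical llms.txt content for illustration only.
LLMS_TXT = """\
# Example Site

> Guides on configuring sites for crawlers and AI agents.

## Guides

- [What is llms.txt](https://example.com/what-is-llms-txt): Overview of the format
- [robots.txt basics](https://example.com/robots-txt): Crawl rules explained

## Optional

- [Changelog](https://example.com/changelog)
"""

# Matches "- [title](url)" with an optional ": note" suffix.
LINK_RE = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$")


def parse_llms_txt(text):
    """Group (title, url, note) tuples under each H2 section heading."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line.strip())
            if match:
                sections[current].append(
                    (match["title"], match["url"], match["note"] or "")
                )
    return sections


sections = parse_llms_txt(LLMS_TXT)
```

Nothing in this file grants or denies access; an agent that reads it simply gets a shorter path to the pages you consider most important.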
Bottom line
Use robots.txt to set the rules; optionally add llms.txt to make cooperative agents more effective. See "What is llms.txt?" for more detail.
Frequently asked questions
Do I need both llms.txt and robots.txt?
No. robots.txt is the important one: it controls crawling and, via Content-Signal, AI usage. llms.txt is optional polish that helps cooperative agents navigate. Having both is fine; having only robots.txt is sufficient.
Can llms.txt block AI crawlers?
No. llms.txt only describes and links content; it has no access-control power. To govern crawlers, use robots.txt with Content-Signal directives.