Markdown for Agents: Serving a Machine-Readable Web

Q: What is Markdown for Agents?

Markdown for Agents is the practice of serving a clean Markdown version of a web page to AI crawlers and agents instead of full HTML. Markdown strips navigation, scripts and layout markup, leaving pure content — which Cloudflare measured as roughly an 80% token reduction (16,180 HTML tokens down to 3,150). Agents request it via the HTTP Accept: text/markdown header, or by fetching a per-page .md URL.

Q: Is serving different content to bots cloaking?

No, when done through content negotiation. Cloaking means showing materially different content to deceive crawlers. Content negotiation serves the same content in a different format based on the client's Accept header — an HTTP standard the web has used for decades. The Markdown is a faithful representation of the HTML page, with a canonical link back to it, so it informs rather than deceives.

Q: How do I serve Markdown versions of my pages?

Three common approaches: generate a per-page .md file at build time (the gold standard — always current, with YAML frontmatter and a canonical link), configure content negotiation at your server or CDN so Accept: text/markdown returns the .md, or enable Cloudflare's Markdown for Agents, which converts HTML to Markdown at the edge on the fly. Add a discovery tag — link rel=alternate type=text/markdown — so agents know the Markdown exists.

When an AI retriever fetches your page, it pays for every token of your HTML — including the navigation, the cookie banner, the analytics scripts and the dozen nested <div>s wrapping a single paragraph. None of that helps it understand or cite you. It's pure overhead, and in a token-metered world, overhead is cost and noise.

There's a clean fix the web is converging on: serve machines Markdown instead.

Why Markdown for machines

Markdown is structure without cruft — headings, lists, tables, links, nothing else. Strip a page to it and the numbers are striking: Cloudflare benchmarked one blog post at 16,180 HTML tokens versus 3,150 in Markdown — about an 80% reduction. A heading that costs ~3 tokens in Markdown costs 12–15 in HTML markup.
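As a quick check on that arithmetic, the sketch below recomputes the reduction from the two token counts quoted above. The function name `token_reduction` is illustrative only and not part of any tool mentioned here.

```python
def token_reduction(html_tokens, markdown_tokens):
    """Return (tokens_saved, fraction_saved) for one page."""
    saved = html_tokens - markdown_tokens
    return saved, saved / html_tokens


# Figures quoted from Cloudflare's benchmark of one blog post.
saved, fraction = token_reduction(16_180, 3_150)
print(f"tokens saved: {saved}")
print(f"reduction: {fraction:.1%}")
```

The same two-line calculation works on your own pages once you have counted tokens for both versions with whatever tokenizer your target model uses.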

For the model, fewer tokens mean cheaper retrieval, cleaner extraction, and less chance of lifting a navigation label instead of your point. For you, it means your content reaches AI in its most quotable form. This is the literal expression of everything GEO (generative engine optimization) asks for: clear structure, no friction between the model and your words.

It's content negotiation, not cloaking

The instinctive worry: "Isn't serving bots something different from what people see a penalty risk?" No, because this is content negotiation, a decades-old HTTP standard. The client sends an Accept header stating the format it wants; the server returns the same content in that format. Cloaking means showing materially different content to deceive a crawler. Serving a faithful Markdown rendering of the same page, with a canonical link back to the HTML, informs rather than deceives. Different format, same truth.

Three ways to serve it

1. Build-time `.md` files (gold standard)

Generate a Markdown sibling for every page at build: /index.md, /about.md, /articles/post.md. Each carries YAML frontmatter (title, description, canonical URL, date) and clean body Markdown. Because it's generated from the same source, it's always current. This is what this site does — a build script walks every HTML page, extracts the content, and writes the .md.
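A minimal sketch of the output side of such a build step, assuming the page's title, description, canonical URL, date and body Markdown have already been extracted. The helper names and the `canonical` frontmatter key are illustrative choices, not this site's actual build script.

```python
from pathlib import Path


def render_markdown_page(title, description, canonical, date, body):
    """Return a Markdown document: YAML frontmatter followed by the body."""
    def quote(value):
        # Double-quote values so colons or hashes in a title stay valid YAML.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    frontmatter = "\n".join([
        "---",
        f"title: {quote(title)}",
        f"description: {quote(description)}",
        f"canonical: {quote(canonical)}",
        f"date: {quote(date)}",
        "---",
    ])
    return f"{frontmatter}\n\n{body.strip()}\n"


def markdown_path_for(html_path):
    """Map an HTML output path to its Markdown sibling (about.html -> about.md)."""
    return Path(html_path).with_suffix(".md")
```

A build loop would call `render_markdown_page` once per page and write the result to `markdown_path_for(...)`, so the Markdown regenerates whenever the HTML does.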

2. Content negotiation at server/CDN

Configure your server so a request with Accept: text/markdown returns the .md for the same URL. The human URL and the agent URL are identical; only the format differs, depending on what the client asks for. (That's how this site is wired: curl -H "Accept: text/markdown" https://aiovsseo.com/ returns Markdown.)
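The decision logic can be sketched independently of any particular server or CDN. The version below is one possible policy, assumed for illustration: Markdown is returned only when text/markdown is listed explicitly with a weight at least as high as text/html, so browsers that send a wildcard still get HTML. All function names are hypothetical.

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_type, q) pairs."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0  # HTTP default weight when no q parameter is given
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    return entries


def wants_markdown(accept_header):
    """True when text/markdown is listed and weighted at least as high as text/html."""
    weights = dict(parse_accept(accept_header or ""))
    markdown_q = weights.get("text/markdown", 0.0)
    html_q = weights.get("text/html", 0.0)
    return markdown_q > 0 and markdown_q >= html_q


def negotiate(accept_header, html_body, markdown_body):
    """Return (headers, body) for one URL, choosing the representation by Accept."""
    if wants_markdown(accept_header):
        content_type, body = "text/markdown; charset=utf-8", markdown_body
    else:
        content_type, body = "text/html; charset=utf-8", html_body
    # Vary: Accept tells caches the response depends on the request's Accept header.
    return {"Content-Type": content_type, "Vary": "Accept"}, body
```

Sending `Vary: Accept` matters in practice: without it, a shared cache could hand the Markdown response to a browser, or the HTML response to an agent, for the same URL.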

3. Cloudflare "Markdown for Agents"

If you're on Cloudflare, the Markdown for Agents feature (2026) converts HTML to Markdown at the edge on the fly when an agent sends Accept: text/markdown — or when it appends .md / index.md to the URL. It's a zone-level toggle, free in beta on Pro/Business/Enterprise. Zero build work; the conversion happens on Cloudflare's network, not your origin.

Make it discoverable

Tell agents the Markdown exists with a discovery tag in your HTML <head>:

<link rel="alternate" type="text/markdown" href="/articles/post.md">

This pairs with your root llms.txt (a curated map) and your Content-Signal rules. Keep the Markdown clean — no nav, no scripts — and keep a consistent heading hierarchy so retrievers extract whole chunks.
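From the agent's side, discovery might look like the sketch below: parse the HTML head, find the alternate link with type text/markdown, and resolve it against the page URL. It uses only the Python standard library; the class and function names are hypothetical, and real agents may implement discovery differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MarkdownAlternateFinder(HTMLParser):
    """Collect href values from <link rel="alternate" type="text/markdown"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = {name: (value or "") for name, value in attrs}
        rel_tokens = attributes.get("rel", "").lower().split()
        is_markdown = attributes.get("type", "").lower() == "text/markdown"
        if "alternate" in rel_tokens and is_markdown and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def find_markdown_alternate(html, page_url):
    """Return the absolute URL of the first advertised Markdown alternate, or None."""
    finder = MarkdownAlternateFinder()
    finder.feed(html)
    return urljoin(page_url, finder.hrefs[0]) if finder.hrefs else None
```

Running the same function over your own built pages is also a cheap way to verify that every page actually advertises its Markdown sibling.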

Translate at build, multiply your reach

Once a build step is generating Markdown, it's a small leap to generate translations too. A script at build/deploy can run each page (or its Markdown) through translation and emit localized versions — /fr/…, /es/… — paired with hreflang tags so search engines map them. Because AI answer engines work per-language and largely independently, a credible translated corpus is a cheap way to become citable in markets your English pages never reach. Two cautions: don't ship raw machine translation for high-stakes or YMYL content without review, and keep one source of truth so translations regenerate cleanly rather than drifting.
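The hreflang pairing can be generated in the same build step. The sketch below assumes the default language lives at the site root and every other language under a /<lang>/ prefix, as in the /fr/ and /es/ examples above; the function name, the `example.com` domain and the choice of x-default target are illustrative assumptions.

```python
from html import escape


def hreflang_links(base_url, path, languages, default_language="en"):
    """Build <link rel="alternate" hreflang=...> tags for a page and its translations."""
    base = base_url.rstrip("/")
    clean_path = "/" + path.lstrip("/")
    tags = []
    for language in languages:
        # Assumption: default language at the root, others under /<lang>/.
        prefix = "" if language == default_language else f"/{language}"
        href = f"{base}{prefix}{clean_path}"
        tags.append(
            f'<link rel="alternate" hreflang="{escape(language)}" href="{escape(href)}">'
        )
    # x-default marks the version to use when no listed language matches.
    tags.append(
        f'<link rel="alternate" hreflang="x-default" href="{escape(base + clean_path)}">'
    )
    return tags


for tag in hreflang_links("https://example.com", "articles/post", ["en", "fr", "es"]):
    print(tag)
```

Every language version of a page should carry the full set of tags, including one pointing at itself, so generating them from a single language list keeps the versions consistent.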

Where `.well-known` fits

As these machine surfaces multiply (Markdown endpoints, translations, an API, an MCP server), agents need a predictable place to discover them. Today the conventions are the per-page discovery tag and llms.txt at the root; tomorrow more of this advertising may consolidate under /.well-known/, alongside the emerging /.well-known/mcp.json manifest. The principle is the one behind RFC 8615, which standardizes well-known URIs: give machines a known door to learn how to consume you. Markdown is simply a cleaner thing to hand them once they knock.

HTML answered one question for twenty years: how should this look to a person? Markdown answers the new one: how should this read to a machine? In 2026 you need both answers — from one source of truth.

Frequently asked questions

What is Markdown for Agents?

Serving a clean Markdown version of a page to AI crawlers and agents instead of full HTML. Markdown strips navigation, scripts and layout, leaving pure content — Cloudflare measured ~80% fewer tokens (16,180 HTML → 3,150 Markdown). Agents request it via Accept: text/markdown or by fetching a per-page .md URL.

Is serving different content to bots cloaking?

No, when done via content negotiation. Cloaking shows materially different content to deceive crawlers. Content negotiation serves the same content in a different format based on the Accept header — a long-standing HTTP standard. The Markdown faithfully represents the HTML, with a canonical link back, so it informs rather than deceives.

How do I serve Markdown versions of my pages?

Three ways: generate per-page .md at build time (gold standard — always current, with frontmatter and canonical), configure content negotiation at your server/CDN so Accept: text/markdown returns the .md, or enable Cloudflare's Markdown for Agents (edge conversion). Add a link rel=alternate type=text/markdown discovery tag so agents know it exists.
