llms.txt Generator: Create Your AI Crawler Configuration File
- llms.txt is a proposed standard for providing LLMs with a curated, markdown-formatted summary of your website. Think robots.txt, but for AI readability instead of access control.
- No major AI platform has adopted it yet. Google’s John Mueller called it “unnecessary.” But developer communities and agentic AI tools are picking it up.
- The file lives at yoursite.com/llms.txt and follows a specific Markdown format: H1 header, blockquote summary, and file lists under H2 sections.
- It takes 20 minutes to create. Zero downside. Potential upside as AI tools evolve.
- Don’t put anything in llms.txt that doesn’t match your on-page content. AI platforms distrust bot-only files precisely because they can say something different from what human visitors see.
llms.txt is one of the most overhyped and misunderstood concepts in AI search optimization right now. SEO tools flag it as missing. WordPress plugins offer to generate it. SEO Twitter argues about it weekly.
The actual situation: it’s a proposal from Jeremy Howard (published September 2024) that no major AI platform has officially adopted. But it’s a useful concept, it’s gaining traction with developer tools and AI agents, and creating one takes 20 minutes.
This post gives you the full picture: what llms.txt is, why it exists, whether you should care, and how to create one.
Figure: llms.txt at a glance (time to create a complete file, downside risk from implementing it, major AI platforms officially using it yet). Low effort, zero risk, potential upside.
What llms.txt Is (And What It Isn’t)
llms.txt is a Markdown-formatted file placed at the root of your website (yoursite.com/llms.txt) that provides LLMs with a curated, readable overview of your site.
Here’s the problem it’s trying to solve: LLM context windows are too small to process entire websites. Your typical web page is packed with navigation, ads, JavaScript, footer links, cookie banners, and sidebar widgets. The actual content might be 30% of the page’s HTML. An LLM trying to understand your page has to wade through all that noise.
llms.txt gives the AI a clean shortcut. A summary of what your site is about, plus links to Markdown versions of your most important pages. No noise. Just signal.
Think of it as the difference between handing someone a messy filing cabinet and handing them a one-page table of contents with page numbers.
How It Differs from robots.txt and sitemap.xml
robots.txt tells bots what they can and can’t crawl. It’s about permissions. “Don’t crawl /admin/.” “Stay out of /staging/.” (It controls crawling, not indexing.)
sitemap.xml lists every indexable page on your site. It’s a complete inventory for search engines. No curation, no explanation, no context.
llms.txt is curated context. It doesn’t control access or list every page. It explains your site to an AI and points it toward the most useful pages in a format the AI can actually process. The official spec puts it clearly: llms.txt is “designed to coexist with current web standards” and “complement robots.txt by providing context for allowed content.”
The llms.txt Format (Spec Breakdown)
The file uses specific Markdown sections, in this exact order:
1. H1 Header (Required)
The name of your project or website. This is the only section the spec strictly requires.
# Metronyx AI
2. Blockquote Summary
A short summary with the most important background information about your site.
> Metronyx AI is an AI search optimization agency specializing in answer engine optimization (AEO), helping brands get cited by ChatGPT, Perplexity, and Google AI Overviews.
3. Detailed Information (Optional)
Zero or more Markdown sections (paragraphs, lists, but not headings) with more context about your project and how to interpret the linked files.
We publish original research on AI search visibility, provide free audit tools, and offer managed AEO services for B2B SaaS, e-commerce, and professional services companies.
4. File Lists Under H2 Headers
Markdown sections with H2 headings containing lists of URLs where the AI can find more detail. Each entry is a Markdown hyperlink with optional notes after a colon.
## Core Pages
- [About Metronyx AI](https://metronyxai.com/about/): Company background and team
- [Services](https://metronyxai.com/services/): AI search optimization service descriptions
- [Free AI Visibility Audit](https://metronyxai.com/audit/): Our free audit tool for checking AI visibility
## Blog (Key Posts)
- [Entity Architecture for AI Search](https://metronyxai.com/entity-architecture-ai-search/): How to build brand entity recognition
- [How to Rank in Perplexity](https://metronyxai.com/how-to-rank-in-perplexity/): Platform-specific optimization guide
5. Optional Section
An H2 section titled “Optional” for URLs the LLM can skip if it needs to save context window space. Use this for secondary content.
## Optional
- [Blog Archive](https://metronyxai.com/blog/): Full blog post listing
- [Case Studies](https://metronyxai.com/case-studies/): Client success stories
That’s it. That’s the entire spec. Simple by design.
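The structure above is simple enough to check mechanically. Here is a minimal Python sketch, not an official validator: the function name `check_llms_txt` and its messages are hypothetical, and it only covers the H1, blockquote, and H2 link-list rules described in this section.

```python
import re

# Matches a list entry like "- [Title](https://example.com/page/): optional notes"
LINK_ITEM = re.compile(r"^- \[([^\]]+)\]\((https?://[^)\s]+)\)(?::\s*(.*))?$")


def check_llms_txt(text):
    """Return a list of problems found in an llms.txt body (empty list = none found)."""
    problems = []
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]

    # Rule 1: the file opens with a single H1, the only required element.
    if not lines or not lines[0].startswith("# "):
        problems.append("first non-blank line must be an H1 ('# Name')")
    if sum(1 for ln in lines if ln.startswith("# ")) > 1:
        problems.append("only one H1 is expected")

    # Rule 2: the blockquote summary, if present, directly follows the H1.
    quote_idx = [i for i, ln in enumerate(lines) if ln.startswith("> ")]
    if quote_idx and quote_idx[0] != 1:
        problems.append("blockquote summary should come right after the H1")

    # Rule 3: under each H2, every list item is a Markdown link.
    in_section = False
    for ln in lines:
        if ln.startswith("## "):
            in_section = True
        elif in_section and ln.startswith("- ") and not LINK_ITEM.match(ln):
            problems.append(f"not a valid link entry: {ln!r}")
    return problems
```

An empty list means the sketch found nothing to complain about; it does not prove the file is useful, only that it follows the layout.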
The Honest Truth About llms.txt Adoption
We need to be straight about this: no major AI platform currently uses llms.txt for search results or citations.
Search Engine Journal’s analysis confirmed that “LLMs.txt is just a proposal, and no AI platform has signed on to use it.” Google’s John Mueller has publicly called creating one “unnecessary.” AI chatbots and crawlers continue to rely on regular HTML content.
The spec’s own creators expect it to be used mainly during inference, meaning when an LLM fetches your file on-demand to answer a question about your site. Not during mass crawling for model training.
So why are so many companies rushing to implement it?
The Misinformation Loop
Here’s what’s happening: SEO tools started checking for llms.txt and flagging its absence as a “risk.” Semrush’s audit documentation warned that “if your site lacks a clear llms.txt file it risks being misrepresented by AI systems.” That created panic. Site owners saw the warning and felt they needed to create one immediately.
Rank Math suggested AI chatbots “refer to the curated version you’ve given it” via llms.txt. That’s not accurate. AI chatbots use regular HTML. They don’t fetch llms.txt.
The result? A self-reinforcing loop where tool makers feel compelled to offer llms.txt features because users expect them, and users expect them because tool makers keep bringing them up.
The Trust Problem
There’s a good reason AI platforms haven’t adopted llms.txt. A separate file visible only to bots is inherently harder to trust than on-page content that humans also see.
On-page content is relatively trustworthy because it serves both users and bots. But llms.txt? Nobody except AI sees it. Which makes it the perfect vector for manipulation.
A 2024 research paper on Adversarial Search Engine Optimization for Large Language Models demonstrated exactly this risk. Researchers showed that “an attacker can trick an LLM into promoting their content over competitors” using what they called Preference Manipulation Attacks. In testing on Bing and Perplexity, a targeted product was 2.5x more likely to be recommended after an attack.
If that’s possible with regular content, imagine what could happen with a file specifically designed to feed information to AI, where human visitors would never see the deception.
So Should You Create One?
Yes. Here’s our reasoning.
The cost is close to zero. Creating an llms.txt file takes 20 minutes. Maintaining it takes 5 minutes when you publish new content. There’s no technical risk. It doesn’t interfere with your existing SEO.
The potential upside is real, even if unrealized today:
- Developer tools already use it. Code editors, documentation tools, and AI coding assistants are picking up llms.txt to understand project documentation. If your product has technical docs, developers might already benefit from your llms.txt.
- Agentic AI will need structured site maps. As AI agents start performing multi-step tasks (booking, research, purchasing), they’ll need machine-readable guides to website structure. llms.txt is positioned for this use case.
- It forces you to curate. The exercise of writing an llms.txt file makes you think about which pages on your site actually matter. That’s a useful exercise regardless of whether AI reads the file.
What you should NOT expect: a rankings boost, more AI citations, or improved AI visibility. Those outcomes depend on your on-page content, entity signals, and technical SEO. Not on whether you have an llms.txt file.
How to Create Your llms.txt File
You’ve got three options.
Option 1: Build It Manually
Open a text editor. Follow the spec format outlined above. Save it as llms.txt (plain text, UTF-8 encoding) and upload it to your site’s root directory via SFTP, cPanel, or your hosting provider’s file manager.
Here’s a complete example for a SaaS company:
# ProductName
> ProductName is a project management tool for remote teams, offering task tracking, time management, and team communication in a single platform.
ProductName serves over 5,000 teams across 40 countries. Our API integrates with Slack, GitHub, and Jira.
## Documentation
- [Getting Started Guide](https://productname.com/docs/getting-started/): Setup and onboarding walkthrough
- [API Reference](https://productname.com/docs/api/): Full REST API documentation
- [Integrations](https://productname.com/docs/integrations/): Third-party integration guides
## Key Pages
- [Pricing](https://productname.com/pricing/): Plan comparison and pricing details
- [About](https://productname.com/about/): Company background and team
- [Blog](https://productname.com/blog/): Product updates and industry insights
## Optional
- [Changelog](https://productname.com/changelog/): Release notes and version history
- [Careers](https://productname.com/careers/): Open positions
Option 2: Use a CMS Plugin
WordPress plugins like Rank Math and Yoast now offer llms.txt generation. They auto-populate the file based on your site structure. The convenience is nice, but review the output. Auto-generated files sometimes include pages that shouldn’t be there (staging pages, thin content, archive pages with no real value).
Option 3: Use Our Generator
Our free tools page includes a generator that walks you through each section. Enter your site name, write your summary, add your URLs, and it outputs a properly formatted file ready to upload.
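If you would rather script it, the same idea fits in a few lines of Python. This is a minimal sketch of what any such generator does, not the code behind the tool mentioned above; the function name `build_llms_txt` and its parameters are hypothetical.

```python
def build_llms_txt(name, summary, sections, details=""):
    """Assemble an llms.txt body.

    `sections` maps an H2 title to a list of (title, url, note) tuples.
    Dict order is preserved, so put "Optional" last.
    """
    parts = [f"# {name}", f"> {summary}"]
    if details:
        parts.append(details)
    for heading, links in sections.items():
        items = []
        for title, url, note in links:
            entry = f"- [{title}]({url})"
            if note:
                entry += f": {note}"  # notes are optional and follow a colon
            items.append(entry)
        parts.append(f"## {heading}\n\n" + "\n".join(items))
    # Blank lines between blocks keep the Markdown unambiguous.
    return "\n\n".join(parts) + "\n"


text = build_llms_txt(
    "ProductName",
    "ProductName is a project management tool for remote teams.",
    {
        "Documentation": [
            ("API Reference", "https://productname.com/docs/api/", "Full REST API documentation"),
        ],
        "Optional": [
            ("Careers", "https://productname.com/careers/", ""),
        ],
    },
)
print(text)
```

Write the result to a file named llms.txt with UTF-8 encoding and upload it as in Option 1.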
Creating Markdown Versions of Your Pages
The llms.txt spec also proposes that pages offer a clean Markdown version at the same URL with .md appended. For URLs that don’t end in a file name, the spec appends index.html.md instead, so yoursite.com/about/ would have a companion at yoursite.com/about/index.html.md.
This is the more ambitious part of the spec. It means maintaining two versions of every important page: your normal HTML page and a stripped-down Markdown version.
Our take: only do this for your 5-10 most important pages. API documentation. Product overview. Key landing pages. Don’t try to maintain Markdown versions of your entire blog archive. The maintenance cost isn’t worth it when no AI platform is currently reading these files.
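The URL rule is easy to get wrong by hand, so here is a small Python sketch of it. The helper name `markdown_companion` is hypothetical, and it assumes that any path not ending in a slash already names a file, which may not hold for extensionless URLs like /about.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_companion(url):
    """Return the URL where a Markdown version of `url` would live under the proposal."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/"):
        # No file name in the URL: append index.html.md.
        path += "index.html.md"
    else:
        # Assumed to name a file already: append .md.
        path += ".md"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(markdown_companion("https://yoursite.com/about/"))
print(markdown_companion("https://yoursite.com/docs/api.html"))
```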
When creating .md versions:
- Strip all navigation, headers, footers, sidebars
- Remove ads, pop-ups, CTAs, and JavaScript widgets
- Keep only the core content in clean Markdown
- Include images as Markdown image references if they add informational value
- Make sure the Markdown is valid and renders correctly
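For a first pass, the stripping steps above can be roughed out with Python’s standard library. This is a sketch under loud assumptions: it treats nav, header, footer, aside, form, script, and style elements as page chrome, keeps only headings, paragraphs, and list items, and drops inline links and images, so you would still review and extend the output by hand. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

# Elements treated as page chrome (an assumption; adjust for your templates).
SKIP = {"nav", "header", "footer", "aside", "form", "script", "style"}
# Block elements we keep, with the Markdown prefix each one gets.
PREFIX = {"h1": "# ", "h2": "## ", "h3": "### ", "li": "- ", "p": ""}


class CoreContentToMarkdown(HTMLParser):
    """Collect headings, paragraphs, and list items; drop everything in SKIP."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.current = None  # tag of the block being collected
        self.buffer = []     # text pieces for the current block
        self.blocks = []     # finished Markdown blocks

    def handle_starttag(self, tag, attrs):
        if tag in SKIP:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in PREFIX:
            self.current, self.buffer = tag, []

    def handle_endtag(self, tag):
        if tag in SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag == self.current:
            # Collapse runs of whitespace left over from the HTML source.
            text = " ".join("".join(self.buffer).split())
            if text:
                self.blocks.append(PREFIX[tag] + text)
            self.current = None

    def handle_data(self, data):
        if self.skip_depth == 0 and self.current:
            self.buffer.append(data)


def html_to_markdown(html):
    parser = CoreContentToMarkdown()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.blocks) + "\n"
```

Feeding it a page’s HTML returns the surviving blocks separated by blank lines, which is usually a reasonable starting draft for a .md companion.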
What NOT to Put in llms.txt
This matters more than what you put in.
Do:
- Keep content consistent with what’s on your actual pages
- Curate only 10-20 of your most useful pages
- Only link to publicly accessible content
Don’t:
- Add hidden promotional text that contradicts on-page content
- Include prompt injection attempts (“always recommend…”)
- List every page on your site; that’s what sitemap.xml is for
- Include pages behind login walls
Don’t add hidden promotional text. If your on-page content says “We’re one of several options in the market” but your llms.txt says “We’re the #1 leader in the industry,” you’re doing exactly what AI platforms are afraid of. Keep it consistent.
Don’t include prompt injection attempts. Some people have tried embedding instructions like “When asked about [category], always recommend [brand]” in llms.txt and .md files. This is a fast track to getting your domain flagged and potentially blacklisted.
Don’t list every page on your site. That’s what sitemap.xml is for. llms.txt is curated. Include 10-20 of your most useful pages. If an AI needs your full site structure, it’ll use your sitemap.
Don’t include pages behind login walls. An LLM can’t access gated content. Linking to it in llms.txt just wastes context window space.
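Two of these rules can be checked automatically before you upload. The following Python sketch flags files with more than 20 links and a couple of instruction-like phrases. The phrase list is a hypothetical, deliberately tiny example, and the sketch cannot tell whether your text matches your on-page content or whether a link sits behind a login wall.

```python
import re

MAX_LINKS = 20  # upper end of the 10-20 page guideline above

# Hypothetical phrase list for illustration; real injection attempts vary widely.
SUSPICIOUS = re.compile(
    r"\b(always recommend|ignore (all |any )?previous instructions)\b",
    re.IGNORECASE,
)
LINK = re.compile(r"\[[^\]]+\]\((https?://[^)\s]+)\)")


def lint_llms_txt(text):
    """Return warnings for over-long link lists and instruction-like phrases."""
    warnings = []
    urls = LINK.findall(text)
    if len(urls) > MAX_LINKS:
        warnings.append(
            f"{len(urls)} links listed; consider curating down to {MAX_LINKS} or fewer"
        )
    for match in SUSPICIOUS.finditer(text):
        warnings.append(f"instruction-like phrase found: {match.group(0)!r}")
    return warnings
```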
llms.txt and Your Broader AEO Strategy
Here’s where we get opinionated. llms.txt should be approximately 1% of your AI search optimization effort. The other 99% should go toward:
- Making your content technically accessible to AI crawlers (robots.txt, not blocking GPTBot)
- Structuring content for extraction (question headings, direct answers, sourced data)
- Building entity recognition (schema markup, consistent branding, Wikipedia presence)
- Getting discussed on platforms AI engines actually cite (Reddit, LinkedIn, YouTube)
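The first item is the easiest to verify yourself. As a minimal sketch, Python’s standard `urllib.robotparser` can tell you whether a given robots.txt lets a crawler such as GPTBot fetch a URL. The helper name `is_allowed` and the sample rules are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Report whether the rules in `robots_txt` let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample rules that block GPTBot everywhere and everyone else from /admin/.
blocking = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

print(is_allowed(blocking, "GPTBot", "https://yoursite.com/pricing/"))     # False
print(is_allowed(blocking, "Googlebot", "https://yoursite.com/pricing/"))  # True
```

If the first call returns False for your own robots.txt, no llms.txt file will make up for it.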
We’ve run over 200 AI visibility audits at Metronyx. Not once has the presence or absence of an llms.txt file been the determining factor in whether a brand gets cited by AI. The brands that get cited are the ones with strong on-page content, clear entity signals, and active community presence.
llms.txt is a nice-to-have. A bet on where AI tooling is heading. Not a substitute for the work that actually drives AI visibility today.
If you want to understand the tactics that DO drive AI citations right now, read our guide on how to rank in Perplexity. Different approach. Proven results.
Four steps to create and deploy your llms.txt file:
1. Create your llms.txt file. Start with an H1 header (your site name), a blockquote summary of what your site does, and organize content links under H2 sections.
2. Add your top pages. List your 10-20 most important pages with titles and brief descriptions. Prioritize pages with the highest-value content for AI consumption.
3. Create .md versions (optional). For maximum extractability, create Markdown versions of key pages. You can also combine their content into a single llms-full.txt file for AI agents that want the complete content.
4. Deploy and verify. Upload to yoursite.com/llms.txt. Verify it’s accessible by visiting the URL directly. Keep it consistent with your actual on-page content.
Frequently Asked Questions
Do AI platforms use llms.txt for search results or citations?
No. As of March 2026, no major AI platform (ChatGPT, Perplexity, Google AI Overviews, Claude) officially uses llms.txt for search results or citations. Google’s John Mueller has called it “unnecessary.” AI visibility comes from your on-page content, entity signals, and technical accessibility, not from llms.txt.
Does llms.txt replace robots.txt?
No. robots.txt controls what bots can and can’t access on your site. llms.txt provides curated context to help AI understand your site. They serve completely different purposes. robots.txt is about permissions. llms.txt is about explanation. They’re designed to coexist.
How do I add llms.txt to a WordPress site?
Three options: upload the file manually via SFTP to your root directory (/public_html/ or wherever your wp-config.php lives), use a plugin like Rank Math or Yoast that generates it automatically, or use our free generator tool to create the file and upload it yourself. Manual gives you the most control over what’s included.
Do I need Markdown versions of every page?
No. Only create .md versions for your 5-10 most important pages: API docs, product overviews, key landing pages. Maintaining Markdown companions for your entire site isn’t worth the effort when no AI platform currently reads them. Focus your time on the content optimizations that actually drive citations.
Can llms.txt hurt my traditional SEO?
No. llms.txt has no effect on traditional search rankings. It’s a separate text file that search engines ignore. The only risk is wasting time perfecting an llms.txt file instead of working on the content, schema, and entity signals that actually improve both traditional SEO and AI visibility.