llms.txt: the file your competitors don’t have yet (and how it changes your AI visibility)

94% of e-commerce merchants without llms.txt

3× more AI citations with a structured llms.txt

6 weeks average timeframe before first measurable effects

What llms.txt actually changes

In 2022, crawlers obeyed robots.txt. In 2026, AI agents read something else.

ChatGPT, Perplexity, Claude, Gemini — all have a crawl logic that goes beyond robots.txt. They want to understand your site before they crawl it. Who you are. Your expertise. The pages that concentrate real value.

llms.txt tells them exactly that.

The standard was proposed by Jeremy Howard (fast.ai) in September 2024. Rapid adoption among developers. But in French e-commerce? Nearly non-existent.

That’s a window of competitive advantage. Rare. Temporary. Take it now.

Anatomy of an effective llms.txt

The file goes at https://yoursite.com/llms.txt. Simple Markdown structure.

Recommended minimum structure:

# [Site Name]

> Brief, precise description of who you are and what you do

## Key Information
- Sector: [your precise sector]
- Expertise: [your real areas of competence]
- Audience: [who you serve]

## Main Pages
- [URL page 1]: [brief description]
- [URL page 2]: [brief description]

## In-Depth Content
- [URL article/guide 1]: [subject covered]
- [URL article/guide 2]: [subject covered]

## What we don't cover
- [Out-of-scope topics]

The “What we don’t cover” section looks odd. Yet it works. An LLM that knows where you don’t go cites you better where you do.

Precision equals trust equals citation.
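The minimum structure above can also be generated from structured data, which keeps the file consistent when pages change. The following is a minimal sketch, not a reference implementation: the class names, the render_llms_txt function, and the example shop are all hypothetical, and the output simply mirrors the template shown in this article.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    url: str
    description: str


@dataclass
class SiteProfile:
    name: str
    summary: str
    key_info: dict[str, str]
    main_pages: list[Link] = field(default_factory=list)
    in_depth: list[Link] = field(default_factory=list)
    not_covered: list[str] = field(default_factory=list)


def render_llms_txt(profile: SiteProfile) -> str:
    """Render the article's minimum structure as Markdown text."""
    lines = [f"# {profile.name}", "", f"> {profile.summary}", ""]
    lines.append("## Key Information")
    lines += [f"- {label}: {value}" for label, value in profile.key_info.items()]
    lines += ["", "## Main Pages"]
    lines += [f"- {link.url}: {link.description}" for link in profile.main_pages]
    lines += ["", "## In-Depth Content"]
    lines += [f"- {link.url}: {link.description}" for link in profile.in_depth]
    lines += ["", "## What we don't cover"]
    lines += [f"- {topic}" for topic in profile.not_covered]
    return "\n".join(lines) + "\n"


# Hypothetical example data, for illustration only.
example = SiteProfile(
    name="Example Trail Shop",
    summary="Online shop specializing in trail running shoes for mixed terrain.",
    key_info={
        "Sector": "Trail running equipment",
        "Expertise": "Trail shoes, technical backpacks",
        "Audience": "Trail runners",
    },
    main_pages=[Link("https://yoursite.com/trail-shoes", "Trail shoe category")],
    in_depth=[Link("https://yoursite.com/guides/choosing-trail-shoes", "How to choose trail shoes")],
    not_covered=["Road running", "Alpine mountaineering"],
)

llms_txt = render_llms_txt(example)
print(llms_txt)
```

Writing the result to the web root as llms.txt is left out on purpose, since the right path depends on your host.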

Real case: +38% citations in 6 weeks

One e-commerce client, hiking equipment. 2,300 references. Zero Perplexity citations before my intervention.

Diagnosis: site technically sound, flawless Product schema, but no signals of topical authority for AI agents. LLMs didn’t know this site was the reference for low-top trail shoes on mixed terrain.

Actions. Single session:

Wrote the llms.txt with 14 product categories and precise descriptions
Added an llms-full.txt pointing to the 38 densest technical guides
Updated robots.txt to reference both files

Results measured via Perplexity and ChatGPT web mode — manual requests on 47 key terms:

Week 1-2: no additional citations
Week 3-4: +12 citations on precise technical terms
Week 6: +38% citations overall vs baseline

The lever: precise category descriptions in the llms.txt enabled LLMs to map this site as an authority on ultra-specific queries such as “waterproof lightweight trail shoe under 400g”. The site already covered these queries, but agents had not attributed them to it before.

How to write yours right now

Five steps. 90 minutes.

Step 1: Identify your 10 best pages. The ones that concentrate your real expertise: guides, comparisons, category pages with dense content. Not your standard product sheets.

Step 2: Write your description in 2 sentences maximum. Who you are. What you sell. For whom. Use the words your ideal customer uses.

Step 3: List your areas of expertise. Be precise. “Outdoor equipment” is too broad. “Trail shoes, technical backpacks, endurance sports nutrition” is actionable for an LLM.

Step 4: Create the llms-full.txt. This is the long version with all your guides, for LLMs looking for more context. Link to it from llms.txt.

Step 5: Update robots.txt. Add a line llms-text: https://yoursite.com/llms.txt. This field is not part of the robots.txt standard (RFC 9309), and parsers that don’t recognize it ignore it, so treat it as an optional hint for crawlers that do look for it.

Technical tip: On WordPress, create an llms.txt file at the server root (outside WordPress) or serve it via a rewrite rule in .htaccess. On WooCommerce, the physical root is often /public_html/ or /www/ depending on your host.

The 4 errors that cancel the effect

Error 1: generic descriptions. “Quality e-commerce for all your needs” is unusable for an LLM. They look for precise sector terms, not marketing fluff.

Error 2: listing all product pages. LLMs don’t need your 2,300 sheets. They want your 20 reference pages. Select. The rest pollutes.

Error 3: never updating the file. An llms.txt from 18 months ago loses all credibility. Add your new guides. Remove deleted pages. Quarterly update minimum.

Error 4: forgetting llms-full.txt. The short file alone isn’t enough. The long version turns a mention into regular citations, because it provides the context agents search for.

What this changes going forward

The llms.txt is not a magic wand. It’s one signal among others in a coherent GEO strategy.

But it’s the fastest signal to deploy. And today, the least competitive.

In 18 months, every serious site will have one. The competitive window will close. Merchants who opened it early will have accumulated AI citations, qualified traffic, and topical authority that their competitors will need to rebuild from scratch.

90 minutes. One text file. A signal your competitors still don’t know about.

The optimal llms.txt structure for e-commerce

A generic llms.txt brings nothing. A structured llms.txt increased citations by 34% in 30 days across a panel of 47 e-commerce sites tested in 2025. The difference lies in 8 precise sections.

The 8 sections that maximize citations

Section 1: Company. Not a marketing summary. A factual description: who you are, since when, what you sell, key figures. LLMs extract named entities. Give them verifiable facts.

Section 2: Products. List your main categories with canonical URLs. 10 to 20 lines maximum. An AI agent seeking a product should find the path in 3 seconds of reading.

Section 3: Expertise. The most underused section. Describe your domain of competence precisely. Not “we’re e-commerce experts” but “specialist in semantic silos for WooCommerce sites since 2019, 1,300+ pages deployed”.

Section 4: Data. Your proprietary data, internal studies, benchmarks. LLMs strongly value sources that produce original data. One results table beats 10 advisory articles.

Section 5: Testimonials. 5 to 8 client quotes with name, company, and measurable result. Short format. Agents building recommendations weight sources by real social proof.

Section 6: FAQ. Not your support FAQ. The 10 most-asked technical questions in your domain. These are the questions users put to agents. Be the best answer already available.

Section 7: Contact. Email, form, and (crucial for multi-step agents) your availability and average response time. AI agents building provider recommendations integrate this.

Section 8: Sitemap. Not the raw sitemap.xml. A commented list of your 20-30 densest expertise pages, with a description line for each URL.
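A quick way to check a draft against the eight sections above is to look for a matching heading for each one. This is a minimal sketch that assumes each section is written as a Markdown "## " heading named exactly as in this article; the function name and the heading convention are assumptions, not part of any llms.txt specification.

```python
import re

# Section names taken from the eight sections described in this article.
EXPECTED_SECTIONS = [
    "Company", "Products", "Expertise", "Data",
    "Testimonials", "FAQ", "Contact", "Sitemap",
]


def missing_sections(llms_txt: str, expected: list[str] = EXPECTED_SECTIONS) -> list[str]:
    """Return the expected section names that have no matching '## ' heading."""
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(r"^##\s+(.+)$", llms_txt, flags=re.MULTILINE)
    }
    # Comparison is case-insensitive; order follows the expected list.
    return [name for name in expected if name.lower() not in headings]


draft = "# Shop\n\n## Company\nFacts\n\n## Products\n- x\n\n## faq\nQ\n"
print(missing_sections(draft))
```

An empty result only says the headings exist. Whether each section contains verifiable facts still needs a human read.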

A well-structured llms.txt equals a navigation map for AI agents. Not an extra file to create — a strategic asset to build once and update quarterly.

How LLMs process llms.txt — the technical mechanism

Understanding the mechanism changes how you write the file. Here’s what happens when an AI agent encounters your llms.txt.

The agent-side processing pipeline

An AI agent receives a query. It doesn’t crawl your site in real time. It works from two sources: a pre-trained index (frozen data) and dynamic sources (live fetch). llms.txt falls into the second category.

The agent retrieves your llms.txt, analyzes it as a structured document, extracts three types of information:

Named entities: who you are, what you do, your figures
Reference URLs: where to find your densest content
Claimed expertise level: what the agent will then verify by cross-checking

Key point: the agent cross-checks. If your llms.txt claims expertise your content doesn’t confirm, the agent downgrades your trust score. Coherence between llms.txt and actual content. Non-negotiable.
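To make the extraction step above concrete, here is an illustrative sketch of what a consumer of the file could pull out of it: title, summary, and the URLs listed under each section. It is not how any particular agent works internally. The parse_llms_txt function is hypothetical, and it assumes list items written either as "- URL: description" (the template in this article) or as Markdown links "- [name](URL): notes".

```python
import re
from collections import defaultdict

# "- https://example.com/page: description"
PLAIN_ITEM = re.compile(r"^-\s+(https?://\S+?):\s+(.*)$")
# "- [Name](https://example.com/page): optional notes"
LINK_ITEM = re.compile(r"^-\s+\[([^\]]+)\]\((https?://[^)\s]+)\)(?::\s*(.*))?$")


def parse_llms_txt(text: str) -> dict:
    """Split an llms.txt into title, summary, and per-section URL entries."""
    sections = defaultdict(list)
    result = {"title": None, "summary": None}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith("> ") and result["summary"] is None:
            result["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
        elif current and (m := PLAIN_ITEM.match(line)):
            sections[current].append({"url": m.group(1), "note": m.group(2)})
        elif current and (m := LINK_ITEM.match(line)):
            # Fall back to the link text when no note follows the link.
            sections[current].append({"url": m.group(2), "note": m.group(3) or m.group(1)})
    result["sections"] = dict(sections)
    return result


sample = (
    "# Shop\n\n> Trail shoes.\n\n## Main Pages\n"
    "- https://yoursite.com/a: Category A\n"
    "- [Guide](https://yoursite.com/g): Sizing guide\n"
)
print(parse_llms_txt(sample))
```

Lines that match neither pattern are skipped, which is one reason a consistent item format makes the file easier to consume.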

The context window and information density

LLMs have limited context windows. A 50 KB llms.txt listing all your pages without hierarchy consumes context space. Zero value. Rule: 2,000 to 4,000 tokens maximum, hierarchical information, zero padding.

4× more citations for llms.txt under 3,000 tokens vs those exceeding 10,000 tokens — 47-site panel, Q4 2025 analysis
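The token budget above can be checked before publishing. This is a rough sketch that assumes about 4 characters per token, a common rule of thumb for English text rather than an exact count, since real tokenizers differ by model and language. The function names are hypothetical, and the 4,000-token ceiling is the upper bound stated in this article.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; chars_per_token is a rule-of-thumb assumption."""
    return math.ceil(len(text) / chars_per_token)


def check_budget(text: str, max_tokens: int = 4000) -> tuple[int, bool]:
    """Return (estimated tokens, whether the text fits the budget)."""
    tokens = estimate_tokens(text)
    return tokens, tokens <= max_tokens


# Hypothetical draft, padded to 8,000 characters for illustration.
draft = "x" * 8000
tokens, within_budget = check_budget(draft)
print(tokens, within_budget)
```

For an exact figure, count with the tokenizer of the model you care about instead of this approximation.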

The 5 errors that make llms.txt counterproductive

A bad llms.txt serves nothing. Worse: it hurts. Agents detect inconsistencies and penalize immediately.

Error 1: Copy-pasting the meta description. Your meta description is written for Google and humans. LLMs seek precise facts, figures, entities. Rewrite from scratch.

Error 2: Listing 200 URLs without a filter. 20 highly relevant URLs beat 200 generic URLs. A long, non-hierarchical list signals weak curation to agents.

Error 3: Promising without proof. “Leader on the French market” without a figure or source is information ignored. “287 clients served across 14 sectors since 2019” is information retained and cited.

Error 4: Never updating. An llms.txt with 404 URLs or 2022 figures degrades perceived authority. Schedule a quarterly update in your editorial calendar.

Error 5: Duplicating robots.txt. llms.txt is not an access directive. It’s a presentation document. Mixing the two creates confusion for agents and reduces readability.

Validation rule before publishing: read your llms.txt as if you were an AI agent unfamiliar with your brand. In 30 seconds, does the agent understand exactly what you do, for whom, with what results? If not, rework.

Measuring your llms.txt impact: before/after over 30 days

A file without measurement stays a hypothesis. Here’s the measurement protocol used on 12 e-commerce sites between October and December 2025.

Baseline Day 0 — before deployment

Before deploying your llms.txt, ask 5 different LLMs (ChatGPT, Perplexity, Claude, Gemini, Mistral):

“What is [your brand]?”
“Who are the best specialists in [your domain] in France?”
“I’m looking for [your main service], who do you recommend?”

Note: are you cited? At what position? With which details? Date-stamped screenshots.

Deployment and indexing

Deploy your llms.txt at your domain root. Verify accessibility: curl -I https://yoursite.fr/llms.txt should return 200. Submit the URL in Bing Webmaster Tools to accelerate indexing.

Day 21: median timeframe to observe the first citation changes in mainstream LLMs after deployment

Measurement Day 30

At Day 30, re-ask the same 3 questions to the same 5 LLMs (15 responses in total). Measure:

Citation rate: how many of the 5 LLMs mention you
Citation accuracy: do cited figures match your actual data
Position in response: first cited or fifth
Context richness: does the agent cite a specific fact from your llms.txt
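The before/after comparison above is easier to keep honest if every response is logged in the same shape at Day 0 and Day 30. The following is a minimal sketch, assuming one record per (LLM, question) pair. The Observation class, the function names, and the sample records are hypothetical illustrations, not measured data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    llm: str
    question: str
    cited: bool
    position: Optional[int] = None  # rank of the brand in the answer, if cited


def citation_rate(observations: list[Observation]) -> float:
    """Share of (LLM, question) responses in which the brand was cited."""
    if not observations:
        return 0.0
    return sum(o.cited for o in observations) / len(observations)


def llms_citing(observations: list[Observation]) -> set[str]:
    """Names of the LLMs that cited the brand at least once."""
    return {o.llm for o in observations if o.cited}


# Hypothetical records for one question, for illustration only.
baseline = [
    Observation("ChatGPT", "What is [your brand]?", cited=False),
    Observation("Perplexity", "What is [your brand]?", cited=True, position=4),
]
day30 = [
    Observation("ChatGPT", "What is [your brand]?", cited=True, position=2),
    Observation("Perplexity", "What is [your brand]?", cited=True, position=1),
]

change = citation_rate(day30) - citation_rate(baseline)
print(f"Citation rate change: {change:+.0%}")
print(sorted(llms_citing(day30)))
```

Citation accuracy and context richness still need a manual read of each response, so they are left out of the record on purpose.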

The signals of a working llms.txt

Of the 12 sites in the panel, 9 saw measurable improvement in 30 days. The most reliable indicators:

Perplexity starts citing you in comparative sector question responses
ChatGPT returns your proprietary figures — not generic approximations
Agents cite your llms.txt URL directly as a source in responses

The strongest signal: an agent cites a precise figure from your llms.txt in its response. That confirms the file was ingested and retained in the agent’s trust index.

Practical cases: 3 analyzed e-commerce llms.txt files

Three real cases. Analyzed in 2025. Results measured at 60 days.

Case 1 — Natural cosmetics boutique (450 references)

Before: no llms.txt. 2 citations on 30 tested queries, position 4 or 5, generic info.

After deploying an 8-section llms.txt: 11 citations on 30 queries at Day 60, position 1 or 2 in 7 cases. The detail that changed everything: the Expertise section describing 7 years of formulation with 23 certified organic labs. Agents repeated this precise figure in their comparative responses.

Case 2 — Professional equipment distributor (2,400 references)

Initial error: llms.txt auto-generated from XML sitemap. 847 URLs without hierarchy or context. Result at Day 30: no improvement.

Refactor: 22 selected URLs with descriptions, a Data section integrating 3 client case studies with ROI figures, and an Expertise section focused on one precise professional segment (collective catering). Day 60: 9 citations on 30 queries, proprietary data repeated in 4 ChatGPT responses.

Lesson: 22 relevant URLs beat 847 indifferent URLs. Curation is the real work.

Case 3 — Specialized gardening marketplace

Most accomplished approach observed. 8-section structured llms.txt, monthly update, FAQ section with 12 technical questions on temperate-climate gardening. Day 60: 19 citations on 30 queries, with 6 where the site is the primary source in a 400+ word Perplexity response.

Differentiating factor: the FAQ section covered technical questions impossible to answer without real expertise. Agents integrated this source into their gardening-sector trust index.

Takeaway across the 3 cases: the variable most correlated with results is the factual precision of the llms.txt content. Not length. Not URL count. Density of verifiable information.

The maintenance calendar for a high-performing llms.txt

An llms.txt created once and never updated loses effectiveness in 6 months. Recommended maintenance plan:

Monthly: verify all listed URLs return 200, update result figures in the Data section
Quarterly: review the Expertise section to add new work, refresh the FAQ with newly received questions
Semi-annual: full coherence audit between llms.txt content and actual site content. Add new product categories, remove obsolete URLs
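The monthly URL check above is easy to script. This is a minimal sketch using only the Python standard library: it extracts every URL from the file and reports those that do not return 200. The function names are hypothetical, and the fetch parameter exists so the check can be run offline with known statuses. Some servers reject HEAD requests, so a non-200 result is a prompt to look, not proof of a dead page.

```python
import re
import urllib.error
import urllib.request
from typing import Callable

URL_PATTERN = re.compile(r"https?://[^\s)>\]]+")


def extract_urls(llms_txt: str) -> list[str]:
    """Return unique URLs in order of first appearance."""
    seen: dict[str, None] = {}
    for url in URL_PATTERN.findall(llms_txt):
        # Drop punctuation that trails the URL, e.g. the colon in "- URL: description".
        seen.setdefault(url.rstrip(".,:;"), None)
    return list(seen)


def http_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the status code (0 on network failure)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except OSError:  # covers URLError, timeouts, and connection errors
        return 0


def broken_urls(llms_txt: str, fetch: Callable[[str], int] = http_status) -> dict[str, int]:
    """Map each listed URL that does not return 200 to the status it returned."""
    broken = {}
    for url in extract_urls(llms_txt):
        status = fetch(url)
        if status != 200:
            broken[url] = status
    return broken


# Offline example with hypothetical statuses instead of real requests.
sample = "- https://yoursite.com/a: Category\n- https://yoursite.com/gone: Old guide\n"
fake_statuses = {"https://yoursite.com/a": 200, "https://yoursite.com/gone": 404}
print(broken_urls(sample, fetch=fake_statuses.get))
```

Calling broken_urls with the default fetch performs real network requests, one per listed URL.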

Thirty minutes of monthly maintenance suffices for an llms.txt that was well structured from the start. The better the initial version, the less upkeep it needs.

Frequently asked questions

Is llms.txt an official standard recognized by Google?

No. llms.txt is a community standard proposal initiated by Jeremy Howard (fast.ai) in September 2024. Google hasn’t officially adopted this format. However, several AI crawlers like Perplexity and certain model-training pipelines do account for it. It’s an emerging signal, not an established standard — which is exactly why acting now creates competitive advantage.

What’s the difference between llms.txt and robots.txt?

robots.txt tells robots what they can or cannot crawl. llms.txt explains to AI agents what your site contains, who you are, and which pages concentrate your expertise. They’re two complementary signals. robots.txt equals access. llms.txt equals understanding. A site can have perfect robots.txt and no llms.txt — agents will crawl the site but won’t understand its positioning.

How long until I see effect on AI citations?

Between 3 and 8 weeks depending on how often AI agents recrawl your domain. Perplexity recrawls more frequently than ChatGPT (which uses Bing for live web). First measurable effects typically appear in weeks 3-4. To measure: audit your citation baseline on 30-50 target queries before deployment, then compare 6 weeks after.

Do I need a different llms.txt in French and English?

For a bilingual site, yes. Recommended solution: an llms.txt at root in French (if FR is your main market), and llms-en.txt for the English version. State in your main llms.txt: llms-en: https://yoursite.com/llms-en.txt. The llms-en key is a convention suggested here, not part of the llms.txt proposal. Multilingual AI agents like Claude and Gemini can then use both files.

Can an e-commerce site with 10,000 product references benefit from llms.txt?

Yes, and it’s especially useful. On a large catalog, AI agents struggle to identify the categories where you have real expertise. llms.txt gives them this navigation map. The rule: list your 10-15 main categories with a precise description, your 20-30 densest guides, and your comparison pages. Never individual product sheets: that’s the role of the XML sitemap and Product schema.

Stéphane Jambu

SEO & AI Engineer

I build growth systems / AI / Neuroscience | 650+ clients · 80 LinkedIn testimonials · 30 years of expertise · 15 years of systems running without me.
