Prompt-level SEO: the experimentation framework to test your visibility in LLMs
Why your rankings no longer tell you anything (and what really matters)
A client called me on a Tuesday morning.
14,300 organic sessions per month. Six months ago.
Today: 9,200.
He had invested $12,000 in « AI-optimized » content.
72 pages written with LLMs, targeting specific expressions.
But when I type his 10 main queries into ChatGPT Search…
Not a single mention of his brand.
Back then, we only measured blue-link rankings.
Now, clicks flow through AI Overviews, Gemini, and Search Generative Experience.
According to a study shared by Search Engine Land, more than 30% of commercial queries generate an AI response in position zero.
You can rank first on 100 keywords.
If the AI doesn’t cite you, you don’t exist.
I look at 15 sites per week.
93% of them have stable organic traffic…
But virtually zero AI visibility.
The problem isn’t content.
It’s your architecture of presence.
You need to move from passive monitoring to active experimentation.
Test every prompt variation.
Understand what triggers a citation.
What I call prompt-level SEO.
And for that, I use the DOSE framework, taught by Guillaume Attias, founder of BMO Academy.
Diagnose > Optimize > Systematize > Evaluate.
I’ll show you how, with a client case and a 147% average increase in citations in 3 weeks.
The DOSE framework applied to prompts: isolate, measure, iterate
Since January 2026, I haven’t run an audit without this angle.
Technical crawls, backlinks, internal linking… all still fundamental.
But AI citation is a new layer.
And this layer follows its own rules.
Let me summarize the framework in 4 steps:
1. Diagnose.
What variables influence an LLM’s response?
Mainly: query type (informational, transactional…), persona requested (« I’m a developer », « I’m in HR »), and branded entity.
I ran 84 combinations of these variables in one week.
Result: in 34% of cases, the same query with a different persona surfaces a different competitor.
2. Optimize.
Create documented prompt variations.
Not randomly.
We test 47 prompts per case, based on high-volume keywords and « People Also Ask » questions.
A project management SaaS I work with saw its presence jump from 0 to 41 citations across 47 test prompts in 3 weeks.
The key: inject specific use-case scenarios into source content.
3. Systematize.
Every test goes into a 5-column table: date, exact prompt, variables, platform, result.
Without this traceability, you lose track.
One client logged 412 entries in 30 days.
He discovered a pattern: he was cited only for « how-to » queries with a beginner persona.
From there, we redirected 60% of his editorial production.
4. Evaluate.
We measure the percentage of prompts where the brand appears, but also the quality of the mention: first link, simple link, text mention without link.
With 12 B2B clients tracked since February, the average citation rate jumped from 11% to 38% in 6 weeks.
+245%.
It’s not magic.
It’s a reproducible method.
Diagnose: the 3 variables to isolate for your first tests
When testing visibility in an LLM, people think the bare query is enough.
Wrong.
LLMs are conversational.
They generate responses based on context: tone, role, history, example.
I’ve identified three critical variables to isolate immediately.
Variable 1: query type.
Informational: « what is growth hacking »
Transactional: « best growth hacking tool »
Navigational: « growthroi.com »
Each type activates a different corpus.
Your brand might be cited only in transactional, never in informational.
Test your 10 main expressions in all 3 versions.
Note the difference.
Variable 2: persona.
The same prompt with « as a business owner » vs « as a marketing intern » gives radically opposite responses.
With a software vendor, we tested 5 personas across 20 queries.
The mention rate for the vendor’s name ranged from 2% to 83% depending on persona.
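To see a spread like that in your own tests, you can compute the mention rate per persona. Here is a minimal sketch, assuming each test is stored as a dictionary with a `persona` field and a `cited` flag. The field names, the function `rate_by`, and the sample rows are illustrative, not taken from a client file.

```python
from collections import defaultdict

def rate_by(rows, key):
    """Citation rate for each value of `key` (for example 'persona')."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        if row["cited"]:
            hits[row[key]] += 1
    return {value: hits[value] / totals[value] for value in totals}

# Made-up rows, only to show the shape of the data.
sample_rows = [
    {"persona": "IT director", "cited": True},
    {"persona": "IT director", "cited": True},
    {"persona": "HR manager", "cited": False},
    {"persona": "HR manager", "cited": True},
]

for persona, rate in rate_by(sample_rows, "persona").items():
    print(f"{persona}: {rate:.0%}")
```

The same function works for any other column, so you can pass `"query_type"` instead of `"persona"` once you log that variable too.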
Variable 3: incoming entity.
If your prompt contains a competitor name, the response will likely cite that competitor.
But if you use a generic term, the spot is open.
I’ve seen cases where simply adding « 2026 rankings » to the prompt surfaced a site forgotten for 3 years.
In total, crossing these 3 variables creates a matrix.
For one client, we built it with 84 combinations (7 types × 4 personas × 3 entities).
Overall citation rate? 19%.
After optimizing content based on identified gaps, it jumped to 67%.
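The matrix itself is just a cross product of the three variables. Here is a minimal sketch of how it could be generated. The article gives the counts (7 × 4 × 3), not the full lists, so every label below beyond the three query types named earlier is a placeholder, and `build_matrix` and `citation_rate` are hypothetical helper names.

```python
from itertools import product

# Placeholder labels: only the counts (7, 4, 3) come from the client case.
QUERY_TYPES = ["informational", "transactional", "navigational", "comparison",
               "tutorial", "best tool", "pricing"]
PERSONAS = ["business owner", "marketing intern", "developer", "HR manager"]
ENTITIES = ["neutral", "own brand", "competitor"]

def build_matrix(query_types, personas, entities):
    """Return every (query type, persona, entity) combination to test."""
    return list(product(query_types, personas, entities))

def citation_rate(results):
    """Share of tested combinations where the brand was cited (0.0 to 1.0).

    `results` maps a combination to True (cited) or False (not cited).
    """
    if not results:
        return 0.0
    return sum(1 for cited in results.values() if cited) / len(results)

matrix = build_matrix(QUERY_TYPES, PERSONAS, ENTITIES)
print(len(matrix))  # 84
```

Each tuple in `matrix` becomes one prompt to write and one row to log, which keeps the before and after rates comparable.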
Optimize: 47 variations that changed the game for a SaaS
One of my clients is a project management SaaS vendor.
Four months ago, zero citations in ChatGPT, Gemini, or Perplexity.
Yet they ranked top 5 on half their core keywords.
We built a grid of 47 prompts.
How?
7 intent types (comparison, tutorial, best tool, etc.)
5 personas (project manager, developer, HR manager, CEO, freelancer)
We added a few prompts with a situational qualifier such as « for a remote team ».
Here’s a sample:
« What tool for managing projects in agile with a 10-developer team? » — project manager persona
« I’m a freelancer, looking for free software to track 3 clients » — freelancer persona
First test: across 47 prompts, the brand appeared 0 times.
Then we went back through their product sheets and blog articles.
We injected specific use-case scenarios, ranked comparisons, and real client cases.
Not more content.
Better-structured content for AI.
Three weeks later, we reran the 47 prompts.
The brand was cited 41 times.
Contextual mentions, sometimes with links.
In « comparison » tests, it appeared first in 11 out of 12 cases.
The client sent me this message:
« We thought our product sheet was enough. We learned that without use-case context, AI ignores us. » — Thomas, CMO
It’s not volume that counts.
It’s conversational relevance.
Systematize: a 5-column dashboard to track your citations
The real trap of prompt-level SEO is chaos.
You test a prompt, note the result… then forget it.
For the method to deliver reproducible results, you need a system.
I use a 5-column table:
- Date of the test
- Exact prompt typed, with quotation marks if needed
- Variables: query type, persona, entity (competitor, neutral)
- Platform: ChatGPT, Gemini, Perplexity, Microsoft Copilot…
- Result: cited first, cited elsewhere, not cited, and any link
One client filled this table 412 times in one month.
Because of that, he spotted a very specific pattern: he only appeared in prompts with a « beginner » persona and « how-to » queries.
Zero visibility on « best tool » queries.
We modified his content strategy to beef up comparisons.
In 8 weeks, the percentage of winning prompts jumped from 18% to 52%.
I recommend Airtable or a simple Google Sheet.
What matters is rigor.
Every test is a milestone.
With a colleague, I even added a « conversation timestamp » column because responses evolve.
That way, you can rerun an identical test and compare over time.
Systematization is the « S » in DOSE.
Without it, experimentation stays anecdotal.
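If you prefer a file to a spreadsheet, the same five columns fit in a CSV. Here is a minimal sketch, assuming one row per test and a timestamp in the date column so that identical prompts can be compared over time. The class `LogEntry`, the helpers `make_entry` and `write_log`, and the three result labels are my own naming for this example.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import datetime, timezone

ALLOWED_RESULTS = {"cited first", "cited elsewhere", "not cited"}

@dataclass
class LogEntry:
    date: str       # ISO timestamp of the test
    prompt: str     # exact prompt typed
    variables: str  # query type, persona, entity
    platform: str   # ChatGPT, Gemini, Perplexity...
    result: str     # one of ALLOWED_RESULTS

def make_entry(prompt, variables, platform, result, when=None):
    """Build one log row, rejecting result labels outside the agreed set."""
    if result not in ALLOWED_RESULTS:
        raise ValueError(f"unknown result label: {result!r}")
    when = when or datetime.now(timezone.utc)
    return LogEntry(when.isoformat(timespec="seconds"), prompt, variables,
                    platform, result)

def write_log(entries, handle):
    """Write the entries as CSV, header first, to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(LogEntry)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))

buffer = io.StringIO()
write_log([make_entry("best accounting software",
                      "transactional / no persona / neutral",
                      "ChatGPT", "not cited")], buffer)
print(buffer.getvalue())
```

Restricting the result column to a fixed set of labels is a design choice: it keeps 400 rows countable without cleaning them first.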
Evaluate: +147% citations in 3 weeks is possible (here’s how)
When I began applying this framework across 15 B2B and e-commerce clients, I observed an average of +147% mentions in AI responses.
This figure is the average over 3 weeks of experimentation.
It includes winners and modest gains alike.
One software vendor jumped from 12 citations across 200 test prompts to 87 citations.
A leap of 625%.
In their case, the « persona » effect was particularly strong: they had excellent content for IT directors but almost nothing for HR.
Evaluation goes beyond counting appearances.
I also measure:
- Position in the response (first name cited = higher perceived authority)
- Presence or absence of a clickable link
- Mention tone (factual, comparative, recommendation)
- Evolution over time (week by week)
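The first two criteria can be roughly checked by script. Here is a simplified sketch based on plain substring matching. It assumes you know the brand name, its domain, and a list of competitors to compare against. The function names and the sample response are invented for the example, and tone still has to be judged by a human.

```python
def first_cited(response, brands):
    """Return the brand whose name appears earliest in the response, or None."""
    text = response.lower()
    positions = {brand: text.find(brand.lower()) for brand in brands}
    found = {brand: pos for brand, pos in positions.items() if pos >= 0}
    return min(found, key=found.get) if found else None

def mention_quality(response, brand, domain, competitors):
    """Flag whether the brand is cited, cited first, and cited with its domain."""
    text = response.lower()
    link = domain.lower() in text
    cited = brand.lower() in text or link
    first = cited and first_cited(response, [brand, *competitors]) == brand
    return {"cited": cited, "first": first, "link": link}

# Invented response text, for illustration only.
sample = ("For small teams, Acme (https://acme.example) is a solid pick; "
          "Globex also works.")
print(mention_quality(sample, "Acme", "acme.example", ["Globex"]))
```

Substring matching will miss misspellings and brand names that are also common words, so treat the output as a first pass to review, not a verdict.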
With 4 clients, the « branded with link » mention rate jumped from 8% to 33%.
Huge progress in commercial visibility.
Here’s a sample of consolidated results:
| Case | Citations before | Citations after 3 weeks | Change |
|---|---|---|---|
| SaaS CRM | 12 | 87 | +625% |
| Web agency | 5 | 23 | +360% |
| Fashion e-commerce | 28 | 63 | +125% |
| HR software vendor | 9 | 41 | +356% |
These figures are real, from my client work.
They show that a structured prompt-level experimentation approach delivers quick wins.
But remember: a citation only matters if it drives action.
A mention in an AI response is useless if the click doesn’t find the right page afterward.
That’s why you must never neglect the post-click experience.
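The Change column in the results table is a plain percentage change, (after − before) / before × 100. Here is a short sketch that recomputes it from the before and after counts. `percent_change` is just a helper name for this example.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("percent change is undefined when starting from 0")
    return (after - before) / before * 100

# Before and after counts copied from the results table.
cases = {
    "SaaS CRM": (12, 87),
    "Web agency": (5, 23),
    "Fashion e-commerce": (28, 63),
    "HR": (9, 41),
}

for name, (before, after) in cases.items():
    print(f"{name}: {percent_change(before, after):+.0f}%")
```

Note the guard on zero: a brand that goes from 0 citations to 41 has no meaningful percentage, so report the raw counts in that case.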
Now, your first test in 12 minutes
I won’t leave you without an action plan.
Here’s how to launch your first experiment today.
Step 1. Select 5 high-value keywords.
Those generating at least 500 clicks per month or significant conversions.
Step 2. For each keyword, write 3 prompt versions:
– The bare query (e.g., « best accounting software »)
– Query + persona (e.g., « as a beginner accountant, what’s the best accounting software? »)
– Query + context (e.g., « accounting software for solo entrepreneurs in 2026 »)
That gives you 15 prompts.
Step 3. Test these 15 prompts on 2 platforms (say ChatGPT and Gemini).
That’s 30 tests.
Estimated time: 12 minutes.
Immediately note your results in the 5-column grid.
Watch if your brand appears, and how often.
Quick reads:
– If you have 0 mentions across 30, your content lacks conversational signal.
– If cited only with specific personas, you know your audience is segmented.
– If a competitor appears consistently, spot the trigger keywords they use.
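If you would rather generate the 30 tests than type them out, here is a minimal sketch of the plan: 5 keywords, 3 versions each, 2 platforms. The keywords, the persona, the context, and the sentence templates are examples I made up for the illustration, so adapt them to your own market.

```python
from itertools import product

def prompt_versions(keyword, persona, context):
    """Three versions of one keyword: bare, with persona, with context."""
    return [
        keyword,
        f"As a {persona}, what is the {keyword}?",
        f"{keyword} {context}",
    ]

def build_plan(keywords, persona, context, platforms):
    """Return every (platform, prompt) pair to test."""
    prompts = [version
               for keyword in keywords
               for version in prompt_versions(keyword, persona, context)]
    return list(product(platforms, prompts))

# Example inputs only.
keywords = [
    "best accounting software",
    "best invoicing tool",
    "best expense tracker",
    "best payroll software",
    "best bookkeeping app",
]
plan = build_plan(keywords, "beginner accountant",
                  "for solo entrepreneurs in 2026", ["ChatGPT", "Gemini"])
print(len(plan))  # 30
```

Each pair in `plan` is one line to fill in the 5-column grid.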
From this quick diagnosis, you can launch a full DOSE cycle.
I build experimentation systems that run without me.
But to start, 12 minutes is enough.
And if you want me to do it with you, I take 30 minutes.
We type the prompts.
We look at the pages.
I show you exactly what’s blocking you.
That’s my audit.
Audit your AI presence in 30 minutes
I’ll look with you at why ChatGPT, Gemini, or Perplexity ignore your brand. We’ll type the prompts, check the pages, and I’ll give you the variables to fix. Direct, no fluff.
Book a strategic call (45 min)
Frequently Asked Questions
How can I be sure my site is properly cited by a conversational AI?
Type your key queries into ChatGPT Search, Gemini, Perplexity. Note your brand’s presence. No automated tool replaces this manual test, because AI citations vary greatly depending on prompt context.
What tools can automate prompt testing for SEO?
No tool does it perfectly today. I use simple Python scripts to send prompts via API and fetch responses, but the human component is still essential to interpret mentions.
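To give an idea of what such a script looks like, here is a bare-bones sketch. It deliberately does not call any real provider: `ask` is a placeholder for whatever function sends a prompt to the API you use and returns the response text, and `fake_ask` is a stand-in so the example runs without credentials.

```python
from typing import Callable, Iterable

def run_prompts(prompts: Iterable[str], ask: Callable[[str], str], brand: str):
    """Send each prompt through `ask` and record whether the brand is named."""
    rows = []
    for prompt in prompts:
        response = ask(prompt)
        rows.append({
            "prompt": prompt,
            "cited": brand.lower() in response.lower(),
            "response": response,  # kept for the human review step
        })
    return rows

def fake_ask(prompt: str) -> str:
    """Stand-in for a real API call, returning canned text."""
    if "freelancer" in prompt:
        return "Acme is a popular choice for freelancers."
    return "Consider Globex for larger teams."

rows = run_prompts(
    ["I'm a freelancer, looking for free software to track 3 clients",
     "best project management tool for enterprises"],
    fake_ask,
    "Acme",
)
for row in rows:
    print(row["cited"], "|", row["prompt"])
```

The script keeps the full response text on purpose, because the interpretation of each mention still belongs to a person.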
Do LLMs always favor big brands?
Not necessarily. In my tests, a niche brand can outrank a giant if its content answers the prompt context with precision. Persona plays a huge role.
Do I need to create new content to be cited by AIs?
Often not. Restructure existing content: add use-case scenarios, comparisons, Q&A. Existing content, once reformatted for AI, can see citations skyrocket.
How often should I retest my AI inclusion?
I recommend a cycle every 3 weeks. AI responses evolve quickly. A prompt that works today might ignore your brand tomorrow if the corpus shifts. Systematization is key.

