Prompt research: the missing link in your GEO strategy


In brief: GEO tools on the market track simulated prompts that don’t reflect how humans actually talk to AI. SEO had keyword research. GEO doesn’t have its equivalent yet. Without an inventory of real user prompts, you’re optimizing for ghost traffic.

~80% of tracked prompts are simulated (field observation)

12 words: average length of a real prompt, vs 3-4 for a test prompt

0 GEO tools that crawl real prompts to date

What I see with my GEO clients

A client called me last week. B2B e-commerce. 1,200 SKUs. He’s been tracking 47 prompts in his GEO tool for 5 months.

Zero citations.

I look at the tracked prompts. All follow the same pattern:

« best CRM for SMEs »
« free accounting software »
« ERP for manufacturing France »

Short prompts. « SEO » prompts. Prompts that look like Google queries from 2018.

I ask: « Do you have a database of real user prompts? »

Silence.

The problem isn’t the tool. It’s the inventory. You’re tracking prompts nobody types.

According to a recent thread on r/TechSEO, most GEO tools follow curated prompt sets — not real prompts. The example given: a tool tracks « loft studio Düsseldorf ». But a real AI user asks: « Where can I find a loft photo studio near Düsseldorf for business portraits? »

13 words versus 3. Conversational context. Precise intent.

The delta is brutal.

SEO had keyword research. GEO has nothing.

In SEO, we spent 20 years building a process:

Inventory of real search queries (Google Search Console, Keyword Planner, SEMrush, Ahrefs)
Search volume per query
Seasonality, trends, competition
Semantic clustering
Prioritization

You didn’t launch a content silo without a spreadsheet of 500+ real sourced queries.

In GEO, as of today, this process doesn’t exist.

Current tools (BrightEdge, Botify, seoClarity for those that launched GEO modules) do two things:

They simulate typical prompts based on your SEO keywords
They track whether your brand appears in the responses

But nobody crawls real prompts. Nobody has access to LLM user logs (Google SGE shares nothing, OpenAI doesn’t either, Perplexity keeps everything).

Result: you’re optimizing for a parallel universe.

You build content for prompts that should be typed. Not for ones that are typed.

How humans actually talk to AI

I can’t access your clients’ ChatGPT logs. But I observe three sources:

Questions people ask me in audits (« Stéphane, how do I… »)
Perplexity Pro session transcripts shared publicly (Reddit, LinkedIn)
Cases reported by Guillaume Attias and other GEO practitioners

Patterns observed:

Simulated SEO prompt	Real observed prompt
« SEO agency Paris »	« I’m looking for someone who can audit a Shopify site with 800 products and tell me why we’re stuck at 3,000 sessions a month for the past year »
« best CRM »	« What’s the best CRM for a 12-person team that needs native Slack integration and doesn’t cost more than $80/user/month? »
« photo studio location Düsseldorf »	« Where can I find a loft photo studio near Düsseldorf for business portraits? »

Three structural differences:

1. Length
Real prompts run 10 to 25 words. SEO prompts run 2 to 4 words.

2. Situational context
People give constraints (budget, timing, geography, team size). They’re not looking for « the best ». They’re looking for « the best for me ».

3. Hybrid language
Even in France, many complex prompts are written in English. The models tend to respond better in English, and power users know this.

If your tracked prompts don’t reflect these three realities, you’re tracking ghosts.
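A first pass over an existing tracking list can be sketched in a few lines of Python. This is a rough heuristic, not a validated classifier: the 10-word threshold comes from the ranges quoted above, while the `CONSTRAINT_PATTERN` cues (digits, currency signs, words like "under" or "near") and the function names are assumptions made for illustration.

```python
import re

# Assumed threshold, taken from the "10 to 25 words" range quoted above.
MIN_REAL_WORDS = 10

# Hypothetical cues for situational context (budget, size, geography).
CONSTRAINT_PATTERN = re.compile(
    r"(\d|[$€£]|\bunder\b|\bnear\b|\bfor a\b|\bwithout\b|\bbudget\b)",
    re.IGNORECASE,
)


def word_count(prompt: str) -> int:
    """Count whitespace-separated words."""
    return len(prompt.split())


def looks_simulated(prompt: str) -> bool:
    """Flag prompts that are short AND carry no visible constraint."""
    is_short = word_count(prompt) < MIN_REAL_WORDS
    has_constraint = bool(CONSTRAINT_PATTERN.search(prompt))
    return is_short and not has_constraint


tracked = [
    "best CRM for SMEs",
    "free accounting software",
    "Where can I find a loft photo studio near Düsseldorf for business portraits?",
]

# Prompts worth re-examining by hand before you keep tracking them.
flagged = [p for p in tracked if looks_simulated(p)]
print(flagged)
```

The flag only says "check this one". A short prompt can still be real, so the final call stays with whoever knows the customers.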

Why GEO tools miss the mark

I’m not criticizing the product teams at BrightEdge or Botify. They have a source data problem.

In SEO, Google gave us Search Console. Imperfect, but real. You had 90% « not provided », but the remaining 10% was enough to model the semantic universe.

In GEO, you have nothing.

OpenAI doesn’t publish a Search Console for ChatGPT
Google SGE surfaces no user data (even anonymized)
Perplexity keeps everything in-house
Claude too

GEO tools are forced to simulate.

They take your SEO keywords. They transform them into pseudo-prompts (« Find me the best [keyword] »). They track whether your site shows up.

It’s better than nothing. But it’s not a real inventory.

Order of magnitude I observe with my clients: ~80% of tracked prompts are simulated. ~20% come from internal brainstorms (« what could a customer ask? »).

Zero prompts crawled from a real log.

You’d never do SEO in 2025 by guessing searches. Why accept that in GEO?

Building a prompt inventory without an API

No API. No GEO Search Console. No user logs.

But you’re not blocked.

Here’s what I’ve been doing with my clients for the past 6 months:

1. User interviews (especially B2B)
If you sell B2B, your sales reps talk to 10-15 prospects per week. Each call has 3 to 5 typical questions. Record (with consent). Transcribe. You have 50 real prompts in one month.

2. Customer support
Intercom and Zendesk tickets, and emails that start with « how do I… », are naturally phrased prompts. A customer writing « I can’t sync my Shopify with your tool, I’m getting a 403 API error » will type the same thing into ChatGPT before contacting you.

3. Reddit, Quora, niche forums
Public questions asked in 2024-2025 are real prompts. r/TechSEO thread: « How do I audit a site with 10k pages without spending $500/month? » That’s a real prompt.

4. Your own AI sessions
You use ChatGPT, Claude, Perplexity for your job. Keep a doc with your last 20 complex prompts. Rephrase them as if you were your own customer. You have a base.

5. Google People Also Ask (SEO-to-GEO bridge)
PAA reflects real questions. They’re longer than queries. They’re conversational. It’s a bridge between SEO and GEO.

One e-commerce client (office furniture, 940 SKUs) collected 130 real prompts in 8 weeks with this method. Zero paid tools. Just manual work.

Result: he now tracks prompts like « What’s the best ergonomic office chair under €400 that supports 120kg and ships to Belgium? »

Not « ergonomic office chair ».
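If the five sources above feed a single spreadsheet, the same question will often show up twice with slightly different punctuation. Here is a minimal sketch of an inventory that keeps the raw wording, tags each prompt with its source, and merges near-identical entries. The `PromptRecord` structure, the source labels, and the normalization rule are assumptions for illustration (Python 3.9+), not a description of any existing tool.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptRecord:
    text: str    # the prompt exactly as the person phrased it
    source: str  # illustrative labels: "sales_call", "support", "reddit", ...


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation so near-identical prompts match."""
    return re.sub(r"[^\w]+", " ", text.lower()).strip()


def build_inventory(records: list[PromptRecord]) -> dict[str, list[PromptRecord]]:
    """Group records by normalized text; each distinct prompt keeps all its sources."""
    inventory: dict[str, list[PromptRecord]] = {}
    for record in records:
        inventory.setdefault(normalize(record.text), []).append(record)
    return inventory


# Source labels below are illustrative, not taken from a real client file.
records = [
    PromptRecord("How do I audit a site with 10k pages without spending $500/month?", "reddit"),
    PromptRecord("how do I audit a site with 10k pages without spending $500/month", "support"),
    PromptRecord(
        "What's the best ergonomic office chair under €400 that supports 120kg and ships to Belgium?",
        "sales_call",
    ),
]

inventory = build_inventory(records)
for key, group in inventory.items():
    print(len(group), sorted({r.source for r in group}), group[0].text)
```

A prompt that appears in several sources is a reasonable candidate to track first, since more than one channel confirms that people actually phrase it that way.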

Tracking citations isn’t enough

Most current GEO tools track one metric: are you cited?

That’s an output metric. Not an input metric.

In SEO, you don’t just track your ranking. You also track:

Search volume per query
Trends (rising? declining?)
Seasonality
Emerging queries (« queries with significant change »)

All that lets you prioritize.

In GEO, if you only track « am I cited on this prompt? » you have no visibility into:

How many times that prompt is actually typed
Whether that prompt is growing or shrinking
Which adjacent prompts are emerging

You’re flying blind.

One SaaS client (project management tool) tracked 60 prompts. 12 citations. Citation rate: 20%.

We rebuilt the inventory using interviews + support. We found 40 new prompts never tracked.

Result after 3 months: 28 citations across 100 prompts. Rate: 28%. More importantly, those 28 citations were on prompts generating demos. Not cold traffic.

The metric that matters isn’t « how many citations ». It’s « how many citations on high-commercial-intent prompts ».

And to know that, you need an inventory.
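The arithmetic behind those two numbers, plus the metric that matters, fits in a short Python sketch. The 12/60 and 28/100 figures come from the SaaS example above. The `high_intent` flag and the three-row sample are hypothetical, standing in for a manual tag you would set during curation.

```python
from typing import NamedTuple


class TrackedPrompt(NamedTuple):
    text: str
    cited: bool
    high_intent: bool  # hypothetical flag, set by hand during curation


def citation_rate(cited: int, tracked: int) -> float:
    """Share of tracked prompts where the brand is cited."""
    if tracked <= 0:
        raise ValueError("tracked must be positive")
    return cited / tracked


def high_intent_citation_rate(prompts: list[TrackedPrompt]) -> float:
    """Citation rate restricted to prompts flagged as high commercial intent."""
    high = [p for p in prompts if p.high_intent]
    return citation_rate(sum(p.cited for p in high), len(high))


# Figures from the SaaS example above.
print(f"before: {citation_rate(12, 60):.0%}")
print(f"after:  {citation_rate(28, 100):.0%}")

# Hypothetical three-row sample, only to show the restricted metric.
sample = [
    TrackedPrompt("prompt A", cited=True, high_intent=True),
    TrackedPrompt("prompt B", cited=False, high_intent=True),
    TrackedPrompt("prompt C", cited=True, high_intent=False),
]
print(f"high-intent only: {high_intent_citation_rate(sample):.0%}")
```

Two accounts with the same overall rate can have very different high-intent rates, which is why the flag is worth maintaining by hand.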

What the GEO market is missing today

In my view, the GEO market in 2025 looks like the SEO market in 2008.

We had ranking tools (Ranks.fr, Advanced Web Ranking). We had crawl tools (Screaming Frog was just getting started). But we didn’t yet have:

A widespread Google Search Console
A Keyword Planner
An Ahrefs to map the semantic universe

GEO has its tracking tools (BrightEdge, seoClarity, Botify). But it doesn’t yet have:

A database of crawled real prompts at scale
The equivalent of a Search Console for ChatGPT, Perplexity, Claude. It doesn’t exist. It may never exist (privacy, closed LLM business models).
A prompt clustering tool
In SEO, you cluster queries by intent, by similar SERP, by entities. In GEO, you should cluster prompts by expected response structure, by complexity level, by intent (informational, transactional, navigational).
An estimated volume per prompt
Even approximate. Even « low / medium / high ». Something that lets you say « this prompt is typed 10× more than that one ».
A tool for discovering adjacent prompts
In SEO, Google Suggest and PAA give you nearby queries. In GEO, nothing. You’re on your own.

Whoever builds the first prompt research tool powered by real user data (even partial, even anonymized, even opt-in) wins the market.

Until then, you do it by hand.

What you can do this week

You’re not going to wait for the perfect tool.

Immediate checklist:

Monday – Internal inventory
Ask your support, sales, and customer success teams to note 10 client questions this week. Raw format. No rephrasing.

Tuesday – Review current tools
List the prompts you track today. For each one, ask: « Have I ever heard a human phrase it exactly like this? » If not, drop it.

Wednesday – Public sources
Reddit, Quora, niche forums. 30 minutes. Copy 10 long questions (10+ words). Those are prompts.

Thursday – Internal user test
Take 3 colleagues (not SEO, not marketing). Give them a fictional problem related to your product. Ask them to query ChatGPT to solve it. Note their exact prompts.

Friday – Consolidation
You now have 50-80 real prompts. Sort by intent (informational, comparison, purchase). Track the high-commercial-intent ones this week. Drop the rest for now.
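The Friday sort can start as a rough keyword pass before a human review. The sketch below groups prompts into the three intents named above. The cue lists in `INTENT_RULES` are assumptions and deliberately crude: the first rule that matches wins, so a prompt mentioning both "best" and a price lands in "purchase".

```python
import re
from collections import defaultdict

# Hypothetical cue lists; checked in order, first match wins.
INTENT_RULES = [
    ("purchase", re.compile(r"\b(price|cost|under|buy|ships to|budget)\b", re.IGNORECASE)),
    ("comparison", re.compile(r"\b(best|vs|versus|alternative|compare)\b", re.IGNORECASE)),
]


def classify_intent(prompt: str) -> str:
    """Return the first intent whose cues appear in the prompt."""
    for intent, pattern in INTENT_RULES:
        if pattern.search(prompt):
            return intent
    return "informational"


def group_by_intent(prompts: list[str]) -> dict[str, list[str]]:
    """Bucket prompts by intent so the purchase group can be tracked first."""
    groups: dict[str, list[str]] = defaultdict(list)
    for prompt in prompts:
        groups[classify_intent(prompt)].append(prompt)
    return dict(groups)


collected = [
    "What's the best ergonomic office chair under €400 that supports 120kg and ships to Belgium?",
    "What's the best CRM for a 12-person team that needs native Slack integration "
    "and doesn't cost more than $80/user/month?",
    "How do I audit a site with 10k pages without spending $500/month?",
]

groups = group_by_intent(collected)
for intent, items in groups.items():
    print(intent, len(items))
```

Treat the output as a draft. Reread each bucket and move prompts by hand where the cues got it wrong.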

You just built a prompt research inventory in 5 days.

No tool. No budget. Just observation.

It’s imperfect. But it’s real.

And real beats simulated. Always.

GEO audit: we start with the inventory

I’m not selling you a citation dashboard. Together we build a real inventory of prompts sourced from your actual client conversations, then we track what generates business, not noise.


Frequently Asked Questions

Does prompt research replace keyword research?

No. The two coexist. Keyword research powers your classic SEO (Google Search). Prompt research powers your GEO strategy (ChatGPT, Perplexity, Claude). A well-architected site serves both.

How do you know if a prompt is actually typed by users?

Without an API, you can’t have absolute certainty. But you can source your prompts from real conversations (support, sales, public forums). That’s infinitely more reliable than simulating prompts from SEO keywords.

Are current GEO tools useless?

No. They track citations, which is useful. But they don’t solve the upstream problem: which prompts to track? Without a real inventory, you’re optimizing for ghost traffic.

How many prompts should I track to start?

Start with 30-50 high-commercial-intent prompts. Not 300 generic ones. Better to track 30 real prompts than 300 simulated ones.

Can you automate real prompt collection?

Partially. You can scrape Reddit, Quora, forums. You can parse your support tickets with an LLM to extract questions. But final curation stays manual: you need to validate that the prompt reflects real intent.
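As a lighter stand-in for the LLM parsing step, a plain regex pass can already pull candidate questions out of ticket text. The sketch below is a swapped-in heuristic, not an LLM pipeline: the `QUESTION_STARTERS` list is an assumption, and the 10-word minimum mirrors the "10+ words" rule used in the Wednesday step above.

```python
import re

# Hypothetical list of words that usually open a question.
QUESTION_STARTERS = ("how", "what", "where", "why", "which", "can", "is", "does", "do")


def extract_questions(ticket_text: str, min_words: int = 10) -> list[str]:
    """Return sentences that look like questions and are long enough to be prompts."""
    # Split after ., ! or ? when followed by whitespace; the terminator is kept.
    sentences = re.split(r"(?<=[.!?])\s+", ticket_text.strip())
    questions = []
    for sentence in sentences:
        words = sentence.split()
        if len(words) < min_words:
            continue
        if sentence.endswith("?") or words[0].lower() in QUESTION_STARTERS:
            questions.append(sentence)
    return questions


ticket = (
    "Hi team. "
    "How do I audit a site with 10k pages without spending $500/month? "
    "Thanks."
)
found = extract_questions(ticket)
print(found)
```

A statement such as « I can’t sync my Shopify with your tool, I’m getting a 403 API error » slips through this filter even though it is a perfectly good prompt, which is one reason the final curation stays manual.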

Stéphane Jambu

SEO & AI Engineer

I build growth systems / AI / Neuroscience | 650+ clients · 80 LinkedIn testimonials · 30 years of expertise · 15 years of systems running without me.
