How to Measure AI Citations: SOMV, Mystery Shopping, and the Free Tool Stack (2026)

TL;DR

AI citation measurement is sampling-based: track trends, not absolutes. The free stack: (1) monthly mystery shopping, 20–30 prompts across ChatGPT, Perplexity, Gemini, Copilot, and Claude, each sampled 3× because outputs vary; (2) the Bing Webmaster Tools AI Performance Report, the only free official URL-level citation tracker; (3) a GA4 segment for AI referral traffic, which converts at roughly 4.4× the organic rate (Semrush); (4) Google Alerts for brand mentions, the leading indicator that citations follow by 30–60 days. Headline KPI: Share of Model Voice (SOMV), the percentage of AI answers in your category that mention you.

There is no Search Console for AI engines — no impressions report, no citation counts, no official dashboard covering ChatGPT, Perplexity, and Gemini. Measurement is sampling-based, which means you track trends, not absolutes. The good news: a free, repeatable stack covers every measurable surface, and the discipline of running it monthly is what separates teams that know their GEO is working from teams that guess.

Here is the stack we run, with the exact targets and templates.

What is Share of Model Voice and why is it the headline KPI?

Share of Model Voice (SOMV) is the percentage of AI answers in your category that mention your brand. It is the GEO equivalent of search ranking, adapted for a world where there are no fixed positions. You cannot rank #1 in ChatGPT; the same prompt produces different answers on different runs. What you can measure is how often you appear across a representative sample.

Run a fixed prompt set monthly, count the answers that mention you, and divide by the total number of samples. A business mentioned in 12 of 90 samples has a SOMV of about 13%. Whether that is good depends on one thing: last month's number.
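The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the `Sample` record and the `somv` function are hypothetical names introduced here, and the data simply reproduces the 12-of-90 example from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One logged AI answer (hypothetical record shape)."""
    prompt: str
    engine: str
    mentioned: bool


def somv(samples: list[Sample]) -> float:
    """Share of Model Voice: mentioning answers / total answers, as a percentage."""
    if not samples:
        raise ValueError("SOMV is undefined for an empty sample set")
    hits = sum(1 for s in samples if s.mentioned)
    return 100.0 * hits / len(samples)


# Worked example from the text: 12 mentions in 90 samples.
samples = [Sample("example prompt", "ChatGPT", i < 12) for i in range(90)]
print(f"{somv(samples):.1f}%")  # 13.3%
```

Because the same function runs on every month's log, the only thing that changes between reports is the data, which keeps the month-over-month comparison honest.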

How does monthly mystery shopping work?

Build a list of 20–30 prompts your customers would actually ask, run each across five engines, three times each, and log every result. The prompt list structure we use:

10 prompts about your core products or services ("best bilingual website builder for small businesses in Southern California")
5 about your industry broadly
5 about competitors ("X vs Y — which is better for...")
5 brand queries ("What is [brand]?", "Is [brand] legit?")
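To see how much work one monthly round involves, the prompt structure above can be expanded into a sampling plan. The sketch below assumes the five engines named in the TL;DR and three runs per prompt; the prompt strings are placeholders and `build_plan` is a hypothetical helper, not part of any tool.

```python
from itertools import product

ENGINES = ["ChatGPT", "Perplexity", "Gemini", "Copilot", "Claude"]
RUNS_PER_PROMPT = 3


def build_plan(prompts, engines=ENGINES, runs=RUNS_PER_PROMPT):
    """Return one (prompt, engine, run_number) tuple per sample to collect."""
    return list(product(prompts, engines, range(1, runs + 1)))


# Placeholder prompts following the 10 / 5 / 5 / 5 structure above.
prompts = (
    [f"core prompt {i}" for i in range(1, 11)]
    + [f"industry prompt {i}" for i in range(1, 6)]
    + [f"competitor prompt {i}" for i in range(1, 6)]
    + [f"brand prompt {i}" for i in range(1, 6)]
)

plan = build_plan(prompts)
print(len(plan))  # 375 samples: 25 prompts x 5 engines x 3 runs
```

Keeping the plan as a fixed list makes it easy to spot a skipped sample and guarantees that next month's round uses exactly the same prompts.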

Word prompts conversationally — around 23 words, the way people actually talk to AI — not as keyword strings. For each sample log: engine, mentioned yes or no, linked yes or no, sentiment, what was actually said, and who else got cited. The competitors cited when you are absent are your gap analysis: they tell you exactly which content to build next.
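One way to keep the log consistent is a fixed record per sample, with the gap analysis computed from it. The field names below mirror the list in the paragraph above, but the `LogEntry` shape, the `gap_analysis` helper, and the competitor names are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LogEntry:
    prompt: str
    engine: str
    mentioned: bool
    linked: bool
    sentiment: str  # e.g. "positive", "neutral", "negative", "n/a"
    quote: str      # what the answer actually said about the brand
    others_cited: list[str] = field(default_factory=list)


def gap_analysis(log: list[LogEntry]) -> list[tuple[str, int]]:
    """Count competitors cited in samples where the brand was absent."""
    counts: Counter[str] = Counter()
    for entry in log:
        if not entry.mentioned:
            counts.update(entry.others_cited)
    return counts.most_common()


log = [
    LogEntry("best builder", "ChatGPT", False, False, "n/a", "", ["RivalA", "RivalB"]),
    LogEntry("best builder", "Gemini", False, False, "n/a", "", ["RivalA"]),
    LogEntry("best builder", "Claude", True, True, "positive", "Recommended.", ["RivalB"]),
]
print(gap_analysis(log))  # [('RivalA', 2), ('RivalB', 1)]
```

Only samples where the brand is missing feed the count, so the top of the list is the competitor most often taking the slot you wanted.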

Three samples per prompt is non-negotiable. AI outputs are stochastic — a single run tells you almost nothing.
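A rough way to see why one run is not enough: suppose, purely for illustration, that an engine mentions you in 30% of its answers to a given prompt and that runs are independent. Both the 30% figure and the independence assumption are simplifications introduced here, not measurements.

```python
def p_seen_at_least_once(p: float, runs: int) -> float:
    """Probability the brand shows up in at least one of `runs` independent answers."""
    return 1.0 - (1.0 - p) ** runs


for runs in (1, 3):
    print(runs, round(p_seen_at_least_once(0.3, runs), 3))
# 1 0.3
# 3 0.657
```

Under these assumptions a single run misses a brand with real 30% visibility seven times out of ten, while three runs catch it about two times out of three. Three samples is still a small number, which is why the monthly trend matters more than any one figure.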

What does Bing's AI Performance Report add?

It adds the only free, official, URL-level AI citation data in existence, covering Copilot and Bing AI summaries. Launched in public preview in February 2026 inside Bing Webmaster Tools, it shows which of your URLs appear in generative answers and how citation activity trends over time.

Its scope is the Microsoft surface only, but that surface matters more than its search share suggests: Bing's index feeds parts of ChatGPT's browsing, and the January 2026 Windows 11 update made Copilot the default search handler on every Windows machine. Verify your site in Bing Webmaster Tools (free, ten minutes) and treat this report as your baseline truth for one engine while sampling covers the rest.

How do you track AI traffic and conversions in GA4?

Create a segment for sessions whose source matches AI referrers — chatgpt.com, perplexity.ai, gemini.google.com, copilot.microsoft.com, claude.ai — and compare its conversion rate to organic. The benchmark worth knowing: AI-referred visitors convert at roughly 4.4× the organic rate (Semrush). Volumes are small for most sites — under 1% of traffic — but the intent quality is consistently higher.
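The same matching logic can be tested outside GA4 before you build the segment. The sketch below works on hypothetical exported rows; it does not call any GA4 API, and the row shape and the session and conversion numbers are invented for illustration.

```python
import re

# Matches the five referrer hosts listed above, with or without a subdomain.
AI_REFERRER = re.compile(
    r"(^|\.)(chatgpt\.com|perplexity\.ai|gemini\.google\.com|"
    r"copilot\.microsoft\.com|claude\.ai)$"
)


def is_ai_referrer(source: str) -> bool:
    return bool(AI_REFERRER.search(source.strip().lower()))


def conversion_rate(rows) -> float:
    sessions = sum(r["sessions"] for r in rows)
    conversions = sum(r["conversions"] for r in rows)
    return conversions / sessions if sessions else 0.0


# Hypothetical export rows; the numbers are made up.
rows = [
    {"source": "chatgpt.com", "medium": "referral", "sessions": 120, "conversions": 6},
    {"source": "perplexity.ai", "medium": "referral", "sessions": 80, "conversions": 5},
    {"source": "google", "medium": "organic", "sessions": 10000, "conversions": 200},
]

ai_rate = conversion_rate([r for r in rows if is_ai_referrer(r["source"])])
organic_rate = conversion_rate([r for r in rows if r["medium"] == "organic"])
print(f"AI: {ai_rate:.2%}, organic: {organic_rate:.2%}")
```

Anchoring the pattern at the end of the host and requiring a dot or string start before it keeps lookalike domains such as `notchatgpt.com` out of the segment.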

One honest caveat: this undercounts. Agentic browsers like ChatGPT Atlas and Perplexity Comet send regular Chrome browser signatures, so their sessions show up as normal direct or organic traffic. Treat the GA4 segment as the measurable floor, not the full picture.

Why track brand mentions as a leading indicator?

Because citations follow mentions by 30–60 days — mentions are the input, citations are the output. Unlinked brand mentions correlate roughly 3× more strongly with AI visibility than backlinks (Ahrefs, 75,000-brand analysis). Set up free Google Alerts on your brand name and log monthly mention volume. When mentions grow and citations have not moved yet, the pipeline is working — give it the lag time before changing strategy.

What goes in the monthly report?

Five numbers, tracked as trends: SOMV overall, citation rate per engine, Bing AI Performance citations, AI-referred sessions and conversions, and brand mention volume. Plus two lists: this month's citation wins (prompt, engine, what the AI said) and the top five gaps (prompts where competitors got cited and you did not — next month's content targets).
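Since the report is read as a trend, it helps to store each month in the same shape and compute the change from the previous month. The sketch below is one possible layout; the `MonthlyReport` fields, the `month_over_month` helper, and all values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MonthlyReport:
    month: str
    somv_pct: float
    citation_rate_by_engine: dict[str, float] = field(default_factory=dict)
    bing_ai_citations: int = 0
    ai_sessions: int = 0
    ai_conversions: int = 0
    brand_mentions: int = 0


SCALAR_FIELDS = (
    "somv_pct", "bing_ai_citations", "ai_sessions", "ai_conversions", "brand_mentions",
)


def month_over_month(prev: MonthlyReport, curr: MonthlyReport) -> dict[str, float]:
    """Change in each tracked number from the previous report to the current one."""
    changes = {name: getattr(curr, name) - getattr(prev, name) for name in SCALAR_FIELDS}
    for engine, rate in curr.citation_rate_by_engine.items():
        previous = prev.citation_rate_by_engine.get(engine, 0.0)
        changes[f"citation_rate:{engine}"] = rate - previous
    return changes


# Made-up values for two consecutive months.
baseline = MonthlyReport("month 1", 13.0, {"ChatGPT": 10.0}, 4, 150, 6, 20)
latest = MonthlyReport("month 2", 15.5, {"ChatGPT": 12.0}, 7, 180, 9, 31)
print(month_over_month(baseline, latest))
```

The two lists (citation wins and top gaps) are free text and are left out of this structure on purpose; they can be attached alongside it in whatever form the report takes.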

The first report is the baseline and will look humble. That is normal: GEO signals compound over months 2–6, and the trend line — not any single month — is the deliverable.

You cannot manage what you sample carelessly. Fix the prompt list, sample 3×, log everything, and read trends monthly. The teams that win at GEO are rarely the ones with secret techniques — they are the ones who measured consistently enough to know which ordinary techniques were working.

Frequently asked questions

What is Share of Model Voice (SOMV) and how is it calculated?

SOMV is the percentage of AI-generated answers in your category that mention or cite your brand. Calculate it by running a fixed set of representative prompts monthly across the major engines and dividing the answers that mention you by total answers sampled. Track the trend month over month — the absolute number matters less than the direction.

Why do I need to run the same AI prompt three times when measuring?

Because AI outputs are stochastic — the same question can cite different sources each run. A single test gives a misleading picture: you might appear in one generation and not the next. Three samples per prompt per engine is the minimum for a stable monthly signal.

Is there a free tool that shows official AI citation data?

One: the AI Performance Report in Bing Webmaster Tools (public preview since February 2026). It shows which of your URLs are cited in Copilot and Bing AI summaries and how citation activity trends. It only covers the Microsoft surface, but it is the closest thing to first-party AI citation data that exists.

How do I see AI traffic in Google Analytics 4?

Build a custom segment filtering session source for AI referrers: chatgpt.com, perplexity.ai, gemini.google.com, copilot.microsoft.com, claude.ai. Compare its conversion rate against organic — AI-referred visitors convert at roughly 4.4× the organic rate (Semrush). Caveat: agent-mode browsers appear as regular Chrome traffic and cannot be segmented.

What results should I expect in the first 90 days of GEO work?

Months 1–2: little citation movement, because engines need to re-crawl. Months 2–3: first scattered mentions on some prompts, inconsistent between samples. Month 3 onward: a measurable trend if the work is landing. Expect volatility throughout; engines change citation behavior without notice.
