How to Get Cited by Perplexity: A Practical GEO Guide for 2026

Luis D. González · 9 min read

TL;DR

Perplexity runs its own web crawler (PerplexityBot), builds a separate index, and uses Sonar models to retrieve roughly 10 candidate pages per query — then surfaces only 3-4 as cited sources. Pages that win citations answer the specific question in the first 100 words, carry structured data (FAQ/Article schema), show a visible author and date, and come from domains with cross-platform authority signals. Add those four elements to your best pages and you are competing for the slot.

Perplexity reviews roughly 10 candidate pages per query and surfaces only 3-4 as cited sources. The question is not whether your page exists on the web — it is whether it clears every filter in that retrieval and ranking pipeline. Most pages fail at one of four checkpoints: they are not crawled, they do not answer the question quickly enough, they carry no structural signals, or the domain has thin authority. Fix those four and you are competing for the slot.

Here is how the system actually works, what to put on your pages, and the common reasons you are currently not cited.

How Perplexity picks its sources

Perplexity runs its own web crawler called PerplexityBot — a dedicated indexing bot that is explicitly not used to train AI foundation models, only to build the retrieval index behind Perplexity's real-time answers. That distinction matters: Perplexity's index is separate from Google's, Bing's, and any other search engine. Ranking on Google is correlated with Perplexity citations but is neither necessary nor sufficient.

When a query comes in, Perplexity's Sonar models run retrieval, ranking, synthesis, and attribution as a single pipeline. The system retrieves candidate documents, ranks them by query relevance and source authority, synthesizes a combined answer, and then exposes only the subset of sources whose content actually survived the synthesis step. That last filter — the attribution step — is why a page can be retrieved and ranked and still not appear as a citation.
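A toy model can make the attribution filter concrete. The sketch below is not Perplexity's implementation: the `Candidate` fields, the equal weighting of relevance and authority, and the cut-offs (10 retrieved, at most 4 cited, taken from the rough figures in this guide) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    url: str
    relevance: float      # hypothetical query-relevance score in [0, 1]
    authority: float      # hypothetical source-authority score in [0, 1]
    used_in_answer: bool  # did any passage from this page survive synthesis?


def cited_sources(candidates, retrieve_k=10, max_citations=4):
    """Toy retrieve -> rank -> attribute pipeline (illustrative only)."""
    # Retrieval: keep the top-k candidates by relevance.
    retrieved = sorted(candidates, key=lambda c: c.relevance, reverse=True)[:retrieve_k]
    # Ranking: order by a combined score (equal weighting is an assumption).
    ranked = sorted(retrieved, key=lambda c: c.relevance + c.authority, reverse=True)
    # Attribution: only sources whose content was actually used get cited.
    return [c.url for c in ranked if c.used_in_answer][:max_citations]
```

In this model a page with the best combined score still goes uncited if `used_in_answer` is false, which is the failure the attribution step describes.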

What the pipeline rewards:

| Signal | Effect on citation rate |
| --- | --- |
| Direct answer in first 100 words | ~90% of top-cited pages follow this pattern |
| FAQ Schema markup | ~42% higher citation rate for question queries |
| HowTo Schema markup | ~38% lift for process queries |
| Article Schema with author + date | ~23% lift for informational content |
| Original proprietary data | ~3.2× more citations than derivative content |
| Freshness (updated ≤30 days) | Significantly higher retrieval frequency |

These figures come from third-party GEO research and practitioner experiments — Perplexity does not publish an official ranking factor list. Treat them as directional, not exact.

What to put on the page

Answer first, every time. Perplexity's models decide whether a passage is relevant by reading the opening. If the first 100 words are a preamble, the system moves to the next candidate. If the first two sentences state the direct answer, you stay in the pool.

This is editing, not a rewrite. Take your current introduction, move the actual answer to the top, and compress the preamble into one transitional sentence below it.
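One rough way to test the answer-first rule before publishing is to check whether the terms a direct answer must contain show up in the opening 100 words. This is a minimal sketch: `opening` and `answers_early` are hypothetical helpers, and plain substring matching is an assumption, not a description of how Perplexity scores passages.

```python
def opening(text, n_words=100):
    """Return the first n_words whitespace-separated words of text."""
    return " ".join(text.split()[:n_words])


def answers_early(text, key_terms, n_words=100):
    """True if every key term appears within the first n_words of text."""
    head = opening(text, n_words).lower()
    return all(term.lower() in head for term in key_terms)


# Placeholder copy: a long preamble pushes the answer past the 100-word mark.
preamble_first = "welcome " * 120 + "The answer is forty-two widgets."
answer_first = "The answer is forty-two widgets. " + "welcome " * 120

print(answers_early(preamble_first, ["forty-two widgets"]))
print(answers_early(answer_first, ["forty-two widgets"]))
```

The check is deliberately crude. It only tells you whether the answer was moved to the top, not whether it is a good answer.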

Structure that helps the model parse you:

  • Section headings as questions. Format ## headings as the actual question the section answers ("How does Perplexity rank sources?" not "Source Ranking"). This maps directly to query matching.
  • Self-contained 200-400 word sections. Each ## section should be readable as a standalone answer — no dangling references, no "as we discussed above." Perplexity often cites a single passage, not the full page.
  • Tables for comparisons and data. Structured data in Markdown tables is easier for models to parse and synthesize than the same information in prose.
  • A visible FAQ block. Add 3-5 Q&A pairs at the bottom of your page, formatted as actual questions with complete answers. This is the same content you mark up with FAQ Schema, and it gives the model a clean, parseable question-and-answer index.
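The first two structural rules above are easy to lint if your drafts are written in Markdown. The following is a sketch under that assumption: `audit_sections` is a hypothetical helper that splits a draft on `## ` headings and reports whether each heading is phrased as a question and whether the section falls in the 200-400 word range.

```python
def audit_sections(markdown, min_words=200, max_words=400):
    """Report question-style headings and word counts for each '## ' section."""
    sections = []
    heading, body = None, []
    for line in markdown.splitlines():
        if line.startswith("## "):
            if heading is not None:
                sections.append((heading, body))
            heading, body = line[3:].strip(), []
        elif heading is not None:
            body.append(line)
    if heading is not None:
        sections.append((heading, body))

    report = []
    for heading, body in sections:
        words = len(" ".join(body).split())
        report.append({
            "heading": heading,
            "is_question": heading.endswith("?"),
            "words": words,
            "length_ok": min_words <= words <= max_words,
        })
    return report
```

Text before the first `## ` heading is ignored, and deeper headings (`### `) are counted as part of their parent section.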

Freshness is not optional for competitive queries. Pages updated within the last 30 days are retrieved significantly more often than the same content sitting untouched since 2023. For evergreen pages — service descriptions, guides, comparison pages — schedule a quarterly refresh: update one statistic, add one new section, and update your dateModified schema field.
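A refresh schedule is simple to enforce once each page records its `dateModified` value. The sketch below assumes dates are stored as ISO strings (YYYY-MM-DD) and treats "quarterly" as 90 days. Both choices, and the helper names, are assumptions for illustration.

```python
from datetime import date


def days_since_modified(date_modified, today):
    """Days between an ISO dateModified string and today (a date object)."""
    return (today - date.fromisoformat(date_modified)).days


def needs_refresh(date_modified, today, max_age_days=90):
    """True if the page is older than the refresh window (90 days assumed)."""
    return days_since_modified(date_modified, today) > max_age_days


# In practice you would pass date.today(); a fixed date keeps the example repeatable.
print(needs_refresh("2025-09-01", date(2026, 1, 31)))
```

Passing `max_age_days=30` turns the same check into a test for the 30-day freshness window mentioned above.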

Structured data that helps

Schema markup gives Perplexity's Sonar models a machine-readable summary to work from before they have to interpret your page body. Think of it as a header that tells the model what type of content follows and who is responsible for it.

The three schemas worth adding:

FAQ Schema (JSON-LD) — Add this to any page with a Q&A section. It is the highest-leverage markup for informational queries, where Perplexity is specifically trying to match a question to a clear answer.
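As a sketch of what that markup looks like, the snippet below builds a schema.org `FAQPage` object from question-answer pairs and wraps it in the script tag that carries JSON-LD. The helper names are hypothetical, and the sample pair is taken from this guide's own FAQ.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage dict from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def as_script_tag(data):
    """Serialize a JSON-LD dict into an embeddable script tag."""
    return '<script type="application/ld+json">\n' + json.dumps(data, indent=2) + "\n</script>"


faq = faq_jsonld([("Does Perplexity use Google's index?", "No. Perplexity maintains its own index.")])
print(as_script_tag(faq))
```

Keep the marked-up questions and answers identical to the visible FAQ text on the page so the two never drift apart.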

Article Schema — Add this to every blog post and guide. The critical fields are datePublished, dateModified, author.name, and author.url (pointing to a real profile — your LinkedIn or your About page). A December 2025 Moz experiment found that adding author schema with LinkedIn verification increased citation probability by 19% across a test set of 500 articles.
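A minimal sketch of those four fields in schema.org `Article` markup follows. The `article_jsonld` helper, the example.com URL, and the dates are placeholders, not values from a real page.

```python
import json


def article_jsonld(headline, author_name, author_url, date_published, date_modified):
    """Build a schema.org Article dict with the author and date fields."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": date_published,   # ISO 8601 date, e.g. "2026-01-05"
        "dateModified": date_modified,     # update this on every refresh
        "author": {
            "@type": "Person",
            "name": author_name,
            "url": author_url,             # a real profile or About page
        },
    }


article = article_jsonld(
    headline="How to Get Cited by Perplexity",
    author_name="Luis D. González",
    author_url="https://example.com/about",  # placeholder URL
    date_published="2026-01-05",             # placeholder date
    date_modified="2026-01-20",              # placeholder date
)
print(json.dumps(article, indent=2, ensure_ascii=False))
```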

HowTo Schema — Add this to any page that walks through a process step by step. Each step becomes a discrete retrieval target, which is why HowTo pages perform well on process queries.
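To illustrate the step-by-step structure, here is a sketch that turns an ordered list of steps into schema.org `HowTo` markup with numbered `HowToStep` entries. The helper name is hypothetical, and the sample steps are drawn from this guide's own advice.

```python
import json


def howto_jsonld(name, steps):
    """Build a schema.org HowTo dict from ordered (title, text) steps."""
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "step": [
            {"@type": "HowToStep", "position": index, "name": title, "text": text}
            for index, (title, text) in enumerate(steps, start=1)
        ],
    }


howto = howto_jsonld(
    "Prepare a page for Perplexity citations",
    [
        ("Allow the crawler", "Confirm robots.txt does not block PerplexityBot."),
        ("Answer first", "Move the direct answer into the first 100 words."),
        ("Add schema", "Add FAQ, Article, or HowTo markup as JSON-LD."),
    ],
)
print(json.dumps(howto, indent=2))
```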

What you do NOT need: Dataset Schema is powerful for statistics-heavy reference pages but irrelevant for most small business content. Product Schema helps e-commerce queries, not informational ones. Start with FAQ + Article + HowTo and move on.

Common reasons you are not cited

1. PerplexityBot is blocked. The most common silent killer. If your robots.txt has a broad Disallow: / or blocks PerplexityBot by name, you are invisible regardless of content quality. Check this first.

2. Your page answers the category, not the question. "We provide comprehensive marketing services to businesses of all sizes" is not an answer to any specific query. Perplexity is matching passages to specific questions. A page that explains everything about marketing services will lose to a 600-word page that answers "how much should a small business spend on digital marketing" specifically.

3. Thin authority. Perplexity's citations concentrate among a small number of domains per topic, so authority compounds. If your domain has no third-party mentions, no real author profiles, and no presence in industry directories, fix the entity signals before optimizing page-level copy.

4. Stale content. A guide published in 2022 and never touched will lose to a slightly worse guide updated this quarter. For any query where freshness matters (pricing, tools, regulations, market data), Perplexity heavily favors recent sources.

5. No structured data. Pages without schema markup require Perplexity's models to infer page type, authorship, and date from unstructured HTML. That inference is imperfect. Schema removes the guesswork.
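Reasons 1 and 5 can be checked mechanically. The sketch below uses only the Python standard library and works on text you have already downloaded (the robots.txt body and the page HTML). The helper names are hypothetical, and the JSON-LD scan only reads top-level `@type` values, so markup nested under `@graph` is not detected.

```python
import json
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def perplexitybot_allowed(robots_txt, url="https://example.com/"):
    """True if the given robots.txt text lets PerplexityBot fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("PerplexityBot", url)


class JsonLdCollector(HTMLParser):
    """Collect parsed JSON-LD blocks from application/ld+json script tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed markup is skipped, not repaired


def schema_types(html):
    """Return the set of top-level @type values found in a page's JSON-LD."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    types = set()
    for block in collector.blocks:
        for item in block if isinstance(block, list) else [block]:
            if isinstance(item, dict) and "@type" in item:
                value = item["@type"]
                types.update(value if isinstance(value, list) else [value])
    return types
```

A page that passes both checks (the crawler is allowed, and `schema_types` returns at least `Article` or `FAQPage`) has cleared the two failure modes that are purely technical.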


Getting cited by Perplexity is an engineering problem, not a content volume problem. You do not need more pages — you need your existing best pages to clear every filter in the retrieval pipeline. Allow the crawler, answer first, add schema, build cross-platform authority, and refresh quarterly. That combination covers the five main failure modes and puts you in the pool of sources Perplexity actually considers.

Frequently asked questions

Does Perplexity use Google's index?

No. Perplexity maintains its own proprietary index built by PerplexityBot, its dedicated web crawler. PerplexityBot does not crawl for AI model training — it crawls to populate the retrieval index that powers Perplexity's real-time answers. This means ranking on Google is neither necessary nor sufficient to be cited by Perplexity, though pages that are crawlable and authoritative tend to do well in both.

How fast does Perplexity pick up new pages?

Because Perplexity uses real-time web retrieval on top of its index, new pages can begin appearing in answers within 2-4 weeks of publication — faster than Google for fresh, authoritative content. Freshness is also a ranking signal: pages updated within the last 30 days are retrieved significantly more often than stale content for the same query.

Do backlinks matter for Perplexity citations?

Traditional backlink count matters less than cross-platform authority signals. Perplexity's citation algorithm weighs mentions across trusted third-party platforms — industry directories, review sites, media coverage — alongside domain authority. Research from Ahrefs found that unlinked brand mentions correlate roughly 3× more with AI visibility than raw backlink count, which suggests Perplexity is reading entity recognition signals, not just link graphs.

Does structured data directly affect Perplexity citations?

Yes, and the effect sizes are notable. FAQ Schema is associated with a 42% higher citation rate for question-based queries; HowTo Schema with a 38% lift for process queries; Article Schema with a 23% lift for informational content. These figures come from third-party GEO research and have not been confirmed by Perplexity directly, but the directional signal is consistent: schema markup helps Perplexity's models parse what a page answers, which makes it easier to match to queries.

Can I block PerplexityBot if I don't want to be cited?

Yes. PerplexityBot respects robots.txt. To block it from your entire site, add a "User-agent: PerplexityBot" line followed by a "Disallow: /" line. To block only part of the site, list specific directories in place of "/". Some publishers have done this in response to questions about content licensing. The trade-off is that blocking PerplexityBot removes you from Perplexity citations entirely.
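If you want to block only part of a site, it helps to generate the rule and confirm it behaves as intended before deploying it. This is a minimal sketch: `perplexitybot_block_rules` is a hypothetical helper, and the `/drafts/` directory and example.com URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser


def perplexitybot_block_rules(paths=("/",)):
    """Build robots.txt text that disallows PerplexityBot for the given paths."""
    lines = ["User-agent: PerplexityBot"]
    lines += [f"Disallow: {path}" for path in paths]
    return "\n".join(lines) + "\n"


# Hypothetical example: keep PerplexityBot out of /drafts/ only.
rules = perplexitybot_block_rules(["/drafts/"])

parser = RobotFileParser()
parser.parse(rules.splitlines())
can_fetch_draft = parser.can_fetch("PerplexityBot", "https://example.com/drafts/post")
can_fetch_blog = parser.can_fetch("PerplexityBot", "https://example.com/blog/post")
print(rules)
print(can_fetch_draft, can_fetch_blog)
```

Called with no arguments, the helper produces the site-wide block, one directive per line, which is how the rule must appear in robots.txt.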
