The AI Ranking Playbook — Free Cheatsheet

01

The 60-second mental model

Before any tactics, you need to understand how an LLM picks the brands it names. It is not the same as Google's ranking. It is not personalization. It is not paid placement. Here's what's actually happening.

When a buyer types "what's the best [thing] for [use case]?" into ChatGPT, the model produces an answer by combining three signals. If you don't show up in any of them, you don't exist.

①

Training-data prior

The model was trained on a snapshot of the public internet up to its cutoff date. Whatever the internet "said" about your category — Reddit threads, Wikipedia, news, blogs, forums — is now baked into the model's instincts.

You cannot edit this. You can only influence the next training cycle by being talked about now.

②

Live retrieval

Modern AI search (ChatGPT with browsing, Perplexity, Claude with web, Gemini) fetches the top 3–15 web pages at query time and synthesizes them into the answer. The brands that appear in those pages tend to make it into the response.

This is the lever you can pull this month. Get cited in the pages models fetch.

③

Source-trust hierarchy

Within any set of retrieved pages, models weight each page by trust. Wikipedia, Reddit (with caveats), major news outlets, established review hubs (G2, Capterra), and official sites carry more weight than random blogs or AI-generated junk.

Where you appear matters more than how many times you appear.

The brand named in an LLM answer is almost always the one that:

- Was talked about during the model's training (you appear in its instincts);
- Appears in 60–80% of the live-retrieved pages for the prompt;
- Is co-cited with the category leaders (the model sees "X, Y, and Z" together and treats you as peers).

Everything in this cheatsheet is, in one way or another, a tactic for making one of those three things more true.

02

What actually works

Ranked by real-world impact across the engagements we've run. Impact is "did doing this move share-of-AI-voice." Effort is "how long to ship the first instance."

Tactic	Why it works	Effort	Impact
Get cited in “Best X for Y” listicles	LLMs synthesize 3–7 listicles per category query. If your name appears in 5 of them, it almost certainly appears in the answer.	Medium	Very high
Authentic Reddit presence in your category subs	Reddit is in every major model's training set and heavily retrieved at query time. A few genuine, helpful, non-promotional threads outrank a thousand backlinks.	Medium	Very high
G2 / Capterra / TrustRadius profile + reviews (B2B)	These are the trusted review hubs models lean on for software questions. Get to 10+ real reviews and you become "extractable."	Medium	Very high
“You vs. Competitor” comparison pages on your own site	Models fetch these at decision-time and extract concrete differentiators. Build one per major competitor.	Low	Very high
Wikipedia article (if you genuinely qualify)	Highest authority signal across every model. But only attempt if you pass notability — otherwise you waste editor goodwill.	High	Very high
Co-citation engineering	Engineer mentions where your name sits beside category leaders ("X, Y, and Z are the standouts…"). LLMs surface entities that travel together.	High	Very high
FAQ schema markup with direct Q&A pairs	Gives engines extractable, citable answers. Particularly powerful for capturing long-tail buyer questions.	Low	High
Press in mid-tier industry publications	Perplexity and Gemini both lean heavily on trade press. One feature in the right outlet > 50 backlinks from junk blogs.	Medium	High
Podcast appearances with full transcripts	Transcripts are indexed and quoted. A good 45-minute episode = ~10 pages of citable, declarative content about you.	Medium	High
HackerNews / Indie Hackers / Lobsters discussion (dev tools)	For technical products these forums are disproportionately weighted in both training and retrieval.	Medium	High
Direct, declarative homepage copy	"WeRankAnything is an AI ranking agency for B2B startups." Easy to extract, easy to quote. Avoid hype-stack openers.	Low	High
Original research / surveys / data	If you publish a real stat that becomes the canonical source, you become unskippable for any prompt that touches that topic.	High	High
YouTube videos with rich transcripts	YouTube transcripts are fetched and cited, especially by Perplexity and Gemini. Caption everything.	Medium	Medium-high
Schema.org Organization + Product/Service markup	Tells models exactly what you are, who founded you, what you sell. Cheap, fast, helpful.	Low	Medium
Newsletter mentions (Substack, beehiiv, mid-list)	Increasingly indexed. A single mention by a respected operator in your space is worth pursuing.	Low-medium	Medium
Stack Exchange / GitHub presence (technical products)	Domain-specific authority signal. Answering well-asked questions builds a quote-able paper trail.	High	Medium (niche)
Conference talks with public recordings	Transcripts make their way into training corpora. Good for thought-leadership-driven categories.	High	Medium
Strategic LinkedIn / X / Bluesky posts by experts	Increasingly indexed in real-time retrieval. Hard to engineer but worth seeding.	Medium	Medium
Wikidata entry (if you can't get Wikipedia)	Lower bar than Wikipedia. Establishes you as an entity that exists. Especially useful for Gemini.	Medium	Medium
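llms.txt at site root	An emerging proposal: a curated Markdown summary of your site, with links to key pages, that LLM tools can read. It does not control crawling (that is robots.txt's job). Low cost; not yet a major signal.	Low	Low-medium (emerging)
robots.txt that explicitly allows AI bots	Crawlers are allowed by default, but an inherited wildcard Disallow rule can silently shut out GPTBot, ClaudeBot, and PerplexityBot. Explicit Allow rules make your intent unambiguous. Table stakes.	Trivial	Critical (table stakes)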
Open-source projects on GitHub (where relevant)	Establishes deep technical trust. For dev-tools, a popular OSS repo is the single highest signal.	High	Medium-high (niche)

03

What doesn't work (and what to skip)

Half of the "AI SEO" advice on the internet is wrong, recycled SEO from 2016, or speculative. Here's what we've watched fail.

✕ Keyword stuffing

LLMs don't reward density. They reward citability. A page that says "best CRM" 40 times is less useful than one that says it once, clearly, with a real reason.

✕ Generic “ultimate guide” SEO posts

Every site in your category has one. Models recognize the pattern and skip them. Build something with actual data, opinion, or specificity instead.

✕ Backlink farms & PBNs

Old-school SEO trick. Doesn't show up in LLM retrieval signals at all. A waste of budget.

✕ JavaScript-rendered content with no SSR

Many AI crawlers fetch the raw HTML and never execute JavaScript. If your positioning copy only exists post-hydration, those crawlers see an empty shell and you're invisible to them.

✕ Aggressive Reddit self-promotion

Mods will remove it, the community will flag your brand as spam, and you'll poison the well for legitimate engagement later.

✕ Wikipedia self-editing

Reverted within hours by editors. Often results in your page being protected against future legitimate edits. Work with a reputable third party who discloses the relationship, or wait.

✕ Fake G2 / Capterra reviews

Detected, banned, profile damaged. Recovery is brutal. Don't.

✕ Paid placements on weak directories

The kind of "Top 10 SaaS Tools" listicles that sell slots for $300. LLMs do not trust these and largely ignore them.

✕ Marketing-speak hyperbole

"World's leading," "best-in-class," "revolutionary." Models discount these phrases and trust the surrounding text less. Write like a careful journalist instead.

✕ Walled gardens

If all your useful content lives behind a login, crawlers can't see it. You need a public, indexable layer — even if it's a summary.

✕ Inconsistent brand naming

"Acme Inc.," "Acme Tools," "AcmeApp" — pick one canonical name and use it everywhere. Entity confusion hurts your share-of-voice in ways that are very hard to debug.

✕ Obsessing over domain age

Modern training data weights authority and recency over how old your domain is. The .com you registered in 2008 doesn't help you here.

✕ Exact-match domains

No measurable effect on LLM ranking. Stop renting bestcrmforhealthcare.com.

✕ Blocking GPTBot / ClaudeBot “to protect IP”

This is the single most common mistake. You can't be retrieved if you can't be crawled. Unless you're a paywalled publisher, allowing AI bots is almost always the right call.

✕ Hallucination farming

Publishing false comparisons hoping LLMs cite them. Models cross-check now. When you're caught, your domain trust drops across every engine.

✕ Cold backlink-request outreach

Filtered as noise by editors and ignored by LLMs. If you want to be cited, write something worth citing.

04

The 30-minute technical setup

Do this once and you'll rarely have to touch it again, apart from a quarterly audit. These are the table stakes: none of the strategy in this guide matters if you skip them.

A. `robots.txt` — allow every major AI bot

Lives at yourdomain.com/robots.txt. Copy this as a baseline and merge with your existing rules.

# Allow the AI bots that matter
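User-agent: GPTBot
Allow: /

# OpenAI's search and user-triggered fetchers (live retrieval)
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /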

User-agent: ClaudeBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: CCBot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: Amazonbot
Allow: /

Sitemap: https://yourdomain.com/sitemap.xml

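⚠ Note: Google-Extended is a robots.txt control token, not a separate crawler. It governs whether pages fetched by Google's regular crawlers may be used for Gemini (formerly Bard). Rules for Googlebot itself are separate, and you probably already allow it.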
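
Once the file is live, it is worth confirming that no inherited rule still blocks a bot you meant to allow. A minimal sketch using Python's standard `urllib.robotparser`; the `blocked_bots` helper, the bot list, and the sample rules are illustrative assumptions, and the parser's matching may differ slightly from each crawler's own implementation.

```python
from urllib.robotparser import RobotFileParser

# Illustrative subset of the user agents listed above
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "CCBot"]


def blocked_bots(robots_txt: str, url: str, bots=AI_BOTS) -> list:
    """Return the bots that this robots.txt text forbids from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, url)]


# Hypothetical file with a leftover rule that blocks GPTBot only
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_bots(sample, "https://yourdomain.com/pricing"))
```

To check the deployed file rather than a string, `RobotFileParser.set_url(...)` followed by `read()` fetches it over HTTP before the same `can_fetch` calls.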

B. `llms.txt` — a brief, Markdown summary of your site

Lives at yourdomain.com/llms.txt. This is an emerging standard (low impact today, rising). Keep it short, declarative, and factual.

# WeRankAnything

> The AI ranking agency that gets B2B startups cited in
> ChatGPT, Claude, Perplexity and Gemini answers.

## What we do

- AI visibility audits across 4 major engines
- Mention engineering in the sources LLMs read
- Site retraining for clean LLM extraction
- Always-on prompt monitoring

## Pricing

- Audit: $2,500 one-time
- Starter: $3,500 / month
- Growth: $8,500 / month

## Founded

San Francisco, 2025. Founder: Jaisal R.

C. Schema.org markup — three to add on day one

Drop these in the <head> of the relevant pages. Most CMSes have a plugin for this if you don't want to write JSON-LD by hand.

Organization on your home page — name, logo, founder, sameAs (your social profiles).
Product or Service on each offering page — name, description, price, provider.
FAQPage on any page with Q&A — the single highest-ROI piece of structured data for AI search.

Optional but valuable: Article on blog posts, Person on founder/author bios, BreadcrumbList on category pages.
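
If you do write the JSON-LD by hand, generating it from plain data keeps it valid. The sketch below builds an Organization block and an FAQPage block and wraps each in the script tag that goes in the page. The helper names, the logo URL, and the LinkedIn URL are placeholders, not real values; the property names (`founder`, `sameAs`, `mainEntity`, `acceptedAnswer`) are standard Schema.org vocabulary.

```python
import json


def json_ld_script(data: dict) -> str:
    """Wrap a Schema.org dict in a JSON-LD script tag."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script tag early
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


def faq_page(pairs) -> dict:
    """Build an FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "WeRankAnything",
    "url": "https://yourdomain.com",
    "logo": "https://yourdomain.com/logo.png",  # placeholder path
    "founder": {"@type": "Person", "name": "Jaisal R."},
    "sameAs": ["https://www.linkedin.com/company/your-handle"],  # placeholder
}

faq = faq_page([("What does the audit cost?", "The audit is $2,500, one-time.")])

print(json_ld_script(organization))
print(json_ld_script(faq))
```

Run the output through a structured-data validator before shipping; the generator only guarantees well-formed JSON, not that every property is one a given engine reads.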

D. Sitemap & canonical URLs

Make sure you have a sitemap.xml referenced in your robots.txt, and that every page has a self-referencing <link rel="canonical">. Crawlers and search indexes use canonical URLs to deduplicate pages, so the retrieval layer is more likely to see one clean copy of your content.
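
A self-referencing canonical is easy to verify mechanically. This sketch parses a page's HTML with the standard library and checks that exactly one canonical link exists and that it points back at the page. The `is_self_canonical` helper and the trailing-slash normalization are assumptions made for illustration; a stricter check might also compare scheme and host casing.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def is_self_canonical(html: str, page_url: str) -> bool:
    """True if the page declares exactly one canonical URL and it is its own."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    if len(finder.canonicals) != 1:
        return False
    # Assumption: URLs differing only by a trailing slash count as the same page
    return finder.canonicals[0].rstrip("/") == page_url.rstrip("/")


page = '<html><head><link rel="canonical" href="https://yourdomain.com/pricing/"></head></html>'
print(is_self_canonical(page, "https://yourdomain.com/pricing"))
```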

E. Make the first 100 words extractable

The top of every important page should contain at least one direct, declarative sentence about what you are. Not a hero pun. A sentence a model can cite verbatim.

Good: "WeRankAnything is a small agency that helps B2B startups appear in AI-generated answers from ChatGPT, Claude, Perplexity and Gemini."

Bad: "Transform your future-ready brand into the AI-native engine of tomorrow."
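
One way to audit this is to look at what a non-rendering crawler would see: the raw HTML with scripts and styles stripped, cut to the first 100 words. The sketch below does that and reports whether the brand name appears and whether any hype phrases do. The helper names are hypothetical, and the phrase list is just the three examples from Section 3, not a complete filter.

```python
from html.parser import HTMLParser

# Hype phrases taken from the "marketing-speak" item in Section 3
HYPE = ("world's leading", "best-in-class", "revolutionary")


class TextExtractor(HTMLParser):
    """Collect visible words from raw HTML, ignoring script/style content."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())


def opening_words(raw_html: str, limit: int = 100) -> str:
    """Return the first `limit` words of text present without running JavaScript."""
    parser = TextExtractor()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.words[:limit])


def check_opening(raw_html: str, brand: str) -> dict:
    opening = opening_words(raw_html).lower()
    return {
        "names_brand": brand.lower() in opening,
        "hype_found": [phrase for phrase in HYPE if phrase in opening],
    }


good = "<body><h1>WeRankAnything is a small agency that helps B2B startups appear in AI-generated answers.</h1></body>"
print(check_opening(good, "WeRankAnything"))
```

A page whose body is only an empty mount point plus a script bundle will fail the `names_brand` check, which is the same failure mode as the JavaScript-only content described in Section 3.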

05

Engine-specific quirks

Each engine has its own retrieval personality. These are patterns we've observed across many engagements. They drift over time, so re-check them yourself monthly.

ChatGPT

The conservative synthesizer

Cites 3–6 sources per answer
Heavy bias toward Wikipedia, established news, Reddit
Less weight on small/new listicles than Perplexity
Prefers neutral, factual phrasing in source material
Strong tendency to repeat the brand it was trained on as the default

Win it by: getting into Wikipedia (or Wikidata), seeding sober Reddit discussion, building a Wikipedia-ish "About" section on your own site.

Perplexity

The listicle-lover

Cites 5–15 sources per answer — much higher than peers
Loves “Best of” round-ups, comparison content, recent articles
Heavy on freshness — last 6 months gets disproportionate weight
Strong YouTube transcript usage
Good at surfacing smaller / newer brands

Win it by: getting cited in 5+ listicles, publishing dated comparison pages, ranking on the keywords Perplexity's web search uses (still partly SEO-shaped).

Claude

The careful primary-source reader

Cites 3–8 sources per answer
Favors authoritative primary sources, original documentation, well-edited content
Light on listicles; suspicious of obvious marketing content
Stronger on technical / research-driven categories
Notable preference for direct quotes when sources support them

Win it by: publishing real research, getting cited by primary sources (docs, papers, well-respected blogs), keeping your own copy careful and quotable.

Gemini

The Google-shaped one

Synthesizes results from Google's web index
Whatever ranks well on Google strongly influences Gemini
Heavy use of Knowledge Graph & Wikidata entities
Picks up YouTube transcripts very fast
Corollary: classical SEO still helps here

Win it by: not abandoning SEO. Rank on Google for buyer queries, get a Wikidata entry, add structured data, and make sure Google-Extended is allowed.

06

The 4-week starter playbook

A realistic sequence you can run yourself, in-house, this month. Each week takes 5–10 hours of focused work.

Week 1

Map your prompt universe
- Brainstorm 50–100 prompts a real buyer would ask an AI in your category (use Claude or ChatGPT to expand seed prompts).
- Run each across all four engines. Log: are you named, and if so, in what position?
- For every prompt where you're not named, log the top 3 brands that are.
- Group prompts into clusters: discovery ("best X for Y"), comparison ("X vs Y"), diagnostic ("how do I do Z").
- Output: a spreadsheet that becomes your scoreboard. You'll re-run it in Week 4.
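
The logging steps above can be sketched as one row per prompt and engine. This is a minimal illustration, assuming you paste each engine's answer text in by hand; `ScoreRow`, `score_answer`, and the brand names are hypothetical. Position is taken as the order in which known brands first appear in the answer, which is a rough proxy for "named first".

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class ScoreRow:
    prompt: str
    engine: str
    cluster: str              # "discovery", "comparison", or "diagnostic"
    named: bool               # did our brand appear at all?
    position: int | None      # 1-based order among brands found; None if absent
    brands_in_order: list[str]


def score_answer(prompt, engine, cluster, answer, our_brand, known_brands):
    """Order brands by where they first appear in the answer text."""
    hits = []
    for brand in [our_brand, *known_brands]:
        # Whole-word, case-insensitive match; assumes brand names end in a letter or digit
        match = re.search(rf"\b{re.escape(brand)}\b", answer, flags=re.IGNORECASE)
        if match:
            hits.append((match.start(), brand))
    ordered = [brand for _, brand in sorted(hits)]
    position = ordered.index(our_brand) + 1 if our_brand in ordered else None
    return ScoreRow(prompt, engine, cluster, position is not None, position, ordered)


row = score_answer(
    prompt="best CRM for healthcare",
    engine="chatgpt",
    cluster="discovery",
    answer="For small teams, Globex and Acme are the standouts; Initech is cheaper.",
    our_brand="Acme",
    known_brands=["Globex", "Initech"],
)
print(row.named, row.position, row.brands_in_order[:3])
```

Writing these rows to a CSV gives you the spreadsheet; keeping `brands_in_order` also covers the "top 3 brands that are named" column for prompts where you are absent.
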

Week 2

Fix the technical baseline
- Ship the robots.txt, llms.txt, and Schema markup from Section 4.
- Audit your homepage and top 3 service/product pages: do they open with a direct, citable sentence about what you are?
- Build at least 3 “You vs. Competitor” comparison pages against your most-cited rivals.
- Add an FAQ block (with FAQPage schema) to your home and pricing pages covering the questions from Week 1's prompt cluster.

Week 3

Strategic placements
- Identify the top 10 source pages that LLMs cite for your category prompts. Get into 3 of them this week.
- Pitch 5 trade-publication writers for inclusion in upcoming round-ups.
- Ask 5 real customers for G2 / Capterra reviews. Aim for 3 net-new reviews this week.
- Identify 5 active Reddit threads in your category where a sincere, expert answer (with your tool mentioned in passing) would add value. Post once each, never twice.
- If you have a Wikidata entry: enrich it. If you don't: create one.

Week 4

Compound and measure
- Re-run the Week 1 prompt scoreboard. Compare your share-of-AI-voice prompt-by-prompt.
- Going from 0 mentions to 1 is a bigger win than going from 5 to 6. Celebrate the early movers.
- Pick the 2 tactics from Week 3 that moved the most — double down on those next month.
- Plan your second wave of comparison pages, your next 5 Reddit answers, your next 3 placement pitches.
- Set a monthly cadence: prompt re-scoring on the 1st, new placements ongoing, technical audits every quarter.

07

KPIs that actually mean something

Ignore traffic-shaped vanity metrics. These are the six numbers we put on our clients' dashboards.

1. Share of AI voice

What: The % of your tracked prompts where your brand appears at least once.

Why: The single best summary metric. If this goes up, you're winning.

2. Mention rank

What: When you're named, are you 1st, 2nd, or last in the recommended list?

Why: Buyers often click only the first one or two names. Position matters.

3. Co-citation rate

What: % of prompts where you appear alongside the category leaders.

Why: Co-citation is how LLMs decide who "belongs" in a category. Track who you travel with.

4. Source diversity

What: Across how many distinct sources are you being cited?

Why: One Reddit thread = fragile. Twenty diverse sources = durable.

5. Inbound AI traffic

What: Sessions where the referrer is chatgpt.com, perplexity.ai, claude.ai, or gemini.google.com.

Why: Direct proof the work is producing visits. Imperfect but real.

6. Self-reported attribution

What: Ask every new customer "How did you find us?" Track responses mentioning an AI assistant.

Why: The most honest signal you'll get. Buyers will tell you ChatGPT recommended you, if you ask.
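
The first four KPIs fall out of the prompt scoreboard directly. A minimal sketch, assuming one logged row per prompt and engine, with the brands in the order they were named and the URLs the engine cited; the row layout, the `kpis` function, and the sample data are all illustrative. A prompt counts as "named" if your brand appears in at least one engine's answer.

```python
from statistics import mean
from urllib.parse import urlparse


def kpis(rows, our_brand, leaders):
    """Compute share of AI voice, mention rank, co-citation rate, source diversity."""
    prompts = {r["prompt"] for r in rows}
    named = [r for r in rows if our_brand in r["brands"]]
    named_prompts = {r["prompt"] for r in named}
    co_cited_prompts = {
        r["prompt"] for r in named if any(leader in r["brands"] for leader in leaders)
    }
    ranks = [r["brands"].index(our_brand) + 1 for r in named]
    # Distinct cited domains in answers that name us ("www." stripped)
    domains = {
        urlparse(url).netloc.removeprefix("www.") for r in named for url in r["sources"]
    }
    total = len(prompts)
    return {
        "share_of_ai_voice": len(named_prompts) / total if total else 0.0,
        "mean_mention_rank": mean(ranks) if ranks else None,
        "co_citation_rate": len(co_cited_prompts) / total if total else 0.0,
        "source_diversity": len(domains),
    }


# Hypothetical scoreboard rows
rows = [
    {"prompt": "best crm for healthcare", "engine": "chatgpt",
     "brands": ["Globex", "Acme"],
     "sources": ["https://www.reddit.com/r/crm/1", "https://www.g2.com/a"]},
    {"prompt": "best crm for healthcare", "engine": "perplexity",
     "brands": ["Acme", "Globex", "Initech"],
     "sources": ["https://www.g2.com/b"]},
    {"prompt": "acme vs globex", "engine": "chatgpt",
     "brands": ["Globex"], "sources": ["https://example.com/x"]},
    {"prompt": "how do i migrate crm data", "engine": "claude",
     "brands": [], "sources": []},
]

result = kpis(rows, our_brand="Acme", leaders=["Globex"])
print(result)
```

Inbound AI traffic and self-reported attribution come from analytics and intake forms rather than the scoreboard, so they are left out of this sketch.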

What NOT to track:

"AI search keyword rankings" — they don't exist the way Google ranks do.
"AI domain authority" scores from any tool selling them — it's marketing.
Total backlinks — irrelevant to LLM retrieval.
Traffic to "AI-optimized" blog posts in isolation — measure the prompts, not the pages.
