GEO Glossary 2026: 30 Essential Terms to Know
GEO vocabulary crystallized between 2024 and 2026 as ChatGPT, Claude, Perplexity and Google AI Overview became measurable acquisition channels. In April 2026, Sistrix measured that 58% of French Google queries triggered an AI Overview, and Ahrefs had analyzed 75,000 brands to map AI citation patterns. Yet several terms are still used loosely: answer-first, llms.txt, embedding, chunk, OAI-SearchBot, unlinked mention. This glossary sets out the definitions that an SEO lead, a SaaS founder, or a consultant scoping an audit actually needs. Thirty terms, organized by use case, with the technical nuance required to avoid confusing training crawlers with response crawlers, or citations with brand mentions.
The fundamentals: GEO, AEO, LLM SEO
Three acronyms for the same fight: being visible when a user asks a generative AI a question. The nuances matter internally but converge in practice on the same technical and editorial levers.
1. GEO (Generative Engine Optimization)
Optimizing a site to be cited by generative engines (ChatGPT, Claude, Perplexity, Gemini, Google AI Overview). The term was popularized by the Princeton, Allen Institute and Georgia Tech paper published in November 2023, which reported visibility gains of 30-40% from certain rewriting techniques.
2. AEO (Answer Engine Optimization)
Frequent synonym for GEO, with a nuance: AEO emphasizes answer format (questions, FAQ, snippets) while GEO also covers long-form content and authority layers. In most client briefs in 2026, both terms are interchangeable.
3. LLM SEO
Variant of the term still used by some agencies. Strictly speaking, it refers to optimizing for Large Language Models only, and excludes hybrid engines like Perplexity that mix search and generation.
AI crawlers to know
An AI crawler is a robot that visits your site for two distinct reasons: training a model, or answering a user query in real time. The distinction matters because it dictates the robots.txt policy to apply.
4. GPTBot
OpenAI's training crawler, publicly documented. User-agent: GPTBot. According to Vercel and MERJ, GPTBot generated more than 500 million fetches on their infrastructure across 2024-2025. Allow it if you want to feed future GPT models.
5. OAI-SearchBot
OpenAI's response crawler, used by ChatGPT Search to fetch sources in real time. Publicly documented by OpenAI. This is the one you must allow first to be cited in current ChatGPT answers.
6. ClaudeBot
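Anthropic's publicly documented crawler, primarily associated with collecting training data. Anthropic's documentation also lists separate agents for user-initiated fetches (Claude-User) and search indexing (Claude-SearchBot), so a rule for ClaudeBot alone may not cover Claude's search features. Check the current list before writing robots.txt rules.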
7. PerplexityBot
Perplexity's crawler, combining indexing and on-demand fetch. Especially active on B2B SaaS and technical content.
8. Google-Extended
A robots.txt product token, not a separate crawler: it tells Google not to use your content for Gemini (formerly Bard) without blocking classic Googlebot. Lets you separate traditional SEO from Google's AI training.
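As a rough sketch of that separation, the snippet below parses a hypothetical robots.txt with Python's standard urllib.robotparser and checks that disallowing Google-Extended leaves Googlebot untouched. The rule set and the example.com URL are assumptions made for illustration. It only shows how the rules parse; how Google applies them is up to Google.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google's AI usage token only.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


page = "https://example.com/blog/geo-glossary"  # placeholder URL
googlebot_ok = is_allowed(ROBOTS_TXT, "Googlebot", page)
google_extended_ok = is_allowed(ROBOTS_TXT, "Google-Extended", page)
print(googlebot_ok, google_extended_ok)
```

Googlebot has no matching group in this file, so it falls back to the default of being allowed.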
Formats LLMs cite
An LLM does not read a page like a human: it chunks, vectorizes, ranks by relevance. Formats that maximize citation share three properties: semantic autonomy, predictable structure, sourceable facts.
9. Answer-first
Writing principle of delivering the answer in 1-2 sentences right at the opening of an H2 or paragraph, before any development. It is the format most extracted by LLMs according to patterns observed in Yext and Semrush studies.
10. JSON-LD
Structured-data format used to express schema.org vocabulary and describe your entities (Article, FAQPage, Organization, HowTo). According to Ahrefs (March 2026, 1,885 pages tested), pages with valid JSON-LD are cited 2-3 times more often by ChatGPT.
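To make the format concrete, here is a minimal sketch that builds a schema.org FAQPage object and wraps it in the script tag a page would embed. The function name faq_jsonld and the sample question are illustrative assumptions, not part of any library.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Build a schema.org FAQPage object and wrap it in a JSON-LD script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    body = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so answer text can never close the script tag early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


snippet = faq_jsonld([
    ("Is llms.txt mandatory in 2026?",
     "No, llms.txt remains an emerging convention."),
])
print(snippet)
```

Generating the markup from data rather than hand-editing it is one way to keep the JSON valid, which is the property the Ahrefs figure above refers to.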
11. llms.txt
Text file at the site root, proposed in 2024 by Jeremy Howard, summarizing site content in an LLM-friendly format. Adoption is still minority in 2026 but it is a positive signal for engines that support it.
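As an illustration, the sketch below renders a small llms.txt following the layout described in the public proposal: a markdown H1 title, a blockquote summary, then H2 sections containing link lists. The site name, URL and helper name build_llms_txt are hypothetical.

```python
def build_llms_txt(
    title: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render a markdown llms.txt: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


llms_txt = build_llms_txt(
    "Example SaaS",  # hypothetical site
    "Hypothetical CRM for small B2B teams.",
    {"Docs": [("Glossary", "https://example.com/glossary.md", "GEO terms")]},
)
print(llms_txt)
```

The output would be served at /llms.txt on the site root.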
12. Chunk
Unit of document segmentation before vectorization. In retrieval pipelines, what gets cited is rarely a full page: it is typically a chunk of roughly 200 to 800 tokens. Making each section autonomous means optimizing at the chunk level.
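A simplified sketch of chunking, assuming paragraphs separated by blank lines and using whitespace-separated words as a stand-in for tokens (real tokenizers count differently, and each engine's actual segmentation is not public):

```python
def chunk_paragraphs(text: str, max_tokens: int = 800) -> list[str]:
    """Greedily pack paragraphs into chunks of at most `max_tokens` words.

    Words split on whitespace approximate tokens. A single paragraph
    longer than the budget is kept whole rather than cut mid-sentence.
    """
    chunks: list[str] = []
    current: list[str] = []
    used = 0
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        size = len(paragraph.split())
        # Close the current chunk when the next paragraph would overflow it.
        if current and used + size > max_tokens:
            chunks.append("\n\n".join(current))
            current, used = [], 0
        current.append(paragraph)
        used += size
    if current:
        chunks.append("\n\n".join(current))
    return chunks


sample = "one two three\n\nfour five\n\nsix seven eight nine"
chunks = chunk_paragraphs(sample, max_tokens=5)
print(chunks)
```

Because a chunk boundary can fall between any two paragraphs, a section that depends on the previous one for context may be retrieved without it.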
13. Embedding
Vector representation of a text, used by the retrieval layer of LLM-based engines to measure semantic similarity with a query. Two pieces of content can be cited for the same question if their embeddings are close to the question's vector.
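Closeness between vectors is commonly measured with cosine similarity. The sketch below uses invented 3-dimensional vectors purely for illustration; real embeddings have hundreds or thousands of dimensions and come from a model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration only.
query = [1.0, 0.0, 1.0]
page_a = [0.9, 0.1, 0.8]   # points in nearly the same direction as the query
page_b = [0.0, 1.0, 0.0]   # orthogonal to the query

sim_a = cosine_similarity(query, page_a)
sim_b = cosine_similarity(query, page_b)
print(round(sim_a, 3), round(sim_b, 3))
```

Here page_a would be ranked ahead of page_b for this query, regardless of which page shares more exact keywords.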
14. RAG (Retrieval Augmented Generation)
Architecture where the LLM fetches external documents before generating its answer. Perplexity, ChatGPT Search and AI Overview run on RAG: your content does not need to be inside the model, it must be findable at query time.
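The retrieve-then-generate flow can be sketched in a few lines. This toy version ranks documents by word overlap instead of embeddings and stops at prompt assembly, since the generation step needs a model. The URLs, texts and function names are all hypothetical.

```python
import re

DOCS = {  # hypothetical indexed pages
    "https://example.com/pricing": "Our CRM costs 29 euros per user per month.",
    "https://example.com/about": "The company was founded in Lyon in 2019.",
}


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric words of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, docs: dict[str, str], k: int = 1) -> list[str]:
    """Rank documents by word overlap with the question, keep the top k URLs."""
    q = tokens(question)
    ranked = sorted(docs, key=lambda url: len(q & tokens(docs[url])), reverse=True)
    return ranked[:k]


def build_prompt(question: str, docs: dict[str, str]) -> str:
    """Assemble the prompt a generator model would receive, sources included."""
    sources = retrieve(question, docs)
    context = "\n".join(f"[{url}] {docs[url]}" for url in sources)
    return f"Answer using only these sources:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("How much does the CRM cost per month?", DOCS)
print(prompt)
```

Only the retrieved page reaches the model, which is why findability at query time matters more than presence in training data.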
Essential GEO metrics
Once the technical vocabulary is set, business KPIs become readable. Three families: visibility (you are cited), quality (you are cited well), conversion (you get clicked).
15. Citation
Mention of your site as a source in an AI answer, with or without a clickable link. According to Yext (6.8 million citations analyzed), citation is the indicator most correlated with measurable AI traffic.
16. Brand mention
Appearance of your brand name in an AI answer, even without a link. Ahrefs (March 2026, 75,000 brands) showed that unlinked brand mentions represent 60-70% of occurrences in B2B SaaS.
17. AI share of voice
Your share of citations across all citations in a semantic cluster (e.g. all CRM tools). Reference metric to benchmark against your competitors.
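The arithmetic is simple: your citations divided by all citations observed on the cluster's prompts. A minimal sketch, with invented brand names and an invented log of ten citations:

```python
from collections import Counter


def share_of_voice(citations: list[str]) -> dict[str, float]:
    """Fraction of all logged citations that went to each brand."""
    counts = Counter(citations)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {brand: n / total for brand, n in counts.items()}


# Hypothetical log: one entry per citation seen across a set of CRM prompts.
log = ["Acme CRM"] * 5 + ["Beta CRM"] * 3 + ["Gamma CRM"] * 2
sov = share_of_voice(log)
print(sov)
```

The figure is only comparable over time if the prompt set stays fixed from one measurement to the next.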
18. AI Overview
Answer generated by Google at the top of SERPs, with cited sources. Sistrix measured 58% of French Google queries triggering an AI Overview in April 2026. Appearing in its sources is now a full-fledged SEO objective.
19. Citability
Qualitative score measuring the probability that a piece of content gets cited by an LLM. The ScoreGeo methodology evaluates 13 weighted criteria over 100 points to produce this score.
Off-page levers and authority
LLMs partly inherit classic SEO logic: authority, freshness, expertise signals. But they add a layer of their own, centered on semantic consistency between your third-party mentions.
20. Off-page authority
Set of external signals (backlinks, press mentions, consistent LinkedIn profiles, Wikipedia) that contribute to authority as perceived by an LLM. Working off-page authority remains the least automatable and most durable lever.
21. E-E-A-T applied to LLMs
Experience, Expertise, Authoritativeness, Trustworthiness. Google's framework extends to LLMs: they favor sources with identified authors, public methodology, last-updated dates, and external counter-references.
22. AI co-citation
Simultaneous appearance of your brand with a reference brand in the same AI answer. Powerful indicator of positioning as perceived by models.
23. Wikipedia entity
Presence of a validated Wikipedia page for your brand or your founder. LLMs rely heavily on Wikipedia as an entity truth source.
Server-side technical practices
On the infrastructure side, three topics come up in every GEO audit: who can crawl what, at what frequency, and with what performance footprint.
24. Extended robots.txt
robots.txt file that explicitly distinguishes AI crawlers. 2026 convention: allow OAI-SearchBot, ClaudeBot, PerplexityBot, and decide on GPTBot based on your training strategy.
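One way to keep such a file consistent is to generate it from an explicit policy and verify the result. The sketch below does so with Python's standard urllib.robotparser. The policy values, including the choice to block GPTBot, are assumptions for illustration, not a recommendation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: True = allow the crawler, False = block it.
POLICY = {
    "OAI-SearchBot": True,
    "ClaudeBot": True,
    "PerplexityBot": True,
    "GPTBot": False,  # assumption: this site opts out of training
}


def build_robots_txt(policy: dict[str, bool]) -> str:
    """Render one User-agent group per crawler, plus a permissive default."""
    groups = []
    for agent, allowed in policy.items():
        rule = "Allow: /" if allowed else "Disallow: /"
        groups.append(f"User-agent: {agent}\n{rule}\n")
    groups.append("User-agent: *\nAllow: /\n")
    return "\n".join(groups)


robots_txt = build_robots_txt(POLICY)

# Parse the generated file back and confirm each crawler gets the intended answer.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
decisions = {agent: parser.can_fetch(agent, "https://example.com/") for agent in POLICY}
print(robots_txt)
print(decisions)
```

Checking the parsed result against the policy catches typos in user-agent names before the file is deployed.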
25. Sitemap freshness
lastmod field updated on every edit. AI crawlers, especially OAI-SearchBot, prioritize freshly modified URLs to refresh their answers.
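A minimal sketch of a sitemap generator that writes lastmod from each page's real modification date, using only the standard library. The URL and date are placeholders; in practice the dates would come from your CMS or file metadata.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: dict[str, date]) -> str:
    """Render a sitemap where each URL carries its own lastmod date."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, modified in pages.items():
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # ISO 8601 date (YYYY-MM-DD), one of the formats the protocol accepts.
        ET.SubElement(url, "lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap({"https://example.com/glossary": date(2026, 4, 15)})
print(sitemap_xml)
```

Setting lastmod to the build date for every URL defeats the purpose: the signal is only useful if it changes when the content does.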
26. SSR (Server-Side Rendering)
Server-side rendering. Most AI crawlers execute little to no JavaScript. A fully client-side site risks serving an empty shell to LLMs.
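A quick way to approximate what a non-JavaScript crawler receives is to extract the visible text from the raw HTML response, without executing scripts. The sketch below does this on two invented pages with the standard html.parser module; fetching the HTML is left out to keep it self-contained.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text outside <script> and <style>, as a non-JS crawler would see it."""

    def __init__(self) -> None:
        super().__init__()
        self._skip = 0
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def visible_text(raw_html: str) -> str:
    """Return the text content of an HTML document, ignoring scripts and styles."""
    parser = VisibleText()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.parts)


# Invented examples: a server-rendered page and a client-rendered shell.
ssr_page = "<html><body><h1>GEO Glossary</h1><p>Thirty terms.</p></body></html>"
csr_page = '<html><body><div id="root"></div><script>render()</script></body></html>'

print(repr(visible_text(ssr_page)))
print(repr(visible_text(csr_page)))
```

If the second case describes your production HTML, the content only exists after JavaScript runs, and crawlers that skip that step see nothing.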
27. Clean markdown
Simple HTML structure, hierarchical H1-H2-H3, real lists (<ul>, <ol>), semantic tables. Semantic HTML converts cleanly into markdown-like text, which LLM pipelines extract more reliably than presentation-only markup.
For a full audit of these technical signals, see the ScoreGeo methodology detailing the 13 weighted criteria. If you prefer operational support, our team's GEO engagement covers the full scope in 6 weeks.
AI conversion indicators
Measuring the business impact of GEO requires dedicated metrics, distinct from classic SEO. Three are emerging as standards in 2026.
28. AI referer traffic
Visits identified as coming from an AI answer (referer ChatGPT, Perplexity, Claude, or UTM parameter added manually). Represents the directly attributable share of GEO.
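A sketch of referrer-based attribution, mapping a referrer URL to an AI source by hostname. The host list is an assumption to adapt: it includes the hosts named in this glossary plus chatgpt.com, and should be checked against what actually appears in your analytics.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumed host list; verify against your own analytics data.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "claude.ai": "Claude",
}


def ai_source(referrer: str) -> Optional[str]:
    """Return the AI engine behind a referrer URL, or None if it is not one."""
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return AI_REFERRER_HOSTS.get(host)


print(ai_source("https://www.perplexity.ai/search?q=crm"))
print(ai_source("https://www.google.com/"))
```

Visits that arrive without a referrer are invisible to this method, which is one reason the manual UTM parameter remains useful.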
29. Branded query lift
Increase in searches for your brand following AI exposure. Measurable via Search Console (impressions on branded queries). Often the first measurable signal of working GEO.
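The lift itself is a relative change between two comparable periods. A minimal sketch, with invented impression counts standing in for a Search Console export:

```python
def branded_query_lift(before: int, after: int) -> float:
    """Relative change in branded impressions between two comparable periods."""
    if before <= 0:
        raise ValueError("baseline impressions must be positive")
    return (after - before) / before


# Hypothetical figures: branded impressions in the month before and after.
lift = branded_query_lift(before=1200, after=1500)
print(f"{lift:.0%}")
```

The number alone does not prove causation: seasonality or a campaign running in the same period can produce the same lift.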
30. AI assist rate
Share of B2B conversions where the prospect mentions having discovered or validated your solution via an AI. Declarative indicator to add to demo forms or sales calls.
Thirty terms do not cover everything: GEO vocabulary keeps evolving each quarter. To stay current, subscribing to the ScoreGeo newsletter gives access to glossary updates and new concepts as they stabilize.
How to use this glossary
A glossary only serves to clarify conversations. Three concrete uses: before an audit, to align vocabulary with your vendor; during writing, to check that a brief uses the right terms; after a deployment, to measure what should be measured. If some terms remain unclear after reading, that is likely a sign that a manual GEO audit would deliver more value than self-training. The GEO vs SEO distinction, in particular, often deserves a dedicated session, so that trade-offs are framed correctly and the classic GEO mistakes are avoided.
Frequently asked questions
What is the difference between GEO and classic SEO?
Classic SEO optimizes for ranking in search engine result pages (Google, Bing). GEO optimizes for citation by generative engines (ChatGPT, Claude, Perplexity, AI Overview). Levers overlap 60-70% (content quality, structure, authority) but GEO adds specifics: answer-first format, strict JSON-LD, AI crawler allowances, third-party brand mentions.
Should I block GPTBot to protect my content?
Blocking GPTBot prevents your content from training future GPT models but has no effect on your current citation in ChatGPT Search, which runs via OAI-SearchBot. For most B2B SaaS companies aiming to maximize AI visibility, blocking GPTBot is counterproductive. For a premium content publisher, blocking can be justified.
Is llms.txt mandatory in 2026?
No, llms.txt remains an emerging convention, not a standard enforced by major AI engines. Adoption is minority and measured impact remains limited. It costs nothing to deploy and can be a positive signal, but it is not a priority compared to clean JSON-LD and a well-configured robots.txt.
How do I measure if I am cited by ChatGPT?
Three methods: (1) query ChatGPT manually on 20-50 target prompts and log citations; (2) use a GEO monitoring tool that automates these queries; (3) analyze referer traffic in GA4 or Plausible (referer chatgpt.com or the older chat.openai.com, perplexity.ai, claude.ai). The manual method remains the most reliable starting point.
How many glossary terms must I master to get started?
For a first GEO project, ten terms are enough: GEO, answer-first, JSON-LD, GPTBot, OAI-SearchBot, ClaudeBot, citation, brand mention, AI share of voice, RAG. The other twenty become useful as maturity grows, notably to benchmark against competitors or arbitrate technically.
Which terms are the newest in 2026?
Concepts that emerged in 2025-2026 are: OAI-SearchBot (distinct from GPTBot), AI share of voice as a formalized metric, AI co-citation, branded query lift attributed to GEO, and declarative AI assist rate in B2B. The fundamentals (answer-first, JSON-LD, citation) have been stable since 2023-2024.
Is this glossary kept up to date?
Yes, ScoreGeo revises this glossary each quarter to reflect vocabulary evolutions and new measurable signals. Recent additions are flagged at the top of the article. To be notified of updates, newsletter subscription remains the simplest path.