GEO Glossary 2026: 30 Essential Terms to Know

8 min read · Published on June 3, 2026

GEO vocabulary crystallized between 2024 and 2026 as ChatGPT, Claude, Perplexity and Google AI Overview became measurable acquisition channels. In April 2026, Sistrix measured that 58% of French Google queries triggered an AI Overview, and Ahrefs had analyzed 75,000 brands to map AI citation patterns. Yet many terms are still used loosely: answer-first, llms.txt, embedding, chunk, OAI-SearchBot, unlinked mention. This glossary sets out the definitions that an SEO lead, a SaaS founder, or a consultant scoping an audit actually needs. Thirty terms, organized by use case, with the technical nuance required to avoid confusing training crawlers with response crawlers, or citations with brand mentions.

The fundamentals: GEO, AEO, LLM SEO

Three acronyms for the same fight: being visible when a user asks a generative AI a question. The nuances matter internally but converge in practice on the same technical and editorial levers.

1. GEO (Generative Engine Optimization)

Optimizing a site to be cited by generative engines (ChatGPT, Claude, Perplexity, Gemini, Google AI Overview). The term was popularized by the Princeton, Allen Institute and Georgia Tech paper published in November 2023, which demonstrated 30-40% visibility gains from certain rewriting techniques.

2. AEO (Answer Engine Optimization)

Frequent synonym for GEO, with a nuance: AEO emphasizes answer format (questions, FAQ, snippets) while GEO also covers long-form content and authority layers. In most client briefs in 2026, both terms are interchangeable.

3. LLM SEO

Variant used mainly in English-speaking markets, still favored by some agencies. Strictly speaking, it refers to optimization for Large Language Models only, and excludes hybrid engines like Perplexity that mix search and generation.

AI crawlers to know

An AI crawler is a robot that visits your site for two distinct reasons: training a model, or answering a user query in real time. The distinction matters because it dictates the robots.txt policy to apply.

4. GPTBot

OpenAI's training crawler, publicly documented. User-agent: GPTBot. According to Vercel and MERJ, GPTBot generated more than 500 million fetches on their infrastructure across 2024-2025. Allow it if you want to feed future GPT models.

5. OAI-SearchBot

OpenAI's response crawler, used by ChatGPT Search to fetch sources in real time. Publicly documented by OpenAI. This is the one you must allow first to be cited in current ChatGPT answers.

6. ClaudeBot

Anthropic's crawler, publicly documented. Serves both training and Claude's search features. A single user-agent, which simplifies management but prevents selective blocking.

7. PerplexityBot

Perplexity's crawler, combining indexing and on-demand fetch. Especially active on B2B SaaS and technical content.

8. Google-Extended

robots.txt token that tells Google not to use your content for Gemini (formerly Bard), without blocking the classic Googlebot. Lets you separate traditional SEO from Google's AI training.
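
To make the training versus response distinction concrete, here is a minimal sketch that labels a raw user-agent string from a server log. The purpose labels simply restate the entries above, and the function name `classify_crawler` and the sample user-agent string are illustrative assumptions, not vendor specifications.

```python
from typing import Optional

# Purpose labels restate the glossary entries above (illustrative, not official).
AI_CRAWLERS = {
    "GPTBot": "training",
    "OAI-SearchBot": "response",
    "ClaudeBot": "training+response",
    "PerplexityBot": "indexing+response",
}
# Google-Extended is deliberately absent: it is a robots.txt token,
# not a user-agent string that shows up in request logs.


def classify_crawler(user_agent: str) -> Optional[str]:
    """Return the purpose label of a known AI crawler, or None if unknown."""
    ua = user_agent.lower()
    for token, purpose in AI_CRAWLERS.items():
        if token.lower() in ua:
            return purpose
    return None


# Hypothetical log line, for illustration only.
sample = "Mozilla/5.0 (compatible; GPTBot/1.0)"
print(classify_crawler(sample))
```

Substring matching is enough for a first pass over logs; since user-agents can be spoofed, a production setup would also verify the source IP ranges published by each vendor.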

Formats LLMs cite

An LLM does not read a page like a human: it chunks, vectorizes, ranks by relevance. Formats that maximize citation share three properties: semantic autonomy, predictable structure, sourceable facts.

9. Answer-first

Writing principle of delivering the answer in 1-2 sentences right at the opening of an H2 or paragraph, before any development. It is the format most extracted by LLMs according to patterns observed in Yext and Semrush studies.

10. JSON-LD

Structured format recommended by schema.org to describe your entities (Article, FAQPage, Organization, HowTo). According to Ahrefs (March 2026, 1,885 pages tested), pages with valid JSON-LD are cited 2-3 times more often by ChatGPT.
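
As an illustration, a FAQPage block can be generated rather than hand-written, which limits syntax errors. The sketch below uses only schema.org's documented FAQPage, Question and Answer types; the helper names `faq_jsonld` and `as_script_tag` are assumptions made for this example.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def as_script_tag(data):
    """Serialize the object into the script tag expected in the page head."""
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so answer text can never close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


faq = faq_jsonld([("Is llms.txt mandatory in 2026?", "No, it remains an emerging convention.")])
print(as_script_tag(faq))
```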

11. llms.txt

Text file at the site root, proposed in 2024 by Jeremy Howard, summarizing site content in an LLM-friendly format. Adoption is still limited in 2026, but it is a positive signal for engines that support it.
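
For reference, the proposal describes a markdown file with a title, a short blockquote summary, and sections of annotated links. A minimal generator might look like the sketch below; the function name `build_llms_txt` and the example.com URLs are placeholders, not part of the proposal.

```python
def build_llms_txt(site_name, summary, sections):
    """Render an llms.txt body.

    sections maps a section heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder site, for illustration only.
print(build_llms_txt(
    "Example SaaS",
    "CRM for small B2B teams.",
    {"Docs": [("Quick start", "https://example.com/docs/start.md", "Setup in five minutes")]},
))
```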

12. Chunk

Unit of document segmentation before vectorization. An LLM never cites a full page: it cites a chunk of 200 to 800 tokens. Making each section autonomous means optimizing at the chunk level.
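
To see what segmentation does to a page, here is a deliberately naive chunker. It assumes that one whitespace-separated word counts as one token, which real tokenizers do not guarantee, and the name `chunk_words` is hypothetical; actual engines use their own tokenizers and often split on headings first.

```python
def chunk_words(text, max_tokens=800, overlap=0):
    """Split text into windows of at most max_tokens words.

    Assumption: one whitespace-separated word is treated as one token.
    """
    if max_tokens <= 0 or not 0 <= overlap < max_tokens:
        raise ValueError("need max_tokens > 0 and 0 <= overlap < max_tokens")
    words = text.split()
    step = max_tokens - overlap
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # the last window already reached the end of the text
        start += step
    return chunks


section = " ".join(f"w{i}" for i in range(10))
print(chunk_words(section, max_tokens=4, overlap=2))
```

A section that only makes sense with the previous paragraph loses its meaning once cut this way, which is why each section should stand on its own.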

13. Embedding

Vector representation of a text, used by LLMs to measure semantic similarity with a query. Two pieces of content can be cited for the same question if their embeddings are close to the question's vector.
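
"Close" is usually measured with cosine similarity. The sketch below uses made-up three-dimensional vectors to keep the arithmetic visible; real embeddings have hundreds or thousands of dimensions and come from a model, which is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration.
query = [0.9, 0.1, 0.0]
page_about_crm = [0.8, 0.2, 0.1]
page_about_baking = [0.0, 0.1, 0.9]

print(cosine_similarity(query, page_about_crm))
print(cosine_similarity(query, page_about_baking))
```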

14. RAG (Retrieval Augmented Generation)

Architecture where the LLM fetches external documents before generating its answer. Perplexity, ChatGPT Search and AI Overview run on RAG: your content does not need to be inside the model, it must be findable at query time.
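
The retrieve-then-generate flow can be sketched in a few lines. This toy version ranks documents by shared words instead of embeddings and stops at building the prompt, since the generation step depends on the engine; the function names and example.com URLs are assumptions for illustration.

```python
def retrieve(query, documents, k=2):
    """Return the URLs of the k documents sharing the most words with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents.items(),
        key=lambda item: len(query_words & set(item[1].lower().split())),
        reverse=True,
    )
    return [url for url, _ in ranked[:k]]


def build_prompt(query, documents, k=2):
    """Assemble the prompt an engine would send to the model after retrieval."""
    sources = retrieve(query, documents, k)
    context = "\n".join(
        f"[{i}] {url}: {documents[url]}" for i, url in enumerate(sources, 1)
    )
    return (
        "Answer using only these sources and cite them by number.\n"
        f"{context}\n\nQuestion: {query}"
    )


docs = {
    "https://example.com/crm": "best crm tools for small teams",
    "https://example.com/bread": "how to bake bread at home",
}
print(build_prompt("best crm tools", docs, k=1))
```

Only the retrieved documents reach the model, so a page that is not findable at query time cannot be cited, whatever its quality.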

Essential GEO metrics

Once the technical vocabulary is set, business KPIs become readable. Three families: visibility (you are cited), quality (you are cited well), conversion (you get clicked).

15. Citation

Mention of your site as a source in an AI answer, with or without a clickable link. According to Yext (6.8 million citations analyzed), citation is the indicator most correlated with measurable AI traffic.

16. Brand mention

Appearance of your brand name in an AI answer, even without a link. Ahrefs (March 2026, 75,000 brands) showed that unlinked brand mentions represent 60-70% of occurrences in B2B SaaS.

17. AI share of voice

Your share of citations across all citations in a semantic cluster (e.g. all CRM tools). Reference metric to benchmark against your competitors.
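
The share-of-citations calculation is a simple ratio. A minimal sketch, with invented brand names and counts:

```python
def share_of_voice(citations_by_brand, brand):
    """Citations for one brand divided by all citations in the cluster."""
    total = sum(citations_by_brand.values())
    if total == 0:
        return 0.0
    return citations_by_brand.get(brand, 0) / total


# Invented counts for a hypothetical "CRM tools" cluster.
cluster = {"acme": 12, "globex": 28, "initech": 10}
print(f"{share_of_voice(cluster, 'acme'):.0%}")  # 12 of 50 citations -> 24%
```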

18. AI Overview

Answer generated by Google at the top of SERPs, with cited sources. Sistrix measured 58% of French Google queries triggering an AI Overview in April 2026. Appearing in its sources is now a full-fledged SEO objective.

19. Citability

Qualitative score measuring the probability that a piece of content gets cited by an LLM. The ScoreGeo methodology evaluates 13 weighted criteria over 100 points to produce this score.

Off-page levers and authority

LLMs partly inherit classic SEO logic: authority, freshness, expertise signals. But they add a layer of their own, centered on semantic consistency between your third-party mentions.

20. Off-page authority

Set of external signals (backlinks, press mentions, consistent LinkedIn profiles, Wikipedia) that contribute to authority as perceived by an LLM. Working on off-page authority remains the least automatable and most durable lever.

21. E-E-A-T applied to LLMs

Experience, Expertise, Authoritativeness, Trustworthiness. Google's framework extends to LLMs: they favor sources with identified authors, public methodology, last-updated dates, and external counter-references.

22. AI co-citation

Simultaneous appearance of your brand with a reference brand in the same AI answer. Powerful indicator of positioning as perceived by models.

23. Wikipedia entity

Presence of a validated Wikipedia page for your brand or your founder. LLMs rely heavily on Wikipedia as an entity truth source.

Server-side technical practices

On the infrastructure side, three topics come up in every GEO audit: who can crawl what, at what frequency, and with what performance footprint.

24. Extended robots.txt

robots.txt file that explicitly distinguishes AI crawlers. 2026 convention: allow OAI-SearchBot, ClaudeBot, PerplexityBot, and decide on GPTBot based on your training strategy.
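
One way to apply this convention is to generate the file and check it with Python's standard `urllib.robotparser` before deploying. The sketch below is an assumption about how a team might script this; `build_robots_txt` and the example.com URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(allow_training: bool) -> str:
    """Allow the response crawlers; GPTBot depends on the training strategy."""
    groups = [
        f"User-agent: {bot}\nAllow: /"
        for bot in ("OAI-SearchBot", "ClaudeBot", "PerplexityBot")
    ]
    gptbot_rule = "Allow: /" if allow_training else "Disallow: /"
    groups.append(f"User-agent: GPTBot\n{gptbot_rule}")
    groups.append("User-agent: *\nAllow: /")
    return "\n\n".join(groups) + "\n"


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Evaluate a robots.txt body offline, without any network call."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


robots = build_robots_txt(allow_training=False)
print(robots)
print(can_fetch(robots, "GPTBot", "https://example.com/blog"))
print(can_fetch(robots, "OAI-SearchBot", "https://example.com/blog"))
```

Testing the generated file this way catches the common mistake of a blanket rule that silently blocks the response crawlers too.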

25. Sitemap freshness

lastmod field updated on every edit. AI crawlers, especially OAI-SearchBot, prioritize freshly modified URLs to refresh their answers.
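
A sitemap with accurate lastmod values can be produced from whatever timestamp your CMS stores. The sketch below assumes timezone-aware datetimes and uses the sitemaps.org namespace; the function name and the example.com URL are placeholders.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """pages: iterable of (url, last_modified) with timezone-aware datetimes."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, modified in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # Normalize to UTC and emit a W3C datetime string.
        stamp = modified.astimezone(timezone.utc).isoformat(timespec="seconds")
        ET.SubElement(url, "lastmod").text = stamp
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


edited = datetime(2026, 6, 3, 12, 0, tzinfo=timezone.utc)
print(build_sitemap([("https://example.com/glossary", edited)]))
```

The value should reflect a real content change: regenerating every lastmod on each deploy removes the signal.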

26. SSR (Server-Side Rendering)

Server-side rendering. Most AI crawlers execute little to no JavaScript. A fully client-side site risks serving an empty shell to LLMs.
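
A quick way to spot the empty-shell problem is to count the words present in the raw HTML, before any JavaScript runs. The sketch below works on an HTML string you have already fetched; the 50-word threshold is an arbitrary assumption, and the helper names are hypothetical.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, ignoring the contents of script and style tags."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def visible_text(html):
    """Text a non-JavaScript crawler would find in the raw response."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def looks_client_only(html, min_words=50):
    """Flag pages whose raw HTML carries almost no readable text."""
    return len(visible_text(html).split()) < min_words


shell = '<html><body><div id="root"></div><script>render()</script></body></html>'
print(looks_client_only(shell))
```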

27. Clean markdown

Simple HTML structure: hierarchical H1-H2-H3 headings, real lists (<ul>, <ol>), semantic tables. Such markup converts to clean markdown, which LLMs extract more reliably than cosmetic HTML.

For a full audit of these technical signals, see the ScoreGeo methodology detailing the 13 weighted criteria. If you prefer operational support, our team's GEO engagement covers the full scope in 6 weeks.

AI conversion indicators

Measuring the business impact of GEO requires dedicated metrics, distinct from classic SEO. Three are emerging as standards in 2026.

28. AI referer traffic

Visits identified as coming from an AI answer (referer ChatGPT, Perplexity, Claude, or UTM parameter added manually). Represents the directly attributable share of GEO.
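
If your analytics tool exports raw referrer URLs, tagging AI traffic comes down to a hostname lookup. The host list below is a starting assumption to adapt: it covers the ChatGPT, Perplexity and Claude domains, with chatgpt.com added alongside the older chat.openai.com; `ai_source` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlparse

# Starting list, to extend as engines add or change domains (assumption).
AI_REFERRER_HOSTS = {
    "chat.openai.com": "ChatGPT",
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "claude.ai": "Claude",
}


def ai_source(referrer: str) -> Optional[str]:
    """Return the AI engine behind a referrer URL, or None for other traffic."""
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return AI_REFERRER_HOSTS.get(host)


print(ai_source("https://www.perplexity.ai/search?q=best+crm"))
print(ai_source("https://www.google.com/"))
```

Many AI clients strip the referrer entirely, so this count is a floor rather than the full picture.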

29. Branded query lift

Increase in searches for your brand following AI exposure. Measurable via Search Console (impressions on branded queries). Often the first measurable signal of working GEO.
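
The lift itself is a relative change between two comparable periods. A minimal sketch, with invented impression counts standing in for a Search Console export:

```python
def branded_query_lift(impressions_before, impressions_after):
    """Relative change in branded impressions between two comparable periods."""
    if impressions_before <= 0:
        raise ValueError("the baseline period needs at least one impression")
    return (impressions_after - impressions_before) / impressions_before


# Invented figures: 1,200 branded impressions before, 1,500 after.
print(f"{branded_query_lift(1200, 1500):+.0%}")  # +25%
```

Comparing periods of equal length and similar seasonality keeps the figure from reflecting calendar effects rather than AI exposure.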

30. AI assist rate

Share of B2B conversions where the prospect mentions having discovered or validated your solution via an AI. Declarative indicator to add to demo forms or sales calls.

Thirty terms do not cover everything: GEO vocabulary keeps evolving each quarter. To stay current, subscribing to the ScoreGeo newsletter gives access to glossary updates and new concepts as they stabilize.

How to use this glossary

A glossary only serves to clarify conversations. Three concrete uses: before an audit, to align vocabulary with your vendor; during writing, to check that a brief uses the right terms; after a deployment, to measure what should be measured. If some terms remain unclear after reading, that is probably a sign that a manual GEO audit would deliver more value than self-training. The GEO vs SEO distinction, in particular, often deserves a dedicated session, so that trade-offs are framed correctly and classic GEO mistakes are avoided.

Frequently asked questions

What is the difference between GEO and classic SEO?

Classic SEO optimizes for ranking in search engine result pages (Google, Bing). GEO optimizes for citation by generative engines (ChatGPT, Claude, Perplexity, AI Overview). Levers overlap 60-70% (content quality, structure, authority) but GEO adds specifics: answer-first format, strict JSON-LD, AI crawler allowances, third-party brand mentions.

Should I block GPTBot to protect my content?

Blocking GPTBot prevents your content from training future GPT models but has no effect on your current citation in ChatGPT Search, which runs via OAI-SearchBot. For most B2B SaaS companies aiming to maximize AI visibility, blocking GPTBot is counterproductive. For a premium content publisher, blocking can be justified.

Is llms.txt mandatory in 2026?

No, llms.txt remains an emerging convention, not a standard enforced by major AI engines. Adoption is limited and its measured impact remains small. It costs almost nothing to deploy and can be a positive signal, but it is not a priority compared to clean JSON-LD and a well-configured robots.txt.

How do I measure if I am cited by ChatGPT?

Three methods: (1) query ChatGPT manually on 20-50 target prompts and log citations; (2) use a GEO monitoring tool that automates these queries; (3) analyze referer traffic in GA4 or Plausible (referer chatgpt.com or chat.openai.com, perplexity.ai, claude.ai). Manual querying remains the most reliable way to start.

How many glossary terms must I master to get started?

For a first GEO project, ten terms are enough: GEO, answer-first, JSON-LD, GPTBot, OAI-SearchBot, ClaudeBot, citation, brand mention, AI share of voice, RAG. The other twenty become useful as maturity grows, notably to benchmark against competitors or arbitrate technically.

Which terms are the newest in 2026?

Concepts that emerged in 2025-2026 are: OAI-SearchBot (distinct from GPTBot), AI share of voice as a formalized metric, AI co-citation, branded query lift attributed to GEO, and declarative AI assist rate in B2B. The fundamentals (answer-first, JSON-LD, citation) have been stable since 2023-2024.

Is this glossary kept up to date?

Yes, ScoreGeo revises this glossary each quarter to reflect vocabulary evolutions and new measurable signals. Recent additions are flagged at the top of the article. To be notified of updates, newsletter subscription remains the simplest path.