ScoreGeo

Schema.org for GEO: the 7 types that actually matter in 2026

Schema.org defines more than 800 types and well over a thousand properties, but LLMs use only a small fraction of them. Google Search Central's public documentation still recommends JSON-LD as the preferred syntax for structured data, and several recent studies show this is also what GPTBot, ClaudeBot and OAI-SearchBot parse most reliably. An Ahrefs analysis published in March 2026 covering 1,885 pages found that well-formed JSON-LD correlates positively with appearing in generative engine answers. The question is no longer whether you should ship Schema markup, but which types to prioritize. This article walks through the 7 types that produce measurable GEO effects, with ready-to-paste code examples and the typical traps to avoid.

Before diving into the 7 types, two structuring principles. First, LLMs do not read Schema.org the way Google does: Google uses structured data to build rich results that enrich the SERP, while LLMs use it as a trust and disambiguation signal. Second, JSON-LD has largely displaced Microdata and RDFa in modern usage: it is the format that GPTBot, ClaudeBot and other AI engine crawlers parse reliably.

The Ahrefs March 2026 study shows that among the 1,885 pages analyzed, those appearing most often in ChatGPT citations combine three traits: clean semantic HTML, JSON-LD aligned with the content, and established domain authority. Schema alone is not enough, but its absence hurts. To frame the broader approach, see the ScoreGeo methodology, which weights 13 criteria on a 100-point scale.

1. Article and the subtypes that structure your publications

Article and its subtypes NewsArticle, BlogPosting and TechArticle form the editorial markup foundation. They give LLMs clean metadata on author, publication date, modification date and topical category. This is also what lets Perplexity and Google AI Overview cite the source correctly.

Properties not to miss: headline (max 110 characters, must match the h1), datePublished and dateModified in ISO 8601, author as a Person or Organization object with its own @id, and publisher with logo. The classic mistake is setting datePublished once and never refreshing dateModified after an update, which makes current content look stale.

Minimal viable example:

{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Schema.org for GEO",
  "datePublished": "2026-06-03",
  "dateModified": "2026-06-03",
  "author": { "@type": "Organization", "name": "ScoreGeo" }
}
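If you generate this markup from a CMS or build script, the property rules above can be enforced at generation time. The following is a minimal Python sketch, not a definitive implementation: the example.com domain, the logo path and the build_article helper are assumptions made for illustration.

```python
import json
from datetime import date

SITE = "https://example.com"  # hypothetical domain, replace with your own
ORG_ID = f"{SITE}/#organization"  # hypothetical stable @id for the organization


def build_article(headline: str, published: date, modified: date) -> dict:
    """Build an Article JSON-LD dict and enforce the rules listed above."""
    if len(headline) > 110:
        raise ValueError("headline exceeds 110 characters")
    if modified < published:
        raise ValueError("dateModified cannot precede datePublished")
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        # date.isoformat() produces ISO 8601 strings such as 2026-06-03
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
        "author": {"@type": "Organization", "@id": ORG_ID, "name": "ScoreGeo"},
        "publisher": {
            "@type": "Organization",
            "@id": ORG_ID,
            "name": "ScoreGeo",
            "logo": {"@type": "ImageObject", "url": f"{SITE}/logo.png"},
        },
    }


article = build_article("Schema.org for GEO", date(2026, 6, 3), date(2026, 6, 3))
print(json.dumps(article, indent=2))
```

Passing the modification date as a required argument is a deliberate choice here: the caller cannot publish an update without deciding what dateModified should be.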

2. FAQPage, the direct citation accelerator

FAQPage is the most profitable type for GEO in 2026. LLMs love compact question-answer pairs because they exactly match the answer-first format they return to users. A Semrush study of 150,000 ChatGPT citations confirms that pages with structured FAQs surface disproportionately as sources.

Three non-negotiable rules. Rule 1: the question in the JSON-LD must match the visible question word for word. Rule 2: the answer must be self-contained and readable out of context, ideally 100 to 300 characters. Rule 3: only mark up real FAQs, never turn a paragraph into a fake Q&A. Google penalizes this, and LLMs do too: FAQPage content that abuses the markup disappears from citations.
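The first two rules can be checked mechanically before the markup ships. Here is a hedged Python sketch: the build_faq_page helper and the way visible questions are passed in are assumptions, and the 100 to 300 character window is simply the guideline stated above, not a limit imposed by Schema.org.

```python
import json


def build_faq_page(pairs, visible_questions):
    """Return (FAQPage markup, list of rule violations).

    pairs: (question, answer) tuples destined for the JSON-LD.
    visible_questions: the question strings actually rendered on the page.
    """
    problems = []
    entities = []
    for question, answer in pairs:
        if question not in visible_questions:
            problems.append(f"question not visible on page: {question!r}")
        if not 100 <= len(answer) <= 300:
            problems.append(f"answer is {len(answer)} characters: {question!r}")
        entities.append({
            "@type": "Question",
            "name": question,
            "acceptedAnswer": {"@type": "Answer", "text": answer},
        })
    markup = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": entities,
    }
    return markup, problems


visible = ["Does Schema.org markup compensate for weak content?"]
pairs = [(
    "Does Schema.org markup compensate for weak content?",
    "No, never. Schema is a structure and trust signal, not a substitute for "
    "editorial quality. Schema amplifies good content, it does not create it.",
)]
faq, problems = build_faq_page(pairs, visible)
print(json.dumps(faq, indent=2))
print(problems)
```

Rule 3 stays a human judgment: no script can tell whether a question is one real users actually ask.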

To go beyond markup, also look at your answer-first strategy across the entire site, which amplifies the FAQ effect.

3. HowTo, the underused tutorial type

HowTo marks up step-by-step procedural content. Although Google restricted its SERP display in 2023, LLMs continue to actively exploit it for queries starting with "how", "how to" or "tutorial". The step + name + text + image structure closely mirrors how ChatGPT and Claude format their procedural answers.

Common traps: HowTo without numbered steps identifiable on the page, or with steps that are too generic. Each step.text must be a concrete, executable instruction, not a generality. If your tutorial discusses implementing Schema.org, write "Paste the JSON-LD block into the page head", not "Set up the structured data".
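A short Python sketch of what that looks like when the steps are generated from the same list that renders the page. The tutorial title, the step wording other than the example quoted above, and the build_howto helper are hypothetical.

```python
import json


def build_howto(name, steps):
    """steps: (step name, instruction text) pairs in the order shown on the page."""
    items = []
    for position, (step_name, text) in enumerate(steps, start=1):
        if not text.strip():
            raise ValueError(f"step {position} has no instruction text")
        items.append({
            "@type": "HowToStep",
            "position": position,  # matches the visible step number
            "name": step_name,
            "text": text,
        })
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "step": items,
    }


howto = build_howto(
    "Add JSON-LD to a page",  # hypothetical tutorial title
    [
        ("Write the markup", "Write a JSON-LD object that describes the page as an Article."),
        ("Insert it", "Paste the JSON-LD block into the page head."),
        ("Validate", "Run the page URL through Google Rich Results Test."),
    ],
)
print(json.dumps(howto, indent=2))
```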

4. Product and Offer, essential for e-commerce and SaaS

Product structures the markup for product pages, services and software subscriptions. For GEO in 2026, this type is central because LLMs increasingly answer comparative commercial queries like "best tool for", "alternative to" or "X vs Y". Without proper Product markup, you do not make the shortlist.

Key properties: name, description, brand, offers with price and priceCurrency, aggregateRating if you have legitimate reviews, and sku. For B2B SaaS specifically, never invent aggregateRating: it is the structured data fraud Google audits most closely, and the riskiest. If you want to accelerate compliance, our GEO support package includes an audit and Product markup correction across the entire catalog: see /accompagnement#form-sprint_geo.
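One way to make the aggregateRating rule hard to break is to emit the property only when real review figures are supplied. The sketch below assumes a hypothetical product, price and Organization @id; none of these values come from a real catalog.

```python
import json

ORG_ID = "https://example.com/#organization"  # hypothetical Organization @id


def build_product(name, description, sku, price, currency, rating=None):
    """rating: optional (average value, review count) taken from real reviews."""
    product = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "sku": sku,
        "brand": {"@id": ORG_ID},
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
        },
    }
    # No rating data means no aggregateRating key at all, never a placeholder.
    if rating is not None:
        value, count = rating
        product["aggregateRating"] = {
            "@type": "AggregateRating",
            "ratingValue": value,
            "reviewCount": count,
        }
    return product


product = build_product(
    "Example Plan",  # hypothetical SaaS subscription
    "Monthly subscription to an example analytics tool.",
    "EX-PLAN-M",
    49,
    "EUR",
)
print(json.dumps(product, indent=2))
```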

5. Organization, the site identity card

Organization is underrated for GEO. It is what lets LLMs correctly associate your site with your brand, your domain, your social profiles and your legal entity. It is also the foundation of the knowledge graph ChatGPT, Claude and Gemini build in the background.

Place it on the homepage with a stable @id that is reusable everywhere else, then reference that @id in every Article.publisher, Product.brand and so on. Properties to complete: name, url, logo (as an absolute URL), sameAs with official LinkedIn and Twitter profiles, and a short description. For an early-stage brand, this identity signal is exactly what helps LLMs avoid confusing you with a namesake.
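A minimal Python sketch of the stable @id pattern follows. The domain, logo path, profile URLs and description are placeholders, and org_ref is a hypothetical helper, not part of any library.

```python
import json

ORG_ID = "https://example.com/#organization"  # hypothetical, keep it stable over time

# Full Organization node, emitted once on the homepage.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": ORG_ID,
    "name": "ScoreGeo",
    "url": "https://example.com/",
    "logo": "https://example.com/logo.png",  # absolute URL
    "sameAs": [
        "https://www.linkedin.com/company/example",  # placeholder profile
        "https://twitter.com/example",  # placeholder profile
    ],
    "description": "One-sentence description of the organization.",
}


def org_ref():
    """Return a bare reference to the Organization node declared above."""
    return {"@id": ORG_ID}


# Any other page points back to the same entity instead of redefining it.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Schema.org for GEO",
    "publisher": org_ref(),
}

print(json.dumps(organization, indent=2))
print(json.dumps(article, indent=2))
```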

6. BreadcrumbList, the navigation skeleton

BreadcrumbList is a discreet but useful type. It helps LLMs understand your site hierarchy, which improves citation extract relevance and perceived source reliability. Vercel and MERJ analyzed 500 million GPTBot fetches and confirm that AI crawlers actively follow structured navigation patterns.

Simple implementation: one BreadcrumbList per non-root page, with itemListElement containing position, name and item. Keep strict consistency between markup and the visible breadcrumb. A common inconsistency: breadcrumbs visible at two levels but JSON-LD at three levels, or the reverse. Avoid this.
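The simplest way to keep the two in sync is to render the visible trail and the markup from one list. The Python sketch below assumes hypothetical URLs and a plain-text trail; in a real template the same list would feed the HTML breadcrumb.

```python
import json


def build_breadcrumbs(crumbs):
    """crumbs: ordered (name, absolute URL) pairs, exactly as shown on the page."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(crumbs, start=1)
        ],
    }


# Single source of truth for both the visible breadcrumb and the JSON-LD.
crumbs = [
    ("Home", "https://example.com/"),
    ("Blog", "https://example.com/blog/"),
    ("Schema.org for GEO", "https://example.com/blog/schema-org-geo/"),
]
visible_trail = " > ".join(name for name, _ in crumbs)
breadcrumbs = build_breadcrumbs(crumbs)

print(visible_trail)
print(json.dumps(breadcrumbs, indent=2))
```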

7. Dataset, the differentiating asset for data sites

Dataset is rarely implemented, which is precisely why it is powerful in GEO. If your site publishes original numeric data, benchmarks or studies, marking these resources as Dataset creates strong topical authority. LLMs love citing sources that explicitly identify themselves as data producers.

Essential properties: name, description, creator (linked to your Organization @id), license (CC-BY or proprietary), distribution with URL and encodingFormat. Even for an editorial site, if you publish original charts or aggregated statistics, the Dataset type is relevant. It is a depth signal few competitors emit.
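As an illustration, here is a hedged Python sketch of a Dataset node with a small completeness check. The dataset name, file URL and Organization @id are invented for the example, and the REQUIRED tuple reflects the list above rather than any official Schema.org requirement.

```python
import json

ORG_ID = "https://example.com/#organization"  # hypothetical Organization @id
REQUIRED = ("name", "description", "creator", "license", "distribution")


def missing_properties(dataset):
    """List the properties from the checklist above that the node lacks."""
    return [key for key in REQUIRED if not dataset.get(key)]


dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example benchmark of structured data adoption",  # hypothetical
    "description": "Aggregated counts of Schema.org types found on sampled pages.",
    "creator": {"@id": ORG_ID},
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "distribution": [
        {
            "@type": "DataDownload",
            "contentUrl": "https://example.com/data/benchmark.csv",
            "encodingFormat": "text/csv",
        }
    ],
}

print(missing_properties(dataset))
print(json.dumps(dataset, indent=2))
```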

Schema.org types you can safely ignore for GEO

Almost everything else is noise for GEO in 2026. WebSite, WebPage, Thing and generic CreativeWork bring nothing measurable beyond what LLMs already infer from the HTML. SiteNavigationElement, an isolated ImageObject or a contextless VideoObject do not trigger more citations. The discipline is to concentrate effort on the 7 types above rather than marking up everything for completeness' sake.

One last critical point. Structured data must always exactly reflect the visible content. The rule Google Search Central has held since 2017 still applies and now extends to LLMs: markup that lies or exaggerates results in exclusion from sources. This is one of the most frequent GEO mistakes we see on otherwise well-structured sites.

Frequently asked questions

Which Schema.org format should I use, JSON-LD, Microdata or RDFa?

JSON-LD exclusively in 2026. Google Search Central has recommended it since 2015, and GPTBot, ClaudeBot and OAI-SearchBot parse JSON-LD reliably. Microdata and RDFa remain valid but are rapidly losing adoption and carry a higher risk of implementation bugs.

Should JSON-LD go in the head or the body?

Both work. The head is the historically recommended location, but Google and LLMs also accept JSON-LD injected at the end of the body, which helps with modern JavaScript frameworks like Next.js or Nuxt. Just avoid injecting it after a user event, since crawlers do not trigger those interactions.

Is FAQPage penalized now that Google reduced its SERP display?

No. Google reduced FAQ rich results display in 2023, but the markup is still indexed and remains valuable for GEO. LLMs actively use FAQPage to compose their answers, independently of Google display. It has even become a differentiating advantage since many sites removed their markup out of frustration.

How many Schema.org types can I add on a single page?

As many as relevant. A page can combine Article, BreadcrumbList, FAQPage and Organization without issue, either as separate JSON-LD blocks or within a single @graph array. Prefer @graph when entities cross-reference each other via @id.
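To make the @graph option concrete, here is a Python sketch that assembles three nodes and checks that every bare @id reference resolves to a node declared in the same graph. All URLs and the find_refs and dangling_refs helpers are hypothetical.

```python
import json

ORG_ID = "https://example.com/#organization"  # hypothetical
PAGE = "https://example.com/blog/schema-org-geo/"  # hypothetical

doc = {
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "Organization", "@id": ORG_ID, "name": "ScoreGeo"},
        {
            "@type": "Article",
            "@id": f"{PAGE}#article",
            "headline": "Schema.org for GEO",
            "author": {"@id": ORG_ID},
            "publisher": {"@id": ORG_ID},
        },
        {
            "@type": "BreadcrumbList",
            "@id": f"{PAGE}#breadcrumb",
            "itemListElement": [
                {"@type": "ListItem", "position": 1, "name": "Home", "item": "https://example.com/"},
                {"@type": "ListItem", "position": 2, "name": "Schema.org for GEO", "item": PAGE},
            ],
        },
    ],
}


def find_refs(value):
    """Yield every @id used as a bare reference, i.e. a dict holding only @id."""
    if isinstance(value, dict):
        if set(value) == {"@id"}:
            yield value["@id"]
        else:
            for child in value.values():
                yield from find_refs(child)
    elif isinstance(value, list):
        for item in value:
            yield from find_refs(item)


def dangling_refs(document):
    """Return referenced @id values that no node in @graph declares."""
    declared = {node["@id"] for node in document["@graph"] if "@id" in node}
    return sorted(set(find_refs(document["@graph"])) - declared)


print(dangling_refs(doc))
print(json.dumps(doc, indent=2))
```

A reference to an entity declared on another page, such as the homepage Organization, would show up as dangling here. That is legitimate JSON-LD, so treat the output as a prompt to double-check, not as an error.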

Does Schema.org markup compensate for weak content?

No, never. Schema is a structure and trust signal, not a substitute for editorial quality. If your content is thin or imprecise, LLMs will not cite it more because it is well marked up. Schema amplifies good content, it does not create it.

How do I verify my JSON-LD is correctly parsed by LLMs?

Three complementary checks. Google Rich Results Test confirms Google-side validity. Schema Markup Validator (validator.schema.org) checks strict conformance with the standard. For LLMs there is no official tool, but ScoreGeo audits the presence and consistency of the priority types as part of its GEO scoring.
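Before reaching for those tools, a quick local pre-check can confirm that the JSON-LD is present in the static HTML and parses as JSON. This sketch uses only the Python standard library; the JsonLdExtractor class and the sample HTML are illustrative, and it says nothing about how any particular crawler actually behaves.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect and parse every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []  # successfully parsed JSON-LD payloads
        self.errors = []  # parse error messages for broken blocks

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError as exc:
                self.errors.append(str(exc))


# Sample page: one valid block and one with a trailing comma (invalid JSON).
html = """<html><head>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Article", "headline": "Schema.org for GEO"}</script>
<script type="application/ld+json">{"@type": "FAQPage",}</script>
</head><body><h1>Schema.org for GEO</h1></body></html>"""

extractor = JsonLdExtractor()
extractor.feed(html)
extractor.close()

types = [block.get("@type") for block in extractor.blocks]
print(types)
print(extractor.errors)
```

Because it reads the raw HTML without executing JavaScript, this check also flags markup that only appears after client-side rendering or a user event.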

Should I mark up category and tag pages with a specific type?

CollectionPage is technically available, but its GEO impact is low. On those pages, prefer Organization plus BreadcrumbList plus, if relevant, an ItemList of included articles or products. Avoid marking these pages as Article, which would be bad practice.
