Methodology

ScoreGeo analyzes your URL server-side: we fetch your raw HTML, your /robots.txt and your /llms.txt, then evaluate 9 weighted criteria totaling 100 points.

The 9 criteria and their weights

Server rendering: 20 points
We fetch the raw HTML without running JavaScript and count the visible words. A single-page application (SPA) whose content appears only after client-side rendering is nearly invisible to LLM crawlers.
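As a rough illustration of the word count, here is a minimal sketch using Python's standard `html.parser`. The `HIDDEN_TAGS` list and the function names are assumptions made for the example; they are not ScoreGeo's actual implementation.

```python
from html.parser import HTMLParser

# Assumed list of tags whose text a reader never sees in the page body.
HIDDEN_TAGS = {"script", "style", "noscript", "template", "title"}


class VisibleTextParser(HTMLParser):
    """Counts whitespace-separated words outside hidden tags."""

    def __init__(self):
        super().__init__()
        self.hidden_depth = 0
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag in HIDDEN_TAGS:
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in HIDDEN_TAGS and self.hidden_depth > 0:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if self.hidden_depth == 0:
            self.words += len(data.split())


def count_visible_words(raw_html: str) -> int:
    parser = VisibleTextParser()
    parser.feed(raw_html)
    parser.close()
    return parser.words
```

An empty `<div id="root"></div>` shell with a JavaScript bundle yields a count of zero, which is the situation this criterion is meant to catch.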
Structured data (JSON-LD): 20 points
We extract every <script type="application/ld+json"> block, traversing nested @graph arrays recursively. Bonus for the types AI systems use most: FAQPage, HowTo, LocalBusiness, Product.
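The extraction can be sketched as follows, assuming Python's standard library only. The class and function names are hypothetical, and the way the bonus is weighted is not specified here, so the sketch only collects the declared types.

```python
import json
from html.parser import HTMLParser

BONUS_TYPES = {"FAQPage", "HowTo", "LocalBusiness", "Product"}


class JsonLdParser(HTMLParser):
    """Collects the raw text of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self.in_jsonld = True
            self.buffer = []

    def handle_data(self, data):
        if self.in_jsonld:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self.in_jsonld:
            self.in_jsonld = False
            self.blocks.append("".join(self.buffer))


def collect_types(node, found):
    """Walks lists, objects and nested @graph arrays, gathering @type values."""
    if isinstance(node, list):
        for item in node:
            collect_types(item, found)
    elif isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            found.add(declared)
        elif isinstance(declared, list):
            found.update(t for t in declared if isinstance(t, str))
        collect_types(node.get("@graph", []), found)


def jsonld_types(raw_html: str) -> set:
    parser = JsonLdParser()
    parser.feed(raw_html)
    parser.close()
    found = set()
    for block in parser.blocks:
        try:
            collect_types(json.loads(block), found)
        except json.JSONDecodeError:
            continue  # invalid JSON-LD is skipped in this sketch
    return found
```

Intersecting the result with `BONUS_TYPES` gives the types that would earn the bonus.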
Answer-first structure: 10 points
We look for the first meaningful paragraph inside <main> or <article>, or after the <h1>. Sweet spot: 15-80 words, ≤ 600 characters. This is the format LLMs extract as a direct citation.
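Once the candidate paragraph has been located, the sweet-spot test itself is small. The sketch below assumes the paragraph text is already extracted and that whitespace is collapsed before counting; both the function name and that normalization step are assumptions.

```python
def in_sweet_spot(paragraph: str) -> bool:
    """True when the paragraph has 15-80 words and at most 600 characters."""
    text = " ".join(paragraph.split())  # collapse runs of whitespace
    word_count = len(text.split())
    return 15 <= word_count <= 80 and len(text) <= 600
```

Note that both limits apply together: 80 long words can still exceed the 600-character cap.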
Content freshness: 10 points
We read dateModified and datePublished from JSON-LD, plus <time datetime> tags. A page less than 6 months old is rewarded; a page older than 36 months is penalized.
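A possible reading of these thresholds, sketched in Python. Several points are assumptions: the most recent date found is the one that counts, age is measured in whole calendar months, and the labels "fresh", "neutral", "stale" and "unknown" are invented for the example. How the 6-36 month range is scored is not stated above.

```python
from datetime import date, datetime


def parse_date(value: str) -> date:
    # Accepts "2025-03-01" and "2025-03-01T10:00:00Z" style values.
    return datetime.fromisoformat(value.strip().replace("Z", "+00:00")).date()


def age_in_months(published: date, today: date) -> int:
    months = (today.year - published.year) * 12 + (today.month - published.month)
    if today.day < published.day:
        months -= 1  # the current month is not complete yet
    return max(months, 0)


def freshness_label(dates: list[str], today: date) -> str:
    parsed = []
    for value in dates:
        try:
            parsed.append(parse_date(value))
        except ValueError:
            continue  # unparseable dates are ignored in this sketch
    if not parsed:
        return "unknown"
    months = age_in_months(max(parsed), today)
    if months < 6:
        return "fresh"
    if months > 36:
        return "stale"
    return "neutral"
```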
Semantic HTML: 10 points
We check for a single <h1>, a coherent heading hierarchy (no skipped level such as h2→h4), a single <main>, and the presence of <article> or <section>.
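These four checks can be sketched with a single pass over the start tags. The names below are illustrative, and treating a "jump" as any increase of more than one heading level is an assumption about how the hierarchy rule is applied.

```python
from html.parser import HTMLParser


class StructureParser(HTMLParser):
    """Records heading levels in document order and counts landmark tags."""

    def __init__(self):
        super().__init__()
        self.heading_levels = []
        self.counts = {"h1": 0, "main": 0, "article": 0, "section": 0}

    def handle_starttag(self, tag, attrs):
        if tag in self.counts:
            self.counts[tag] += 1
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.heading_levels.append(int(tag[1]))


def semantic_checks(raw_html: str) -> dict:
    parser = StructureParser()
    parser.feed(raw_html)
    parser.close()
    levels = parser.heading_levels
    # Going deeper by more than one level at a time counts as a jump.
    no_jump = all(b - a <= 1 for a, b in zip(levels, levels[1:]))
    return {
        "single_h1": parser.counts["h1"] == 1,
        "no_heading_jump": no_jump,
        "single_main": parser.counts["main"] == 1,
        "has_article_or_section": parser.counts["article"] + parser.counts["section"] > 0,
    }
```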
SEO metadata: 10 points
We check the title (30-65 characters), the meta description (70-160 characters), the canonical link, OpenGraph tags (title/description/image/type) and the Twitter Card. LLMs also use these signals to summarize a page.
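For illustration, the metadata checks might look like the sketch below. The report keys and class name are hypothetical, and lengths are measured on the stripped text, which is an assumption.

```python
from html.parser import HTMLParser

OPEN_GRAPH_KEYS = ["og:title", "og:description", "og:image", "og:type"]


class MetaParser(HTMLParser):
    """Collects the title text, meta name/property values and the canonical link."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.meta = {}
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            key = attributes.get("name") or attributes.get("property")
            if key and attributes.get("content"):
                self.meta[key.lower()] = attributes["content"]
        elif tag == "link":
            rel_tokens = (attributes.get("rel") or "").lower().split()
            if "canonical" in rel_tokens:
                self.canonical = attributes.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def metadata_checks(raw_html: str) -> dict:
    parser = MetaParser()
    parser.feed(raw_html)
    parser.close()
    title = parser.title.strip()
    description = parser.meta.get("description", "").strip()
    return {
        "title_length_ok": 30 <= len(title) <= 65,
        "description_length_ok": 70 <= len(description) <= 160,
        "has_canonical": bool(parser.canonical),
        "missing_open_graph": [k for k in OPEN_GRAPH_KEYS if k not in parser.meta],
        "has_twitter_card": "twitter:card" in parser.meta,
    }
```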
AI crawler access: 10 points
We parse /robots.txt and check access for the 6 major AI crawlers: GPTBot, ChatGPT-User, Google-Extended (Gemini & AI Overviews), ClaudeBot, PerplexityBot, CCBot.
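A minimal sketch of this check with Python's standard `urllib.robotparser`. Testing access against the site root, and the `ai_crawler_access` name, are assumptions. The standard parser also does not interpret wildcard patterns in paths, so a production check may need something stricter.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = [
    "GPTBot",
    "ChatGPT-User",
    "Google-Extended",
    "ClaudeBot",
    "PerplexityBot",
    "CCBot",
]


def ai_crawler_access(robots_txt: str, url: str = "https://example.com/") -> dict[str, bool]:
    """Maps each AI crawler name to whether robots.txt lets it fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_CRAWLERS}
```

With a file that disallows only GPTBot and allows everyone else, the result is `False` for GPTBot and `True` for the other five crawlers.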
Entity richness: 5 points
We count numbers, lists, tables, definition lists and FAQ-style questions ("How…?", "Why…?"). The denser the content is in such entities, the more citable it is.
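The counters could be approximated as below. The regular expressions, the list of question words and the function name are assumptions chosen for the example. For brevity the sketch also scans all text nodes, including script contents, which a real implementation would probably exclude.

```python
import re
from html.parser import HTMLParser

COUNTED_TAGS = ("ul", "ol", "table", "dl")
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")
QUESTION_RE = re.compile(r"\b(?:How|Why|What|When|Where|Who)\b[^.?!]*\?")


class EntityParser(HTMLParser):
    """Counts list/table tags and gathers text for the regex counters."""

    def __init__(self):
        super().__init__()
        self.tag_counts = {tag: 0 for tag in COUNTED_TAGS}
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.tag_counts:
            self.tag_counts[tag] += 1

    def handle_data(self, data):
        self.text_parts.append(data)


def entity_counts(raw_html: str) -> dict:
    parser = EntityParser()
    parser.feed(raw_html)
    parser.close()
    text = " ".join(parser.text_parts)
    return {
        "numbers": len(NUMBER_RE.findall(text)),
        "questions": len(QUESTION_RE.findall(text)),
        "lists": parser.tag_counts["ul"] + parser.tag_counts["ol"],
        "tables": parser.tag_counts["table"],
        "definition_lists": parser.tag_counts["dl"],
    }
```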
llms.txt file: 5 points
We check for a /llms.txt file at the site root (an emerging standard proposed by Jeremy Howard) that points LLMs to your priority Markdown content.

Score calculation

Each criterion returns a sub-score normalized between 0 and 1, which is multiplied by its weight. The sum gives the global score out of 100. A severity (passed / to improve / failed) is derived from each criterion's ratio:

≥ 0.85 → passed
≥ 0.40 → to improve
< 0.40 → failed
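The calculation can be sketched in a few lines of Python. The dictionary keys are hypothetical identifiers for the 9 criteria. Clamping sub-scores to [0, 1] and treating a missing criterion as 0 are assumptions, and any rounding applied to the displayed score is not covered.

```python
WEIGHTS = {
    "server_rendering": 20,
    "structured_data": 20,
    "answer_first": 10,
    "content_freshness": 10,
    "semantic_html": 10,
    "seo_metadata": 10,
    "ai_crawler_access": 10,
    "entity_richness": 5,
    "llms_txt": 5,
}


def severity(ratio: float) -> str:
    """Maps a criterion's 0-1 ratio to its severity label."""
    if ratio >= 0.85:
        return "passed"
    if ratio >= 0.40:
        return "to improve"
    return "failed"


def global_score(sub_scores: dict[str, float]) -> float:
    """Weighted sum of the sub-scores, out of 100."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        ratio = min(max(sub_scores.get(name, 0.0), 0.0), 1.0)
        total += weight * ratio
    return total
```

For example, a page scoring 1.0 on server rendering and 0.5 on structured data, with nothing else, totals 20 + 10 = 30 points.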

Global rating

Excellent: ≥ 80
Good: 60-79
Average: 40-59
At risk: < 40
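Expressed as code, the rating bands reduce to a threshold lookup. This sketch assumes each band's lower bound is inclusive and that a fractional score such as 79.5 falls in the lower band; the function name is illustrative.

```python
def global_rating(score: float) -> str:
    """Maps a global score out of 100 to its rating label."""
    if score >= 80:
        return "Excellent"
    if score >= 60:
        return "Good"
    if score >= 40:
        return "Average"
    return "At risk"
```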

Acknowledged limits

We read the HTML as served, without executing JavaScript. This is deliberate: it is what ChatGPT, Claude and Perplexity see during their crawl.
The heuristics are fast and readable, but they are not absolute truth. A site can have a poor score and still be widely cited (rare but possible).
The "are you cited by AI?" module measures actual brand citation by Claude. It is an optional add-on, activated by entering brand, sector and city, and it requires a server-side Anthropic key.
