ScoreGeo

Methodology

ScoreGeo analyzes your URL server-side: we fetch your raw HTML, your /robots.txt and your /llms.txt, then evaluate 9 weighted criteria totaling 100 points.
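
As an illustration of the three fetches described above, the resource URLs can be derived from the page URL. This is a minimal sketch, not our actual fetcher: the function name `resources_to_fetch` and the dictionary keys are hypothetical, and no network request is made here.

```python
from urllib.parse import urljoin


def resources_to_fetch(page_url: str) -> dict[str, str]:
    """Return the three URLs read for one analysis (hypothetical helper)."""
    return {
        # The page itself, fetched as raw HTML.
        "html": page_url,
        # Both files are resolved against the site root, not the page path.
        "robots": urljoin(page_url, "/robots.txt"),
        "llms": urljoin(page_url, "/llms.txt"),
    }


targets = resources_to_fetch("https://example.com/blog/post?ref=home")
```

A leading slash in the second argument of `urljoin` replaces the whole path and query of the page URL, so both files are looked up at the root of the host.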

The 9 criteria and their weights

  • 20

    Server rendering

    We fetch the raw HTML without running JavaScript and count the visible words. A single-page application (SPA) whose content appears only after client-side rendering is nearly invisible to LLMs.

  • 20

    Structured data (JSON-LD)

    We extract every <script type="application/ld+json"> block, including nested @graph arrays, which we traverse recursively. Bonus for the types AI systems use most: FAQPage, HowTo, LocalBusiness, Product.

  • 10

    Answer-first structure

    We look for the first meaningful paragraph inside <main> or <article>, or after the <h1>. Sweet spot: 15-80 words, ≤ 600 characters. This is the format LLMs extract as a direct citation.

  • 10

    Content freshness

    We read the JSON-LD dateModified and datePublished properties and the <time datetime> tags. A page less than 6 months old scores well; a page more than 36 months old is penalized.

  • 10

    Semantic HTML

    A single <h1>, a coherent heading hierarchy (no skipped levels, such as h2 directly to h4), a single <main>, and the presence of <article> or <section>.

  • 10

    SEO metadata

    title (30-65 chars), meta description (70-160 chars), canonical, OpenGraph (title/description/image/type), Twitter Card. LLMs also use these signals to summarize the page.

  • 10

    AI crawler access

    Parsing of /robots.txt. We check access for the 6 major AI crawlers: GPTBot, ChatGPT-User, Google-Extended (Gemini & AI Overviews), ClaudeBot, PerplexityBot, CCBot.

  • 5

    Entity richness

    We count numbers, lists, tables, definition lists and FAQ-style questions ("How…?", "Why…?"). The denser the content is in such entities, the more citable it is.

  • 5

    llms.txt file

    Presence of a /llms.txt file at root (emerging standard proposed by Jeremy Howard) that points LLMs to your priority Markdown content.
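
To make one of the criteria above concrete, here is a minimal sketch of the AI crawler access check using Python's standard `urllib.robotparser`. It is an assumption about how such a check could be written, not our production code: the helper names are hypothetical, and turning the result into a sub-score as "share of crawlers allowed" is an illustrative choice.

```python
from urllib.robotparser import RobotFileParser

# The six user agents named in the "AI crawler access" criterion.
AI_CRAWLERS = [
    "GPTBot", "ChatGPT-User", "Google-Extended",
    "ClaudeBot", "PerplexityBot", "CCBot",
]


def crawler_access(robots_txt: str, page_url: str) -> dict[str, bool]:
    """For each AI crawler, report whether robots.txt allows fetching page_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, page_url) for agent in AI_CRAWLERS}


def access_sub_score(access: dict[str, bool]) -> float:
    """Hypothetical normalization: fraction of crawlers allowed, between 0 and 1."""
    return sum(access.values()) / len(access)


# Sample robots.txt that blocks GPTBot and allows everyone else.
sample_robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

access = crawler_access(sample_robots, "https://example.com/pricing")
sub_score = access_sub_score(access)
```

With this sample file, GPTBot is blocked and the five other crawlers fall back to the `*` rule, so five of six are allowed.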

Score calculation

Each criterion returns a sub-score normalized between 0 and 1, multiplied by its weight. The sum gives the global score out of 100. A severity (passed / to improve / failed) is derived from the ratio obtained:

  • ≥ 0.85 → passed
  • ≥ 0.40 and < 0.85 → to improve
  • < 0.40 → failed
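
The calculation above can be sketched as follows. The weights and thresholds come from this page; the key names, function names and the example sub-scores are hypothetical, and the sketch does not round the result.

```python
# Weights from the criteria list; the key names are illustrative.
WEIGHTS = {
    "server_rendering": 20,
    "structured_data": 20,
    "answer_first": 10,
    "freshness": 10,
    "semantic_html": 10,
    "seo_metadata": 10,
    "ai_crawler_access": 10,
    "entity_richness": 5,
    "llms_txt": 5,
}


def severity(ratio: float) -> str:
    """Map a normalized sub-score (0 to 1) to its severity label."""
    if ratio >= 0.85:
        return "passed"
    if ratio >= 0.40:
        return "to improve"
    return "failed"


def global_score(sub_scores: dict[str, float]) -> float:
    """Weighted sum of the normalized sub-scores, out of 100."""
    return sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)


# Made-up sub-scores for one page, for illustration only.
example = {
    "server_rendering": 1.0,
    "structured_data": 0.5,
    "answer_first": 0.8,
    "freshness": 0.0,
    "semantic_html": 1.0,
    "seo_metadata": 0.9,
    "ai_crawler_access": 1.0,
    "entity_richness": 0.4,
    "llms_txt": 0.0,
}

total = global_score(example)
labels = {name: severity(ratio) for name, ratio in example.items()}
```

For these made-up values the weighted contributions are 20 + 10 + 8 + 0 + 10 + 9 + 10 + 2 + 0, which adds up to 69 out of 100.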

Global rating

  • Excellent: ≥ 80
  • Good: 60-79
  • Average: 40-59
  • At risk: < 40
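
The rating bands above can be expressed as a small lookup. This is a sketch with a hypothetical function name; it assumes that a fractional score between two bands (for example 79.5) belongs to the lower band, which this page does not specify.

```python
def global_rating(score: float) -> str:
    """Map a global score out of 100 to its rating label."""
    if score >= 80:
        return "Excellent"
    if score >= 60:
        return "Good"
    if score >= 40:
        return "Average"
    return "At risk"


ratings = [global_rating(s) for s in (92, 69, 40, 12)]
```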

Acknowledged limits

  • We read the HTML as served, without running JavaScript. This is deliberate: it is what ChatGPT, Claude and Perplexity see during their crawl.
  • Heuristics are fast and readable, not an absolute truth. A site can have a technically poor score and still be widely cited (rare but possible).
  • The "are you cited by AI?" module measures actual brand citation by Claude. It is an optional add-on, activated by entering a brand, sector and city, and it requires a server-side Anthropic key.
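
To illustrate the first limit, here is a minimal sketch of counting visible words in raw HTML without executing any script, using Python's standard `html.parser`. The class and function names are hypothetical, and the set of skipped tags is an assumption. Text in <head>, such as <title>, is also counted in this simplified version.

```python
from html.parser import HTMLParser


class VisibleTextCounter(HTMLParser):
    """Counts words in text nodes, ignoring script, style and template content."""

    SKIPPED = {"script", "style", "template"}  # assumed list of non-visible tags

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only count text that sits outside the skipped tags.
        if self._skip_depth == 0:
            self.words += len(data.split())


def count_visible_words(raw_html: str) -> int:
    counter = VisibleTextCounter()
    counter.feed(raw_html)
    counter.close()
    return counter.words


# An empty SPA shell versus a server-rendered page.
spa_shell = '<html><body><div id="root"></div><script>render()</script></body></html>'
server_rendered = "<html><body><main><h1>Pricing plans</h1><p>Three plans, billed monthly.</p></main></body></html>"

spa_words = count_visible_words(spa_shell)
rendered_words = count_visible_words(server_rendered)
```

The SPA shell yields no visible words because its only text lives inside a <script> tag, while the server-rendered page exposes its heading and paragraph directly.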
