HTML Accessibility
Checks whether AI search engines like ChatGPT and Claude can access your website content
Every audit runs 34 scored checks across 6 categories, measuring how well AI search engines can find, extract, and cite your content.
Can AI crawlers find and access your content?
Checks whether AI bots can reach the page by parsing robots.txt, X-Robots-Tag, and meta robots.
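A minimal sketch of the robots.txt side of this check, using Python's standard-library parser. The bot names listed are illustrative examples of AI crawlers; a full audit would also inspect X-Robots-Tag headers and meta robots tags.

```python
# Sketch: check whether common AI crawlers may fetch a page per robots.txt.
# The AI_BOTS list is an assumption, not an exhaustive registry.
from urllib.robotparser import RobotFileParser

AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]

def ai_bot_access(robots_txt: str, page_url: str) -> dict:
    """Return {bot_name: allowed} for a robots.txt body and a target URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, page_url) for bot in AI_BOTS}

rules = """User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
# GPTBot is blocked outright; the other bots fall through to the wildcard rule.
print(ai_bot_access(rules, "https://example.com/post"))
```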
Compares raw HTML vs browser-rendered HTML to measure JavaScript dependency.
Validates canonical tags, robots directives, and crawler access.
Checks whether critical GEO signals (H1, structured data, key passages, and entity mentions) appear within Google's 2 MB HTML processing window. Pages exceeding this limit risk having their most important AI-readable content cut off before crawlers finish parsing.
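The core of this check can be sketched in a few lines: truncate the raw HTML at the 2 MB boundary and test whether each signal still appears. The signal markers below (`<h1`, a JSON-LD script type) are simplified stand-ins for the full signal set.

```python
# Sketch: do key signals survive truncation at Google's 2 MB HTML limit?
# Signal markers are illustrative substrings, not a complete detector.
GOOGLE_HTML_LIMIT = 2 * 1024 * 1024  # 2 MB, in bytes

def signals_within_limit(html: str,
                         signals=("<h1", "application/ld+json")) -> dict:
    head = html.encode("utf-8")[:GOOGLE_HTML_LIMIT]
    visible = head.decode("utf-8", errors="ignore")  # drop any split character
    return {s: s in visible for s in signals}

# An H1 pushed past 2 MB of markup is invisible to a size-limited parser.
page = "<p>filler</p>" * 200_000 + "<h1>Buried heading</h1>"
print(signals_within_limit(page))
```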
Counts how many of your owned domains appear in Google results for your brand.
Is your content structured for AI extraction and citation?
Evaluates heading structure quality for AI extractability.
Checks whether content leads with conclusions (answer-first format).
Measures data-richness of content: facts per 100 words.
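One crude way to approximate "facts per 100 words" is to count fact-like tokens, such as numbers, percentages, years, and currency amounts, per 100 words. The regex below is an assumption for illustration, not the audit's actual pattern set.

```python
import re

# Sketch: data-density as fact-like tokens per 100 words.
# FACT_PATTERN is a deliberately simple proxy (numbers, %, $ amounts).
FACT_PATTERN = re.compile(r"\$?\d[\d,.]*%?")

def facts_per_100_words(text: str) -> float:
    words = text.split()
    if not words:
        return 0.0
    facts = FACT_PATTERN.findall(text)
    return 100 * len(facts) / len(words)

dense = "Revenue grew 34% to $2.1 million in 2024, up from $1.6 million."
print(round(facts_per_100_words(dense), 1))
```

A generic marketing sentence with no numbers would score 0, while the data-rich example above scores far higher.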
Checks whether paragraphs are citable in isolation.
Checks whether HTML converts cleanly to Markdown.
Measures the ratio of AI-visible content to total page content. Hero metric: 'AI sees X% of your page.'
Detects critical data trapped in HTML tables without prose restatement. Trafilatura degrades tables during extraction; jusText removes them outright. Tables without prose fallbacks are high-risk content under AI extraction pipelines.
Measures Flesch-Kincaid grade level and sentence complexity.
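The Flesch-Kincaid grade formula itself is standard: 0.39 × (words/sentences) + 11.8 × (syllables/words) − 15.59. The syllable counter below is a rough vowel-group heuristic rather than a dictionary lookup, so treat it as a sketch.

```python
import re

def count_syllables(word: str) -> int:
    # Heuristic: each run of vowels approximates one syllable.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade level: 0.39*(W/S) + 11.8*(Syl/W) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words) - 15.59)

sample = "The cat sat on the mat. It was warm."
print(round(fk_grade(sample), 2))
```

Very simple prose can legitimately score below grade 0; long sentences with polysyllabic words push the grade up.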
Measures content substance: word count, content ratio, section coverage.
Does the AI knowledge graph recognize and trust your brand?
Checks Wikidata entity presence and richness.
Evaluates Experience, Expertise, Authoritativeness, and Trustworthiness signals.
Checks consistency of brand entity references within the page (title, headings, body, meta). Does not verify cross-platform consistency.
Detects whether E-E-A-T signals (author credentials, expertise, affiliations) appear in article prose where AI extraction preserves them, or only in bio blocks that extraction pipelines discard (DR-097 Section 4c).
Measures whether internal links point to topically related content using anchor text and URL slug overlap as a proxy signal.
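The overlap proxy can be sketched as Jaccard similarity between the tokens of the anchor text and the tokens of the target URL's slug. This is an illustrative simplification of the check, not its exact scoring.

```python
import re

def tokens(s: str) -> set:
    return set(re.findall(r"[a-z0-9]+", s.lower()))

def anchor_slug_overlap(anchor: str, href: str) -> float:
    """Jaccard overlap between anchor-text tokens and URL slug tokens."""
    slug = href.rstrip("/").rsplit("/", 1)[-1].replace("-", " ")
    a, b = tokens(anchor), tokens(slug)
    return len(a & b) / len(a | b) if a | b else 0.0

print(anchor_slug_overlap("pricing plans", "/docs/pricing-plans"))  # 1.0
print(anchor_slug_overlap("click here", "/docs/pricing-plans"))     # 0.0
```

Generic anchors like "click here" score zero against any slug, which is exactly the signal loss this check flags.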
Queries major AI providers to check how a brand appears in AI-generated responses.
Is your content formatted so AI can cite it accurately?
Validates title tag, meta description, viewport, charset, and lang.
Validates structured data (JSON-LD, Microdata, RDFa) for AI system comprehension.
Validates Open Graph and Twitter Card tags.
Detects content age via dateModified and datePublished metadata.
Measures content density and vocabulary diversity, and detects repeated blocks.
Baseline technical signals that support AI visibility.
Evaluates internal/external link quality and anchor text.
Checks alt text coverage/quality, dimensions, and loading optimization.
Checks HTML-level performance: size, resource hints, blocking resources.
Validates HTTPS, HSTS, CSP, and other security headers. While not directly related to AI visibility, security headers contribute to overall site trust signals that search engines evaluate.
Validates URL length, slug quality, query params, path depth, and trailing slash consistency.
Evaluates accessibility signals: lang, landmarks, skip nav, form labels.
Detects HTTP redirect chains, misused redirect types, and normalization hops (www, trailing slash) that could be collapsed into a single redirect.
Checks FineWeb-style quality indicators that determine whether a page would survive AI training data filters. Evaluates terminal punctuation ratio, line length distribution, duplicate line ratio, and prose density.
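Two of those indicators are easy to sketch directly: the share of lines ending in terminal punctuation, and the share of duplicate lines. The thresholds real training-data filters apply vary, so this shows the metrics only, not pass/fail cutoffs.

```python
# Sketch: FineWeb-style line heuristics on plain extracted text.
def line_quality(text: str) -> dict:
    lines = [l.strip() for l in text.splitlines() if l.strip()]
    if not lines:
        return {"terminal_punct_ratio": 0.0, "duplicate_line_ratio": 0.0}
    terminal = sum(l.endswith((".", "!", "?", '"')) for l in lines)
    dupes = len(lines) - len(set(lines))  # lines repeated verbatim
    return {
        "terminal_punct_ratio": terminal / len(lines),
        "duplicate_line_ratio": dupes / len(lines),
    }

text = "A full sentence ends here.\nRead more\nRead more\nAnother sentence."
print(line_quality(text))
```

Boilerplate fragments like repeated "Read more" lines drag down the terminal-punctuation ratio and raise the duplicate ratio, both of which correlate with removal by quality filters.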
Forward-looking signals tracked for informational value.
Measures the ratio of named entities (people, brands, products, places) in your content. Research across 18,000 verified AI citations found entity-dense content is selected at significantly higher rates than generic prose.
Measures where your citable signals (definitions, entities, data points) concentrate across your content. Research based on 3 million ChatGPT responses found 44.2% of citations reference the first 30% of a page.
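A sketch of the position measurement: find each citable signal's character offset, normalize by document length, and report the fraction landing in the first 30%. The signal regexes below are illustrative placeholders.

```python
import re

# Illustrative signal patterns: definitions, percentages, four-digit years.
SIGNALS = [r"\b\w+ is defined as\b", r"\b\d[\d,.]*%", r"\b(19|20)\d\d\b"]

def early_signal_share(text: str, cutoff: float = 0.3) -> float:
    """Fraction of signal matches located in the first `cutoff` of the text."""
    positions = [m.start() / len(text)
                 for pat in SIGNALS for m in re.finditer(pat, text)]
    if not positions:
        return 0.0
    return sum(p < cutoff for p in positions) / len(positions)

text = ("GEO is defined as generative engine optimization. "
        + "filler " * 60
        + "Adoption reached 44% in 2024.")
print(early_signal_share(text))
```

Here only the opening definition lands in the first 30%, while the statistics sit at the end, the pattern this check would flag.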
Detects clear definitional patterns ('X is,' 'X refers to,' 'X is defined as') that AI systems can extract and attribute with confidence. Content with definitional language is cited at roughly twice the rate of content without it.
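Detecting those patterns reduces to a handful of regexes. The list below is a rough approximation of the phrasing families named above, not the audit's full pattern set.

```python
import re

# Sketch: definitional phrasing AI systems can extract and attribute.
DEFINITION_PATTERNS = [
    r"\b[A-Z][\w-]* (is|are) (a|an|the)\b",  # "X is a ..."
    r"\brefers to\b",
    r"\bis defined as\b",
]

def has_definition(sentence: str) -> bool:
    return any(re.search(p, sentence) for p in DEFINITION_PATTERNS)

print(has_definition("GEO is a content strategy for AI search."))  # True
print(has_definition("We love helping brands grow."))              # False
```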
Estimates whether your content tone falls in the optimal range for AI citation. Research found content with balanced subjectivity (~0.47, similar to industry analysis) is cited more often than purely promotional or purely dry content.
Checks for an llms.txt file at your site root, the emerging standard for giving AI systems a structured guide to your key pages. Over 844,000 websites have adopted it, though no major AI provider has confirmed parsing it in production.
Checks whether your site provides complete documentation as a single AI-ingestion file. AI coding assistants like Cursor already read llms-full.txt to answer developer questions without crawling page by page.
Detects the CMS, framework, and key plugins powering the scanned page. Technology detection provides context for interpreting your other results rather than being a signal to optimize directly.
Enter any URL. Get your score in 60 seconds. Free.
Run an AI Visibility Screening