Making llms.txt Work for Headless Websites

Author: Tom Cranstoun
There's a critical disconnect hiding in plain sight: most AI crawlers can't see your JavaScript-rendered content. While humans experience rich, interactive websites, large language models encounter only empty shells, missing the very information they're meant to analyze. The llms.txt standard offers a solution, but because it was only proposed in September 2024, most AI systems don't know it exists yet.

Discover how a creative blend of traditional HTML meta tags and modern specifications bridges this gap, ensuring your content remains visible in an AI-mediated world without waiting years for AI systems to catch up.

llms.txt, CMS and AI Integration - The Final Piece of the Puzzle

llms.txt: A New Standard for AI-Readable Content

What is llms.txt?

The llms.txt standard, proposed by Jeremy Howard in September 2024 (https://llmstxt.org/#proposal), creates a standardized way for websites to communicate with Large Language Models (LLMs). Much as robots.txt guides search engines, llms.txt helps AI systems understand and navigate website content by providing curated information in markdown format that doesn't require JavaScript execution.

The Problem llms.txt Solves

When major AI systems scrape the web to build their knowledge bases, they face a critical limitation: they typically don't execute JavaScript. This creates a massive blind spot in their understanding of the modern internet.

What AI Scrapers See:

<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8" />
<link rel="icon" href="/favicon.ico" />
<meta name="viewport" content="width=device-width,initial-scale=1" />
<meta name="theme-color" content="#000000" />
<meta name="description" content="Tourist Guide" />
<title>York Tourist Guide</title>
<script defer="defer" src="/static/js/main.ad20c640.js"></script>
<link href="/static/css/main.b85f4e74.css" rel="stylesheet">
</head>
<body>
<noscript>You need to enable JavaScript to run this app.</noscript>
<div id="root"></div>
</body>
</html>

Just a shell with no meaningful content.

What Human Users See:

A beautiful presentation about York tourism with:

All this content is generated by JavaScript after the initial page load—completely invisible to most AI scrapers.
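To make the gap concrete, here is a minimal Python sketch of what a crawler that does not execute JavaScript can read from a page. It uses only the standard library's html.parser; the class name, the function name, and the shortened SPA_SHELL sample are illustrative assumptions, not part of any real crawler.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects the text a non-JavaScript crawler could read from raw HTML."""

    # Script and style bodies are code, not page content, so they are skipped.
    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return all readable text in the markup, joined by single spaces."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Shortened version of the SPA shell shown above (illustrative sample).
SPA_SHELL = """<!doctype html><html lang="en"><head><meta charset="utf-8" />
<meta name="description" content="Tourist Guide" />
<title>York Tourist Guide</title>
<script defer="defer" src="/static/js/main.ad20c640.js"></script></head>
<body><noscript>You need to enable JavaScript to run this app.</noscript>
<div id="root"></div></body></html>"""

print(visible_text(SPA_SHELL))
```

For the shell above, the only text recovered is the page title and the noscript notice; none of the tourism content is present in the markup at all.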

Key Advantages of the llms.txt Standard

  1. Bypasses JavaScript Limitations: Provides clean, static markdown content that doesn't require JavaScript execution.
  2. Content Curation: Allows website owners to curate the most relevant information.
  3. Context Window Optimization: Addresses the reality that most websites are too large to fit in AI context windows.
  4. Human and Machine Readable: Uses markdown format that's easily understood by both humans and machines.
  5. Structured Yet Flexible: Follows a consistent format while allowing customization for different website types.
  6. Compatible with Existing Standards: Works alongside robots.txt and sitemap.xml rather than replacing them.
  7. Explicit Prioritization: The "Optional" section helps AIs make smart decisions when context space is limited.
  8. Supports External References: Can include URLs to external sites that provide helpful context.
  9. Resource-Efficient: Doesn't require the computational resources needed to run headless browsers.
  10. Inference-Time Optimization: Helps AI systems access information at the moment a user needs it.
  11. Developer-Friendly: Easy to implement with growing ecosystem support.
  12. Consistent Processing: Enables predictable parsing through a standardized format.
  13. Comprehensive Guidance: Includes access control rules, attribution requirements, and privacy considerations.
  14. Version Control Support: Allows websites to maintain a clear history of AI interaction policies.
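As an illustration of points 3 and 7, the following Python sketch shows one way a consumer might use the "Optional" section when context space is limited: core sections are filled first, and optional links are added only if budget remains. The Link type, the select_links function, and the token estimates are hypothetical; the proposal does not prescribe any particular selection algorithm.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    url: str
    est_tokens: int  # hypothetical size estimate for the linked document


def select_links(sections: dict, budget: int) -> list:
    """Pick links that fit the token budget, treating "Optional" as lowest priority."""
    # Core sections keep their file order; "Optional" is always considered last.
    ordered = [name for name in sections if name != "Optional"]
    if "Optional" in sections:
        ordered.append("Optional")

    chosen = []
    remaining = budget
    for name in ordered:
        for link in sections[name]:
            if link.est_tokens <= remaining:
                chosen.append(link)
                remaining -= link.est_tokens
    return chosen


# Illustrative data only; the token counts are made up for the example.
example_sections = {
    "Primary Attractions": [
        Link("York Minster", "https://yorktourism-example.com/attractions/york-minster.md", 400),
        Link("The Shambles", "https://yorktourism-example.com/attractions/shambles.md", 300),
    ],
    "Optional": [
        Link("Dining Guide", "https://yorktourism-example.com/food/restaurants.md", 500),
        Link("History of York", "https://yorktourism-example.com/about/history.md", 200),
    ],
}

for link in select_links(example_sections, budget=1000):
    print(link.title)
```

With a budget of 1000 estimated tokens, both primary attractions fit, the Dining Guide is skipped as too large, and the shorter History of York page still fits in the remaining space.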

Implementation Example: York Tourism Website

# York Tourism Example Guide

> Website for Tourism in York, providing information about historical attractions, accommodations, events, and visitor services for tourists visiting York, England.

> Last updated: April 2025. Contact: info@yorktourism-example.com

Site Type: Content-Driven, Informational
Purpose: Tourism Information and Services
Technology Stack: JavaScript-based SPA with dynamic content loading

## Access Guidelines

- Base Rate: 100 requests per hour per IP
- Maximum 10 requests per minute
- Cache for maximum 24 hours
- Commercial use requires permission via partnerships@yorktourism-example.com
- Attribution format: "Source: York Tourism (yorktourism-example.com)"

## Primary Attractions
- [York Minster](https://yorktourism-example.com/attractions/york-minster.md): A magnificent Gothic cathedral with centuries of history and breathtaking architecture, built between the 13th and 15th centuries.
- [The Shambles](https://yorktourism-example.com/attractions/shambles.md): A picturesque medieval street with overhanging buildings and quaint shops, once home to butchers but now housing boutique stores.
- [York City Walls](https://yorktourism-example.com/attractions/city-walls.md): Ancient fortifications circling the city, offering scenic walks and historical insights dating back to Roman times.
- [Jorvik Viking Centre](https://yorktourism-example.com/attractions/jorvik.md): An interactive museum showcasing York's Viking heritage through reconstructions and artifacts discovered during the Coppergate excavations.
- [National Railway Museum](https://yorktourism-example.com/attractions/railway-museum.md): A world-class museum celebrating Britain's railway heritage with an impressive collection of locomotives and carriages, including the Mallard and Flying Scotsman.

## Planning Your Visit

- [Events Calendar](https://yorktourism-example.com/events/calendar.md): Comprehensive listing of festivals, exhibitions, and events happening in York throughout the year.
- [Accommodation Guide](https://yorktourism-example.com/stay/accommodations.md): Listings and reviews of hotels, B&Bs, and self-catering options across York.
- [Travel Information](https://yorktourism-example.com/visit/travel.md): Details on reaching York by train, car, or bus, and navigating the city upon arrival.

## Optional
- [Dining Guide](https://yorktourism-example.com/food/restaurants.md): Reviews and recommendations for restaurants, cafes, and pubs in York.
- [Shopping Directory](https://yorktourism-example.com/shop/directory.md): Information about shopping districts, markets, and unique local stores.
- [Guided Tours](https://yorktourism-example.com/tours/guided.md): Details about walking tours, ghost tours, and specialized guided experiences.
- [Visitor Reviews](https://yorktourism-example.com/reviews/visitor-experiences.md): Testimonials and experiences shared by previous visitors to York.
- [History of York](https://yorktourism-example.com/about/history.md): Detailed historical timeline covering Roman, Viking, Medieval, and Victorian York.

## Error Handling
If documentation is unavailable:
1. Cache error details including timestamp
2. Wait 30 minutes before retrying
3. Check status at status.yorktourism-example.com
4. Alternative resource: visitbritain.com/york

## For Human Visitors
Looking for our main website? You'll find it at yorktourism-example.com

Need help planning your visit? Contact our visitor center at info@yorktourism-example.com

Curious about this file? Learn more about how AI uses this information at llmstxt.org
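A well-behaved client would need to honor the Access Guidelines in the example above (at most 10 requests per minute and 100 per hour). The Python sketch below shows one possible approach, a sliding-window limiter. The class name and the idea of passing the current time in explicitly are assumptions made for illustration and testability; in practice a caller would likely pass time.monotonic().

```python
from collections import deque


class SlidingWindowLimiter:
    """Allows a request only if every (max_requests, window_seconds) limit holds."""

    def __init__(self, limits):
        self.limits = list(limits)
        self.timestamps = deque()  # times of previously allowed requests

    def allow(self, now: float) -> bool:
        # Forget requests older than the longest window; they no longer count.
        longest = max(window for _, window in self.limits)
        while self.timestamps and now - self.timestamps[0] >= longest:
            self.timestamps.popleft()

        # Reject if any window is already full.
        for max_requests, window in self.limits:
            recent = sum(1 for t in self.timestamps if now - t < window)
            if recent >= max_requests:
                return False

        self.timestamps.append(now)
        return True


# Limits taken from the example file: 10 per minute, 100 per hour.
limiter = SlidingWindowLimiter([(10, 60.0), (100, 3600.0)])

# Twelve requests, one per second: the first ten pass, the last two are refused.
burst = [limiter.allow(now=float(second)) for second in range(12)]
print(burst)
```

A refused request is not recorded, so a client can simply wait and try again once the oldest request has aged out of the window.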

What's in an llms.txt File

Required Elements

  1. Title (H1): Must be the first element and should clearly identify your site or organization
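  2. Summary (Blockquote): A concise description of your site and content with key information. The proposal itself marks only the H1 as strictly required, but the summary gives an AI system its first orientation, so treat it as essential.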

Optional Sections

  1. Documentation Links (H2): Primary documentation and resources in the format [Link Name](URL): Brief description
  2. Optional Links (H2): Secondary resources that can be skipped if context is limited
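The structure described above is simple enough to parse with a few lines of code. The following Python sketch is one possible reader, not a reference implementation: the function name, the returned dictionary layout, and the regular expression are assumptions, and it only handles the elements listed here (H1 title, blockquote summary, H2 sections with link lists).

```python
import re

# Matches "- [Link Name](URL): Brief description"; the description is optional.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<description>.*))?$"
)


def parse_llms_txt(text: str) -> dict:
    """Split an llms.txt document into title, summary lines, and linked sections."""
    result = {"title": None, "summary": [], "sections": {}}
    current_section = None

    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith("## "):
            current_section = line[3:].strip()
            result["sections"][current_section] = []
        elif line.startswith(">") and current_section is None:
            # Blockquote lines before the first H2 form the summary.
            result["summary"].append(line[1:].strip())
        elif current_section is not None:
            match = LINK_RE.match(line)
            if match:
                result["sections"][current_section].append(match.groupdict())
    return result


# Abbreviated version of the York example, for illustration.
SAMPLE = """# York Tourism Example Guide

> Website for Tourism in York.

## Primary Attractions
- [York Minster](https://yorktourism-example.com/attractions/york-minster.md): A Gothic cathedral.

## Optional
- [Dining Guide](https://yorktourism-example.com/food/restaurants.md): Restaurants, cafes, and pubs.
"""

parsed = parse_llms_txt(SAMPLE)
print(parsed["title"])
print(list(parsed["sections"]))
```

Lines that are not links, such as the free-text rules in the example's "Access Guidelines" section, are ignored by this sketch; a fuller reader would need to decide how to keep them.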

Implementation Guidelines

Bridging the Gap: HTML Meta Tags for llms.txt

To accelerate adoption and address the headless browsing challenge, HTML meta tags can be implemented:

<meta name="llms-txt" content="llms.txt">
<meta name="llms-txt-description" content="A Website for York Tourism, providing information about historical attractions, accommodations, events, and visitor services for tourists visiting York, with links">

The entire llms.txt content could also be embedded directly in the HTML using a meta tag named llms-txt-content:

<!-- Embed the full llms.txt content -->
<meta name="llms-txt-content" content="# York Tourism Example Guide, > A Website for York Tourism... [rest of content]">

This approach ensures that:

  1. Even without JavaScript execution, AI systems can extract meaningful content
  2. The full structural information is available directly in the HTML
  3. Traditional web crawlers would pick up the essential content
  4. The organization's important information is never hidden behind JavaScript barriers
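On the consuming side, reading these tags needs no JavaScript at all. The Python sketch below shows how a crawler might collect them with the standard library's html.parser. The meta tag names are the ones suggested in this article and are not part of the llms.txt proposal; the class and function names are hypothetical, and treating the llms-txt value as a URL relative to the page is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LlmsMetaExtractor(HTMLParser):
    """Collects <meta> tags whose name starts with "llms-txt"."""

    def __init__(self):
        super().__init__()
        self.llms_meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.startswith("llms-txt"):
            self.llms_meta[name] = attributes.get("content") or ""


def extract_llms_meta(html: str) -> dict:
    """Return a mapping of llms-txt meta tag names to their content values."""
    parser = LlmsMetaExtractor()
    parser.feed(html)
    parser.close()
    return parser.llms_meta


# Illustrative page: the SPA shell with the two proposed meta tags added.
PAGE = """<!doctype html><html lang="en"><head>
<meta charset="utf-8" />
<meta name="llms-txt" content="llms.txt">
<meta name="llms-txt-description" content="A Website for York Tourism">
<title>York Tourist Guide</title></head>
<body><div id="root"></div></body></html>"""

meta = extract_llms_meta(PAGE)
# Assumption: the llms-txt value is resolved relative to the page URL.
llms_url = urljoin("https://yorktourism-example.com/", meta["llms-txt"])
print(meta)
print(llms_url)
```

The same extractor would also pick up an llms-txt-content tag if the site chose to embed the whole file.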

CMS and AI Integration

For llms.txt to achieve widespread adoption, content management systems (CMS) must be prepared to automatically generate the corresponding .md versions of content pages and provide support for integrating llms.txt into their workflows.
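As a rough sketch of what such CMS support could look like, the Python function below builds an llms.txt file from a list of page records. The Page type, its fields, and the convention of appending .md to the page path are hypothetical; a real CMS would supply its own content model and URL scheme.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    path: str  # hypothetical CMS path without a file extension
    description: str
    section: str  # H2 heading the page should be listed under


def build_llms_txt(site_title: str, summary: str, base_url: str, pages: list) -> str:
    """Render an llms.txt document: H1 title, blockquote summary, H2 link sections."""
    grouped = {}
    for page in pages:
        grouped.setdefault(page.section, []).append(page)

    # Keep sections in first-seen order, but always place "Optional" last.
    ordered = [name for name in grouped if name != "Optional"]
    if "Optional" in grouped:
        ordered.append("Optional")

    lines = [f"# {site_title}", "", f"> {summary}", ""]
    for name in ordered:
        lines.append(f"## {name}")
        lines.append("")
        for page in grouped[name]:
            url = f"{base_url.rstrip('/')}{page.path.rstrip('/')}.md"
            lines.append(f"- [{page.title}]({url}): {page.description}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Illustrative records; a CMS would read these from its content store.
pages = [
    Page("Dining Guide", "/food/restaurants", "Restaurants, cafes, and pubs.", "Optional"),
    Page("York Minster", "/attractions/york-minster", "A Gothic cathedral.", "Primary Attractions"),
]

llms_txt = build_llms_txt(
    "York Tourism Example Guide",
    "Website for Tourism in York.",
    "https://yorktourism-example.com",
    pages,
)
print(llms_txt)
```

Regenerating the file on every publish would keep it in step with the site without any manual editing.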

Adobe Edge Delivery Services (EDS) doesn't automatically create the llms.txt file itself, but it does provide critical supporting infrastructure that makes implementation significantly easier:

The AI Awareness Gap

It's important to understand that llms.txt is a newly proposed standard (introduced in September 2024), and most public AI systems are currently unaware of its existence. The knowledge cutoffs for today's leading LLMs typically predate this proposal.

This highlights challenges in the AI ecosystem:

  1. Catch-22 of adoption: AI systems won't recommend implementing llms.txt because they don't know about it
  2. Missing implementation knowledge: Future AI systems may lack practical implementation examples
  3. Innovation bottleneck: Delays between standard creation and AI awareness inhibit creative applications
  4. Feedback loop disruption: Knowledge cutoffs prevent vital feedback mechanisms

Real-World Applications

The llms.txt standard could dramatically improve AI understanding across various website types:

The Road Ahead

As the proposal gains traction, we're seeing growing directories of llms.txt files. For website owners, implementing this standard represents a low-effort way to make their content more accessible to the next generation of AI systems.

Here are a few directories that list the llms.txt files available on the web:

To see the original SPA with the meta tags in place, visit https://allabout.network/blogs/ddt/five-things-to-do-do-in-york

Thank you for reading
