Altor-lab/gen_llmtxt

AI SEO Optimizer - llms.txt Checker & Generator

A serverless AWS Lambda function that analyzes websites for AI optimization by checking, validating, and generating llms.txt files according to the llms.txt specification.

What It Does

This service helps make a website easier for AI models to understand, with the aim of improving how the site is represented in AI-generated responses. It does this by:

  1. Checking if a website has an llms.txt file
  2. Validating existing llms.txt files for compliance and quality
  3. Generating properly formatted llms.txt files for websites that don't have one
  4. Providing actionable recommendations for AI SEO optimization

Architecture

┌─────────────┐
│   Client    │
└──────┬──────┘
       │ POST /analyze
       │ {"url": "https://example.com"}
       ▼
┌─────────────────────┐
│   API Gateway       │
└──────┬──────────────┘
       │
       ▼
┌─────────────────────┐
│  Lambda Function    │
│  ┌───────────────┐  │
│  │ llms.txt      │  │
│  │ Checker       │  │
│  └───────┬───────┘  │
│          │          │
│  ┌───────▼───────┐  │
│  │ Validator     │  │ (if exists)
│  └───────────────┘  │
│          │          │
│  ┌───────▼───────┐  │
│  │ Crawler       │  │ (if not exists)
│  └───────┬───────┘  │
│          │          │
│  ┌───────▼───────┐  │
│  │ Generator     │  │
│  └───────────────┘  │
└─────────┬───────────┘
          │
          ▼
    ┌─────────┐
    │ Response│
    └─────────┘
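The branching in the diagram can be sketched as a single routing function. This is a minimal illustration, not the repository's actual code: the `analyze` function and its four injected callables are hypothetical names, and only the response keys are taken from the Response Format section of this README.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit, urlunsplit


def analyze(
    url: str,
    fetch_llms_txt: Callable[[str], Optional[str]],  # hypothetical: returns file text or None
    validate: Callable[[str], dict],                 # hypothetical validator
    crawl: Callable[[str], dict],                    # hypothetical crawler
    generate: Callable[[dict], str],                 # hypothetical generator
) -> dict:
    """Route a request down the 'exists' or 'does not exist' branch."""
    parts = urlsplit(url)
    llms_txt_url = urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))
    result = {"url": url, "llms_txt_url": llms_txt_url}

    content = fetch_llms_txt(llms_txt_url)
    if content is not None:
        # Branch 1: the file exists, so validate it.
        result["llms_txt_exists"] = True
        result["validation"] = validate(content)
        result["current_llms_txt"] = content
    else:
        # Branch 2: no file, so crawl the site and generate a draft.
        site_info = crawl(url)
        result["llms_txt_exists"] = False
        result["site_info"] = site_info
        result["generated_llms_txt"] = generate(site_info)
    return result
```

Passing the four steps in as callables keeps the routing logic testable without network access.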

Features

llms.txt Checking

  • Automatically detects if a website has an llms.txt file
  • Checks the standard location (/llms.txt)
  • Reports file size and content type
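As a rough sketch of the check described above, the standard location can be derived from any page URL on the site, and the file size and content type read from the response. The function names, the `fetch` callable, and the `size_bytes` and `content_type` keys are assumptions made for illustration. The README does not show the real field names.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urlsplit, urlunsplit

# Assumed fetcher contract: (status_code, content_type, body), or None on network failure.
FetchResult = Optional[Tuple[int, str, bytes]]


def llms_txt_url(site_url: str) -> str:
    """Build the standard /llms.txt location at the root of the site's origin."""
    parts = urlsplit(site_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {site_url!r}")
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


def check_llms_txt(site_url: str, fetch: Callable[[str], FetchResult]) -> dict:
    """Report whether /llms.txt exists and, if so, its size and content type."""
    target = llms_txt_url(site_url)
    response = fetch(target)
    if response is None or response[0] != 200:
        return {"llms_txt_exists": False, "llms_txt_url": target}
    _status, content_type, body = response
    return {
        "llms_txt_exists": True,
        "llms_txt_url": target,
        "size_bytes": len(body),       # hypothetical field name
        "content_type": content_type,  # hypothetical field name
    }
```

In a real deployment `fetch` would wrap an HTTP client call. It is injected here so the sketch runs without network access.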

Validation

  • Validates against the official llms.txt specification
  • Checks for required H1 title
  • Verifies blockquote summary
  • Validates H2 section structure
  • Checks markdown link formatting
  • Provides a quality score (0-100)
  • Lists specific issues and warnings
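The checks listed above could look roughly like the following. This is a simplified sketch, not the project's validator: the regular expressions, the messages, and especially the scoring weights (40 points per issue, 15 per warning) are assumptions chosen for illustration. The result keys mirror the `validation` object shown in the Response Format section.

```python
import re

# A list item that starts with a markdown link: "- [name](url)".
LINK_ITEM = re.compile(r"^[-*]\s+\[[^\]]+\]\([^)\s]+\)")
LIST_ITEM = re.compile(r"^[-*]\s+")


def validate_llms_txt(text: str) -> dict:
    """Run basic structural checks and compute a 0-100 score."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    issues, warnings = [], []

    # The H1 title must be the first non-empty line.
    if not lines or not re.match(r"^# \S", lines[0]):
        issues.append("Missing H1 title at the top of the file")

    # A blockquote summary is expected but treated as non-fatal here.
    if not any(line.startswith(">") for line in lines):
        warnings.append("Missing blockquote summary")

    sections = [line[3:].strip() for line in lines if line.startswith("## ")]
    if not sections:
        warnings.append("No H2 sections found")

    links_count = sum(1 for line in lines if LINK_ITEM.match(line))
    if any(LIST_ITEM.match(l) and not LINK_ITEM.match(l) for l in lines):
        warnings.append("Some list items are not formatted as markdown links")

    # Assumed weights: the real scoring rules are not documented in this README.
    score = max(0, 100 - 40 * len(issues) - 15 * len(warnings))
    return {
        "valid": not issues,
        "score": score,
        "issues": issues,
        "warnings": warnings,
        "sections_found": sections,
        "links_count": links_count,
        "has_optional_section": "Optional" in sections,
    }
```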

Generation

  • Crawls website to extract key information
  • Identifies documentation and navigation links
  • Creates spec-compliant llms.txt content
  • Includes proper markdown formatting
  • Adds optional sections for supplementary resources
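To make the output format concrete, here is a minimal sketch of the formatting step only, assuming the crawler has already produced a title, a description, and links grouped by section. The `generate_llms_txt` signature is hypothetical and the crawling itself is out of scope for this example.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]  # (link text, absolute URL)


def generate_llms_txt(
    title: str, description: str, sections: Dict[str, List[Link]]
) -> str:
    """Render an H1 title, a blockquote summary, and H2 sections of links."""
    summary = " ".join(description.split())  # collapse newlines and extra spaces
    out = [f"# {title.strip()}", "", f"> {summary}"]
    for heading, links in sections.items():
        if not links:
            continue  # skip empty sections rather than emit a bare heading
        out += ["", f"## {heading}", ""]
        out += [f"- [{name}]({url})" for name, url in links]
    return "\n".join(out) + "\n"
```

Supplementary resources can be passed under an `"Optional"` key, which renders as a final `## Optional` section when it is listed last.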

Recommendations

  • Provides actionable improvement suggestions
  • Tailored advice based on validation results
  • Best practices for AI SEO optimization

Quick Start

Prerequisites

  • Python 3.11+

Installation

  1. Clone or download this project

  2. Install dependencies

pip install -r requirements.txt

  3. Test locally

python test_local.py

Deployment

Bash:

chmod +x deploy.sh
./deploy.sh [stack-name] [region]

PowerShell:

.\deploy.ps1 -StackName "aiseo-llmstxt-optimizer" -Region "us-east-1"

Usage

API Request

Endpoint: POST /analyze

Request Body:

{
  "url": "https://example.com",
  "options": {
    "max_pages": 20,
    "timeout": 10
  }
}

Parameters:

  • url (required): The website URL to analyze
  • options (optional):
    • max_pages: Maximum pages to crawl (default: 20)
    • timeout: Request timeout in seconds (default: 10)
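One way the request body and its defaults might be handled is sketched below. The `parse_request` function and its error messages are hypothetical. Only the parameter names and default values (`max_pages` 20, `timeout` 10) come from the list above.

```python
import json
from urllib.parse import urlsplit

DEFAULT_OPTIONS = {"max_pages": 20, "timeout": 10}


def parse_request(body: str) -> dict:
    """Validate the JSON body and fill in documented defaults."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError("request body is not valid JSON") from exc
    if not isinstance(payload, dict) or not isinstance(payload.get("url"), str):
        raise ValueError("missing required parameter: url")

    url = payload["url"]
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("url must be an absolute http(s) URL")

    options = payload.get("options") or {}
    if not isinstance(options, dict):
        raise ValueError("options must be an object")

    merged = dict(DEFAULT_OPTIONS)
    for key in DEFAULT_OPTIONS:
        if key in options:
            value = options[key]
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                raise ValueError(f"{key} must be a positive integer")
            merged[key] = value
    return {"url": url, **merged}
```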

Response Format

When llms.txt EXISTS:

{
  "url": "https://example.com",
  "llms_txt_exists": true,
  "llms_txt_url": "https://example.com/llms.txt",
  "validation": {
    "valid": true,
    "score": 85,
    "issues": [],
    "warnings": ["Some H2 sections may not have properly formatted link lists"],
    "sections_found": ["Documentation", "Examples"],
    "links_count": 5,
    "has_optional_section": true
  },
  "current_llms_txt": "# Example Site\n> Description...",
  "recommendations": [
    "Consider adding more documentation links",
    "Ensure all linked pages have .md versions"
  ]
}

When llms.txt DOES NOT EXIST:

{
  "url": "https://example.com",
  "llms_txt_exists": false,
  "llms_txt_url": "https://example.com/llms.txt",
  "generated_llms_txt": "# Example Site\n> A great website...\n\n## Documentation\n- [Guide](https://example.com/guide)",
  "site_info": {
    "title": "Example Site",
    "description": "A great website for examples",
    "documentation_links_found": 3,
    "navigation_links_found": 8
  },
  "recommendations": [
    "Create an llms.txt file at the root of your website",
    "Include clear descriptions for each linked resource"
  ]
}
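Because the two response shapes carry different keys, a client needs to branch on `llms_txt_exists` before reading `validation` or `site_info`. The helper below is a small illustrative sketch. The `summarize` name is hypothetical and it assumes the keys exactly as shown in the two examples above.

```python
def summarize(result: dict) -> str:
    """Return a one-line summary for either response shape."""
    target = result["llms_txt_url"]
    if result["llms_txt_exists"]:
        validation = result["validation"]
        status = "valid" if validation["valid"] else "invalid"
        return (
            f"{target}: {status}, score {validation['score']}/100, "
            f"{len(validation['issues'])} issue(s), "
            f"{len(validation['warnings'])} warning(s)"
        )
    info = result["site_info"]
    return (
        f"{target}: not found, draft generated for {info['title']!r} "
        f"from {info['documentation_links_found']} documentation link(s)"
    )


def llms_txt_content(result: dict) -> str:
    """Return the existing file if present, otherwise the generated draft."""
    if result["llms_txt_exists"]:
        return result["current_llms_txt"]
    return result["generated_llms_txt"]
```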

Example Usage

cURL:

curl -X POST https://your-api-gateway-url/analyze \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

Python:

import requests

response = requests.post(
    'https://your-api-gateway-url/analyze',
    json={'url': 'https://example.com'}
)

result = response.json()
print(result)

JavaScript:

fetch('https://your-api-gateway-url/analyze', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({url: 'https://example.com'})
})
.then(res => res.json())
.then(data => console.log(data));

Testing

Local Testing

python test_local.py

This runs multiple test scenarios:

  • Website with existing llms.txt
  • Website without llms.txt
  • Custom options
  • Invalid URL handling
  • Missing parameter handling

Manual Testing

python lambda_function.py

Configuration

Environment Variables

You can set these in template.yaml or serverless.yml:

  • LOG_LEVEL: Logging level (default: INFO)
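For reference, a function can honor `LOG_LEVEL` along the following lines. This is a sketch under assumptions: the logger name `llmstxt` and the fallback to INFO on an unrecognized value are illustrative choices, not confirmed behavior of this project.

```python
import logging
import os


def configure_logging() -> logging.Logger:
    """Set the logger level from LOG_LEVEL, defaulting to INFO."""
    level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
    level = getattr(logging, level_name, None)
    if not isinstance(level, int):
        level = logging.INFO  # unknown names fall back to the documented default
    logger = logging.getLogger("llmstxt")  # hypothetical logger name
    logger.setLevel(level)
    return logger
```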

Lambda Settings

  • Runtime: Python 3.11
  • Memory: 512 MB
  • Timeout: 30 seconds
