koa-isbot

Modern Koa middleware for intelligent bot detection using the industry-standard isbot library. Detect search engines, crawlers, AI bots (ChatGPT, Claude, Perplexity), and thousands of other bots with TypeScript support and zero configuration.

✨ Features

  • 🤖 Comprehensive Bot Detection - Detects thousands of bots including Google, Bing, ChatGPT, Claude, Perplexity, and more
  • 📦 Zero Config - Works out of the box with sensible defaults
  • 🎯 TypeScript First - Full type safety and IntelliSense support
  • ⚡ High Performance - Built-in caching and optimizations
  • 🔧 Highly Customizable - Extend detection with custom patterns
  • 🛡️ Security Hardened - Input validation and DoS protection
  • 📊 Analytics Friendly - Callbacks for tracking and logging
  • 🌳 Tree-shakeable - ESM and CJS support with optimal bundling
  • ✅ Battle Tested - 100% test coverage with Vitest

📥 Installation

npm install @duyetdev/koa-isbot

🚀 Quick Start

import Koa from 'koa';
import { koaIsBot } from '@duyetdev/koa-isbot';

const app = new Koa();

// Add the middleware
app.use(koaIsBot());

// Use bot detection in your routes
app.use((ctx) => {
  if (ctx.state.isBot?.isBot) {
    console.log(`Bot detected: ${ctx.state.isBot.botName}`);
    // Serve static pre-rendered content
  } else {
    // Serve dynamic content
  }
});

app.listen(3000);

📖 API Documentation

koaIsBot(options?)

Creates a Koa middleware for bot detection.

Options

interface KoaIsBotOptions {
  /**
   * Custom bot patterns to add to detection
   * @example ['mybot', /customcrawler/i]
   */
  customPatterns?: (string | RegExp)[];

  /**
   * Patterns to exclude from bot detection
   * @example ['chrome-lighthouse']
   */
  excludePatterns?: string[];

  /**
   * Where to store the detection result in Koa context
   * @default 'isBot'
   */
  stateKey?: string;

  /**
   * Enable result caching for performance
   * @default true
   */
  cache?: boolean;

  /**
   * Maximum cache size (number of entries)
   * @default 1000
   */
  cacheSize?: number;

  /**
   * Cache TTL in milliseconds
   * @default 3600000 (1 hour)
   */
  cacheTTL?: number;

  /**
   * Callback when a bot is detected
   */
  onBotDetected?: (ctx: Context, result: BotDetectionResult) => void | Promise<void>;

  /**
   * Callback for all requests
   */
  onDetection?: (ctx: Context, result: BotDetectionResult) => void | Promise<void>;

  /**
   * Custom user agent extraction function
   * @default (ctx) => ctx.request.headers['user-agent']
   */
  getUserAgent?: (ctx: Context) => string | undefined;
}

Detection Result

The detection result is stored in ctx.state.isBot, or under the key given by stateKey if you set one:

interface BotDetectionResult {
  /** Whether a bot was detected */
  isBot: boolean;

  /** Name of the detected bot */
  botName: string | null;

  /** All detected bot patterns */
  botPatterns: string[];

  /** The user agent string analyzed */
  userAgent: string;
}
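
If a route can run before the middleware is registered, or if the state key is configurable, it can help to validate the value before using it. The following is a minimal sketch, not part of the package: the interface is copied from the section above, while isBotDetectionResult and readDetection are hypothetical helper names, and the state object is typed as a plain record instead of Koa's own types.

```typescript
// Shape copied from the Detection Result section of this README.
interface BotDetectionResult {
  isBot: boolean;
  botName: string | null;
  botPatterns: string[];
  userAgent: string;
}

// Hypothetical type guard: checks at runtime that a value has the documented shape.
function isBotDetectionResult(value: unknown): value is BotDetectionResult {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.isBot === 'boolean' &&
    (typeof v.botName === 'string' || v.botName === null) &&
    Array.isArray(v.botPatterns) &&
    v.botPatterns.every((p) => typeof p === 'string') &&
    typeof v.userAgent === 'string'
  );
}

// Hypothetical helper: read the result from a Koa-style state bag.
// Returns null when nothing valid is stored under the key,
// for example when the middleware did not run for this request.
function readDetection(
  state: Record<string, unknown>,
  stateKey: string = 'isBot',
): BotDetectionResult | null {
  const candidate = state[stateKey];
  return isBotDetectionResult(candidate) ? candidate : null;
}
```

In a route this would be called as readDetection(ctx.state), or readDetection(ctx.state, 'bot') when a custom stateKey is in use.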

💡 Usage Examples

Basic Usage

import Koa from 'koa';
import { koaIsBot } from '@duyetdev/koa-isbot';

const app = new Koa();
app.use(koaIsBot());

app.use((ctx) => {
  const { isBot, botName } = ctx.state.isBot;

  ctx.body = isBot
    ? `Hello ${botName}! Here's your optimized content.`
    : 'Hello human! Welcome to our site.';
});

Custom Bot Patterns

Add detection for your own bots or crawlers:

app.use(koaIsBot({
  customPatterns: [
    'my-internal-bot',
    /company-crawler/i,
    'monitoring-service'
  ]
}));
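
To reason about which of your custom patterns would fire for a given user agent, a small standalone matcher can be useful in tests. This is a sketch under stated assumptions, not the library's implementation: it assumes string patterns behave as case-insensitive substrings and RegExp patterns are tested as given. matchedPatterns is a hypothetical name.

```typescript
type Pattern = string | RegExp;

// Hypothetical helper: returns a label for every pattern that matches the user agent.
// Assumption: strings are case-insensitive substrings; the real middleware may differ.
function matchedPatterns(userAgent: string, patterns: Pattern[]): string[] {
  const lowered = userAgent.toLowerCase();
  const hits: string[] = [];
  for (const pattern of patterns) {
    if (typeof pattern === 'string') {
      if (lowered.includes(pattern.toLowerCase())) hits.push(pattern);
    } else {
      // Reset lastIndex so regexes with the g or y flag give stable results.
      pattern.lastIndex = 0;
      if (pattern.test(userAgent)) hits.push(pattern.source);
    }
  }
  return hits;
}

const patterns: Pattern[] = ['my-internal-bot', /company-crawler/i, 'monitoring-service'];
console.log(matchedPatterns('Company-Crawler/1.0 (+https://example.com)', patterns));
```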

Exclude Specific Bots

Exclude certain bots from detection (e.g., Lighthouse for performance testing):

app.use(koaIsBot({
  excludePatterns: ['chrome-lighthouse']
}));

Analytics & Logging

Track bot visits for analytics:

app.use(koaIsBot({
  onBotDetected: async (ctx, result) => {
    console.log(`🤖 Bot visit: ${result.botName} on ${ctx.path}`);

    // Send to analytics
    await analytics.track('bot_visit', {
      bot: result.botName,
      path: ctx.path,
      timestamp: new Date()
    });
  },

  onDetection: async (ctx, result) => {
    // Track all requests (bots and humans)
    await analytics.track('page_view', {
      isBot: result.isBot,
      path: ctx.path
    });
  }
}));
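
The callbacks above call an external analytics service, which can fail. This README does not say how the middleware treats a callback that throws or rejects, so a defensive wrapper is one option. The sketch below is an assumption-laden illustration: safeCallback is a hypothetical helper, and the callback type mirrors the onBotDetected and onDetection signatures with generic parameters in place of Koa's Context.

```typescript
// Same shape as the onBotDetected / onDetection options, made generic so the
// sketch does not depend on Koa's types.
type DetectionCallback<C, R> = (ctx: C, result: R) => void | Promise<void>;

// Hypothetical wrapper: runs the callback and reports failures to onError
// instead of letting them propagate into request handling.
function safeCallback<C, R>(
  callback: DetectionCallback<C, R>,
  onError: (error: unknown) => void = console.error,
): DetectionCallback<C, R> {
  return async (ctx, result) => {
    try {
      await callback(ctx, result);
    } catch (error) {
      onError(error);
    }
  };
}
```

It would be used as onBotDetected: safeCallback(async (ctx, result) => { ... }), so a tracking outage is logged and the request continues.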

Custom State Key

Store the result in a custom location:

app.use(koaIsBot({ stateKey: 'bot' }));

app.use((ctx) => {
  if (ctx.state.bot?.isBot) {
    // Access via custom key
  }
});

Performance Optimization

Configure caching for high-traffic applications:

app.use(koaIsBot({
  cache: true,
  cacheSize: 5000,    // Store up to 5000 unique user agents
  cacheTTL: 7200000   // 2 hours cache lifetime
}));
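
To make the meaning of cacheSize and cacheTTL concrete, here is a minimal sketch of a size-bounded cache with per-entry expiry and least-recently-used eviction. It illustrates the general technique only; the class name TtlLruCache and its internals are assumptions, and the package's actual cache may be implemented differently.

```typescript
// Illustrative LRU cache with TTL, keyed by user agent string.
class TtlLruCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxSize: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(maxSize: number, ttlMs: number, now: () => number = Date.now) {
    this.maxSize = maxSize;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Expired: drop the entry and report a miss.
      this.entries.delete(key);
      return undefined;
    }
    // A Map iterates in insertion order, so re-inserting marks the entry
    // as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    if (this.maxSize <= 0) return;
    this.entries.delete(key);
    if (this.entries.size >= this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get size(): number {
    return this.entries.size;
  }
}
```

With this model, cacheSize: 5000 corresponds to maxSize and cacheTTL: 7200000 to ttlMs. A larger size trades memory for fewer repeated detections on sites with many distinct user agents.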

SEO Optimization

Serve pre-rendered content to search engines:

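import { readFile } from 'node:fs/promises';

app.use(koaIsBot());

app.use(async (ctx) => {
  if (ctx.state.isBot?.isBot) {
    // Serve pre-rendered HTML for SEO (prerenderService stands in for your own renderer)
    ctx.body = await prerenderService.getPage(ctx.path);
    ctx.set('Cache-Control', 'public, max-age=3600');
  } else {
    // Serve the SPA shell to humans; a Buffer body needs an explicit content type
    ctx.type = 'html';
    ctx.body = await readFile('index.html');
  }
});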

Custom User Agent Extraction

Extract user agent from custom headers (useful with proxies):

app.use(koaIsBot({
  getUserAgent: (ctx) => {
    // Check custom header from proxy
    return ctx.get('X-Original-User-Agent') ||
           ctx.get('User-Agent') ||
           '';
  }
}));
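
An extractor like the one above is easy to unit test if it depends only on a header lookup. The sketch below assumes a minimal HeaderReader interface that mimics Koa's ctx.get (which returns an empty string for a missing header); extractUserAgent and the default header name are illustrative, not part of the package.

```typescript
// Minimal stand-in for the part of Koa's context used here.
interface HeaderReader {
  get(field: string): string;
}

// Hypothetical extractor: prefer the proxy-supplied header, fall back to
// User-Agent, and treat whitespace-only values as missing.
function extractUserAgent(
  ctx: HeaderReader,
  forwardedHeader: string = 'X-Original-User-Agent',
): string {
  const forwarded = ctx.get(forwardedHeader).trim();
  if (forwarded !== '') return forwarded;
  return ctx.get('User-Agent').trim();
}

// A fake context for tests: header names are looked up case-insensitively.
function fakeContext(headers: Record<string, string>): HeaderReader {
  return { get: (field) => headers[field.toLowerCase()] ?? '' };
}
```

It can be passed as getUserAgent: (ctx) => extractUserAgent(ctx). Only trust a forwarded header when the proxy in front of your app sets or strips it; otherwise clients can spoof it.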

🤖 Detected Bots

This middleware uses isbot which detects thousands of bots including:

Search Engines:

  • Googlebot, Bingbot, Yandex, Baidu, DuckDuckBot
  • Yahoo Slurp, Exabot, and many more

AI Assistants & Crawlers:

  • ChatGPT-User, GPTBot (OpenAI)
  • Claude-Web (Anthropic)
  • PerplexityBot
  • Applebot, Facebookbot

Social Media:

  • TwitterBot, LinkedInBot, TelegramBot
  • SlackBot, PinterestBot, WhatsApp

Development & Monitoring:

  • Postman, curl, wget
  • Pingdom, UptimeRobot, StatusCake
  • Chrome Lighthouse (optional)

And thousands more! See the full list.

🔄 Migration from v0.1.x

Version 2.0 is a complete rewrite with breaking changes. Here's how to migrate:

Before (v0.1.x)

const isBot = require('koa-isbot');

app.use(isBot());

app.use(async (ctx, next) => {
  console.log(ctx.isBot); // 'googlebot' or null
});

After (v2.0)

import { koaIsBot } from '@duyetdev/koa-isbot';

app.use(koaIsBot());

app.use((ctx) => {
  console.log(ctx.state.isBot); // BotDetectionResult object
  // {
  //   isBot: true,
  //   botName: 'googlebot',
  //   botPatterns: ['googlebot'],
  //   userAgent: '...'
  // }
});

Key Changes

  • ES Modules: Now uses ESM (with CJS support)
  • TypeScript: Full TypeScript support with types
  • Result Location: Moved from ctx.isBot to ctx.state.isBot
  • Result Format: Now returns detailed object instead of string/null
  • Detection: Uses isbot library (thousands of patterns vs. 15)
  • Features: Added caching, callbacks, customization
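
If existing code still reads ctx.isBot as a string or null, a small shim registered after koaIsBot() can bridge the two formats during migration. This is a sketch, not something the package provides: toLegacyIsBot, legacyIsBotShim, and the LegacyContext type are hypothetical, and the fallback label 'bot' for a detection without a name is an assumption.

```typescript
// Shape copied from the Detection Result section of this README.
interface BotDetectionResult {
  isBot: boolean;
  botName: string | null;
  botPatterns: string[];
  userAgent: string;
}

// Hypothetical context type covering only the fields this shim touches.
interface LegacyContext {
  state: { isBot?: BotDetectionResult };
  isBot?: string | null;
}

// Map the v2.0 result object to the v0.1.x value: a bot name or null.
function toLegacyIsBot(result: BotDetectionResult | undefined): string | null {
  if (result === undefined || !result.isBot) return null;
  // Assumption: keep the value truthy when a bot is detected without a name.
  return result.botName ?? 'bot';
}

// Koa-style middleware sketch: sets ctx.isBot, then continues the chain.
async function legacyIsBotShim(ctx: LegacyContext, next: () => Promise<void>): Promise<void> {
  ctx.isBot = toLegacyIsBot(ctx.state.isBot);
  await next();
}
```

Once all call sites read ctx.state.isBot directly, the shim can be removed.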

🏗️ Advanced Example

See examples/advanced.ts for a complete example with:

  • Custom patterns and exclusions
  • Analytics integration
  • Performance monitoring
  • Different content serving
  • Error handling
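
# Run the advanced example
npm install
npm run build
# Running a .ts file directly needs a Node.js version with built-in TypeScript
# support; otherwise use a TypeScript runner such as tsx
node examples/advanced.ts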

🧪 Testing

# Run tests
npm test

# Run tests with coverage
npm run test:coverage

# Run tests in watch mode
npm run test:watch

🛠️ Development

# Install dependencies
npm install

# Build the project
npm run build

# Type check
npm run type-check

# Lint
npm run lint

# Format code
npm run format

📊 Performance

  • Cached requests: < 1ms overhead
  • Uncached requests: < 5ms overhead
  • Memory: Automatic LRU cache eviction
  • Security: Input sanitization and length limits

🔒 Security

  • User agent strings are limited to 2048 characters to prevent DoS
  • Input is sanitized before regex matching
  • Cache size is bounded with LRU eviction
  • No eval() or unsafe operations
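
The first two points can be pictured as a normalization step that runs before any pattern matching. The sketch below is an illustration under assumptions: the 2048-character limit comes from the list above, but whether the middleware truncates or rejects longer strings, and what exactly it sanitizes, is not stated here. normalizeUserAgent is a hypothetical name.

```typescript
// Limit stated in the Security section of this README.
const MAX_USER_AGENT_LENGTH = 2048;

// Hypothetical normalizer: bound the input length, then drop control characters.
function normalizeUserAgent(raw: unknown): string {
  if (typeof raw !== 'string') return '';
  // Truncate first so the later steps always run on bounded input.
  const bounded = raw.slice(0, MAX_USER_AGENT_LENGTH);
  // Assumption: "sanitized" is modeled as removing ASCII control characters.
  return bounded.replace(/[\u0000-\u001f\u007f]/g, '').trim();
}
```

Bounding the length before running regular expressions keeps the matching cost predictable regardless of what a client sends.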

🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for details.

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes with tests
  4. Ensure all tests pass (npm test)
  5. Submit a pull request

📝 License

MIT License - Copyright (c) 2016-2025 Van-Duyet Le

See LICENSE for details.

🙏 Acknowledgments

  • Built on top of isbot by Omri Lotan
  • Inspired by the original koa-isbot concept
  • TypeScript types for Koa

📮 Support

🔗 Related Projects

  • isbot - The underlying bot detection library
  • Koa - Next generation web framework for Node.js
  • koa-useragent - User agent parser for Koa

Made with ❤️ by Van-Duyet Le
