claw-auto-router

A self-hosted, OpenAI-compatible LLM router for OpenClaw — automatically imports your provider/model configuration and routes each request to the best available model.

Primary use case: Discord → OpenClaw → claw-auto-router → best provider/model

Why this exists

OpenClaw lets you configure multiple LLM providers and run agents across them. But when you want a single "smart" endpoint that automatically picks the best model for each request — without duplicating configuration — you need a router.

claw-auto-router:

Reads your existing OpenClaw config (zero duplication)
Exposes an OpenAI-compatible API so OpenClaw treats it like a normal provider
Routes requests to the most suitable model based on content (tier-based heuristics or optional RouterAI classification + explicit assignments)
Falls back automatically when a provider fails
Tracks routing stats, estimated spend/savings, and active session overrides in a live dashboard
Lets users switch models or tiers mid-conversation with natural-language commands such as `use opus` or `prefer code`
Supports OpenClaw-native thinking overrides
Delegates all model calls back through the OpenClaw Gateway instead of reimplementing provider OAuth here

Architecture

flowchart TD
    A[Incoming Request] --> B{Model ID?}
    B -->|simple / medium / complex / reasoning| C[Forced Tier]
    B -->|auto| D[Heuristic Classifier\nor RouterAI]
    C --> E[Assigned Tier]
    D --> E
    E --> F[Pick Best Model for Tier]
    F --> G[Fallback Proxy]
    G --> H[OpenClaw Gateway]
    H --> I[LLM Provider]
    I -->|Stream| H
    H -->|Stream| G
    G -->|Stream| A

sequenceDiagram
    participant Discord
    participant OpenClaw as OpenClaw (Gateway)
    participant Router as claw-auto-router (port 43123)

    Discord->>OpenClaw: Send message
    OpenClaw->>Router: POST /v1/chat/completions
    Router->>Router: Classify prompt → pick tier & model
    Router->>OpenClaw: Forward to best model via Gateway
    OpenClaw-->>Router: Stream response
    Router-->>OpenClaw: Pipe response
    OpenClaw-->>Discord: Display response
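The Fallback Proxy step in the flowchart can be pictured as a loop over candidate models. The following is a minimal sketch, not the router's actual implementation: `callWithFallback`, `AllProvidersFailedError`, and the `call` callback are hypothetical names, and the `502` status is taken from the "All providers failed" entry in Troubleshooting.

```typescript
interface Attempt {
  model: string;
  error: string;
}

// Hypothetical error type; 502 mirrors the "All providers failed" response.
class AllProvidersFailedError extends Error {
  readonly status = 502;
  constructor(readonly attempts: Attempt[]) {
    super("All providers failed");
    this.name = "AllProvidersFailedError";
  }
}

// Try each candidate in order and return the first successful result.
// `call` stands in for the request forwarded through the OpenClaw Gateway.
async function callWithFallback<T>(
  candidates: string[],
  call: (model: string) => Promise<T>,
): Promise<{ model: string; result: T }> {
  const attempts: Attempt[] = [];
  for (const model of candidates) {
    try {
      return { model, result: await call(model) };
    } catch (error) {
      // Record the failure and move on to the next candidate.
      attempts.push({
        model,
        error: error instanceof Error ? error.message : String(error),
      });
    }
  }
  throw new AllProvidersFailedError(attempts);
}
```

Keeping the per-model failures on the error object is one way to feed the per-model failure counts that `/stats` reports.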

Routing tiers

Each request is classified into one of four tiers. By default this is done with deterministic heuristics; during claw-auto-router setup you can optionally enable RouterAI, which asks a dedicated model to choose the tier before routing. claw-auto-router then picks the best model for that tier:

Tier	Triggers	Preferred model traits
CODE	code fences, "implement/debug/refactor/function/class"	reasoning models, coders
COMPLEX	analysis keywords, messages > 2000 tokens	large context, reasoning
SIMPLE	short greetings, simple Q&A, < 200 tokens	fast, cheap
STANDARD	everything else	config order

Explicit tier assignments (set via claw-auto-router setup or router.config.json) always override automatic scoring.
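To make the trigger table concrete, here is a rough sketch of a heuristic request classifier. It is an illustration under stated assumptions, not the router's real scoring: the keyword lists, the 4-characters-per-token estimate, the "ends with a question mark" rule for simple Q&A, and the names `classifyTier` and `estimateTokens` are all hypothetical. Rules are checked in the same order as the table rows.

```typescript
type Tier = "SIMPLE" | "STANDARD" | "COMPLEX" | "CODE";

// Hypothetical keyword lists, loosely based on the triggers in the table.
const CODE_KEYWORDS = /\b(implement|debug|refactor|function|class)\b/i;
const COMPLEX_KEYWORDS = /\b(analy[sz]e|analysis|compare|evaluate)\b/i;
const GREETING = /^(hi|hello|hey|thanks|thank you)\b/i;
const CODE_FENCE = "`".repeat(3);

// Assumption: roughly 4 characters per token; a real tokenizer would differ.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function classifyTier(prompt: string): Tier {
  const text = prompt.trim();
  const tokens = estimateTokens(text);
  // CODE: code fences or coding verbs/nouns.
  if (text.includes(CODE_FENCE) || CODE_KEYWORDS.test(text)) return "CODE";
  // COMPLEX: analysis keywords or very long messages.
  if (tokens > 2000 || COMPLEX_KEYWORDS.test(text)) return "COMPLEX";
  // SIMPLE: short greetings or short questions.
  if (tokens < 200 && (GREETING.test(text) || text.endsWith("?"))) return "SIMPLE";
  // STANDARD: everything else.
  return "STANDARD";
}
```

Because every rule is a pure function of the prompt text, the same input always yields the same tier, which is the deterministic property the heuristic mode relies on.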

Heuristics vs RouterAI:

Heuristics are faster, deterministic, and add no extra model call. This is the safest default.
RouterAI can do better on ambiguous prompts, but every auto-routed request pays for one small classifier call first.
If RouterAI fails, claw-auto-router automatically falls back to heuristics for that request.
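The fallback behaviour described above could look roughly like this. It is a sketch only: `decideTier`, `withTimeout`, and the classifier callbacks are hypothetical names, and the 8000 ms default simply mirrors the `timeoutMs` value in the example `router.config.json`.

```typescript
type Tier = "SIMPLE" | "STANDARD" | "COMPLEX" | "CODE";
type Classifier = (prompt: string) => Tier;
type AsyncClassifier = (prompt: string) => Promise<Tier>;

interface Decision {
  tier: Tier;
  mode: "heuristic" | "ai";
}

// Reject if the classifier call takes longer than timeoutMs.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("RouterAI timed out")), timeoutMs);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Use RouterAI when it is configured; on any failure or timeout,
// fall back to the heuristic classifier for this request only.
async function decideTier(
  prompt: string,
  heuristic: Classifier,
  routerAI?: AsyncClassifier,
  timeoutMs = 8000,
): Promise<Decision> {
  if (!routerAI) return { tier: heuristic(prompt), mode: "heuristic" };
  try {
    const tier = await withTimeout(routerAI(prompt), timeoutMs);
    return { tier, mode: "ai" };
  } catch {
    return { tier: heuristic(prompt), mode: "heuristic" };
  }
}
```

Returning the mode alongside the tier is one way to support the classifier distribution shown on the dashboard.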

Setup wizard

During claw-auto-router setup, the wizard prompts you to assign a tier to any model that does not have one yet:

┌──────────────────────────────────────────────────────────────┐
│          claw-auto-router — Model Tier Setup Wizard          │
│  Assign each model to its best routing tier.                 │
│  Press Enter or type 5 to skip (heuristics decide).          │
└──────────────────────────────────────────────────────────────┘

  Model    : Kimi for Coding
  ID       : kimi-coding/k2p5
  Context  : 256k tokens   Reasoning: yes ✓

    1) SIMPLE     Fast, cheap — quick Q&A, one-liners, lookups
    2) STANDARD   General purpose — default routing for most tasks
    3) COMPLEX    Large context, deep reasoning — analysis, long docs
    4) CODE       Code generation, debugging, refactoring, PRs
    5) Skip  — use auto-heuristics

  Choice [1-5, Enter=skip]: 4
  ✓ Assigned to CODE

Assignments are saved to ~/.openclaw/router.config.json by default (or next to the config file you target with --config) and take effect immediately.

The setup wizard also asks whether you want to keep deterministic heuristics or enable RouterAI, and if you choose RouterAI it lets you pick the classifier model to use.

How OpenClaw config is imported

Config discovery order:

OPENCLAW_CONFIG_PATH env var
~/.openclaw/openclaw.json
~/.openclaw/moltbot.json
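As an illustration of the discovery order, the lookup can be sketched as a function that returns the first candidate that exists. The function name `resolveConfigPath` and the injected `exists` callback are hypothetical, and skipping an `OPENCLAW_CONFIG_PATH` that points at a missing file is an assumption; the real router may instead report an error in that case.

```typescript
// Returns the first config path that exists, following the documented order.
function resolveConfigPath(
  env: Record<string, string | undefined>,
  homeDir: string,
  exists: (path: string) => boolean,
): string | undefined {
  const candidates: Array<string | undefined> = [
    env["OPENCLAW_CONFIG_PATH"],
    `${homeDir}/.openclaw/openclaw.json`,
    `${homeDir}/.openclaw/moltbot.json`,
  ];
  for (const candidate of candidates) {
    // Skip an unset env var and any path that does not exist.
    if (candidate && exists(candidate)) return candidate;
  }
  return undefined;
}
```

In Node.js the arguments would typically be `process.env`, `os.homedir()`, and `fs.existsSync`; injecting them keeps the function easy to test without touching the file system.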

From the config it extracts:

models.providers.* — base URLs, API styles, model definitions
openclaw models list --json and {agentDir}/models.json — built-in provider/model registry (OpenRouter, GitHub Copilot, OpenAI Codex, MiniMax Portal, Google Antigravity, etc.)
openclaw models list --json — full model catalog with context window and capability metadata
agents.defaults.model.primary — top-priority model
agents.defaults.model.fallbacks — fallback chain order
agents.defaults.models.* — aliases

Execution path:

All providers run through the OpenClaw Gateway with a provider/model override
Built-in and OAuth-backed providers like OpenRouter, GitHub Copilot, OpenAI Codex, MiniMax Portal, Qwen Portal, and Google Antigravity stay on OpenClaw's auth/runtime path

API key resolution

Source	Resolution
Literal key in config	Used directly
`"xxx-oauth"` sentinel	Checks `{PROVIDER}_TOKEN` env var (e.g. `QWEN_PORTAL_TOKEN`)
No key in config	Checks `{PROVIDER_ID_UPPER}_API_KEY` env var
Not resolvable	Hidden from routing pool
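The table above can be read as a small decision function. The sketch below is an interpretation, not the router's code: `resolveApiKey` and `envPrefix` are hypothetical names, treating any configured value that ends in `-oauth` as the sentinel is an assumption, and the env var naming (upper-case with hyphens turned into underscores) is inferred from examples such as `QWEN_PORTAL_TOKEN` and `KIMI_CODING_API_KEY`.

```typescript
type KeyResolution =
  | { status: "resolved"; key: string; source: "config" | "env" }
  | { status: "hidden"; checkedEnvVar: string };

// "qwen-portal" becomes "QWEN_PORTAL".
function envPrefix(providerId: string): string {
  return providerId.toUpperCase().replace(/[^A-Z0-9]+/g, "_");
}

function resolveApiKey(
  providerId: string,
  configuredKey: string | undefined,
  env: Record<string, string | undefined>,
): KeyResolution {
  // Assumption: the OAuth sentinel is any configured value ending in "-oauth".
  const isOAuthSentinel = configuredKey !== undefined && configuredKey.endsWith("-oauth");
  // A literal key in the config is used directly.
  if (configuredKey && !isOAuthSentinel) {
    return { status: "resolved", key: configuredKey, source: "config" };
  }
  // Sentinel: look for {PROVIDER}_TOKEN; no key at all: look for {PROVIDER}_API_KEY.
  const envVar = `${envPrefix(providerId)}_${isOAuthSentinel ? "TOKEN" : "API_KEY"}`;
  const value = env[envVar];
  if (value) return { status: "resolved", key: value, source: "env" };
  // Not resolvable: the model is hidden from the routing pool.
  return { status: "hidden", checkedEnvVar: envVar };
}
```

Returning the name of the env var that was checked makes it easier to explain why a model is missing from the pool.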

Visibility rules:

Models appear in /v1/models when the OpenClaw Gateway is reachable
If the Gateway is down, models are hidden until it comes back

Current caveats for OpenClaw-backed execution:

Standard chat requests now flow through OpenClaw Gateway's OpenAI-compatible HTTP API, so messages, temperature, max_tokens, and SSE streaming stay on the native OpenClaw path
Conversation-level thinking overrides still fall back to the Gateway agent bridge until OpenClaw's HTTP chat endpoint exposes the same per-request thinking controls
Agent-bridge fallback still only supports data:image/...;base64,... URLs from the latest user turn

Quick start

Pick the path that matches your setup:

Use npm if you already have Node.js 20+
Use Docker if you do not want to install Node.js

Easiest install: npm

If you already use Node.js, the simplest install is a single npm command.

Install from npm

npm install -g claw-auto-router
claw-auto-router setup

Make sure your OpenClaw Gateway is running, because imported models cannot route without it:

openclaw gateway status

claw-auto-router setup automatically:

detects your active OpenClaw config via openclaw config file
imports the current OpenClaw model catalog, including built-in configured providers like OpenRouter, GitHub Copilot, OpenAI Codex, MiniMax Portal, and Google Antigravity
asks you to assign tiers to your current models
shows the current order inside each tier and lets you save explicit priority overrides
asks whether routing decisions should stay heuristic or use RouterAI, and lets you pick the classifier model
writes ~/.openclaw/router.config.json
updates your OpenClaw config to point claw-auto-router/auto at the local router
ensures gateway.http.endpoints.chatCompletions.enabled=true so the router can use OpenClaw's native OpenAI-compatible Gateway path
on macOS, installs and starts a launchd background service automatically

If you want to throw away previous claw-auto-router tier assignments and rebuild them from scratch, use:

claw-auto-router clean-setup

The npm package also installs a short alias:

clawr

Useful examples:

# Use an explicit OpenClaw config path
claw-auto-router setup --config ~/.openclaw/moltbot.json

# Rebuild existing claw-auto-router setup from scratch
claw-auto-router clean-setup

# Use a custom router port during setup
claw-auto-router setup --port 3001

# Check the background service on macOS
claw-auto-router service status

# Start or restart the background service manually
claw-auto-router service start
claw-auto-router service restart

See recent routing decisions and why they were chosen:

claw-auto-router logs --limit 20
claw-auto-router logs --json

Open the live dashboard:

open http://127.0.0.1:43123/dashboard

Background service management on macOS:

claw-auto-router service install
claw-auto-router service status
claw-auto-router service stop
claw-auto-router service uninstall

If you want the latest unreleased version straight from GitHub instead:

npm install -g github:yuga-hashimoto/claw-auto-router
claw-auto-router setup
claw-auto-router

No-Node install: Docker Compose

If you want clawr running without installing Node.js locally, use Docker.

What you need

Docker Desktop or Docker Engine + Docker Compose
Your OpenClaw config at ~/.openclaw/openclaw.json or ~/.openclaw/moltbot.json
Provider API keys only if they are not already stored in your OpenClaw config

1. Clone and start

git clone https://github.com/yuga-hashimoto/claw-auto-router.git
cd claw-auto-router
cp .env.example .env
docker compose up --build -d

docker-compose.yml mounts ~/.openclaw read-only and loads values from your local .env file automatically.

2. Add keys only if needed

Open .env and fill in only the provider keys that are missing from your OpenClaw config:

ZAI_API_KEY=
KIMI_CODING_API_KEY=
GOOGLE_API_KEY=
OPENROUTER_API_KEY=
NVIDIA_API_KEY=
QWEN_PORTAL_TOKEN=

Then restart:

docker compose restart

3. Verify it is up

curl http://localhost:43123/health
curl http://localhost:43123/v1/models

If /v1/models returns an empty list:

start or fix the OpenClaw Gateway for imported models
then reload the router config with POST /reload-config or restart the router service

Local install: Node.js + pnpm

Use this if you want local development, hot reload, or to modify the code.

What you need

Node.js 20+
pnpm
Your OpenClaw config at ~/.openclaw/openclaw.json or ~/.openclaw/moltbot.json

# Install dependencies
pnpm install

# Build the CLI once
pnpm build

# Run one-time setup against your OpenClaw config
pnpm start -- setup

# Or run the server directly during development
pnpm dev

The router starts on http://localhost:43123 and reads your OpenClaw config automatically. On macOS, setup also installs a launchd agent so the router can keep running in the background after setup.

For a production-style local run:

pnpm install        # also builds dist/ via prepare hook
pnpm start -- setup
pnpm start

Common scripts:

pnpm dev          # Dev server with hot reload
pnpm build        # Compile TypeScript
pnpm start        # Run compiled output
pnpm test         # Run all tests
pnpm typecheck    # Type-check

`router.config.json`

Optional claw-auto-router-specific settings.

Default path:

~/.openclaw/router.config.json
If you run claw-auto-router setup --config /path/to/openclaw.json, it writes /path/to/router.config.json

Example:

{
  "modelTiers": {
    "kimi-coding/k2p5": "CODE",
    "nvidia/qwen/qwen3.5-397b-a17b": "COMPLEX",
    "google/gemini-flash": "SIMPLE"
  },
  "tierPriority": {
    "CODE": ["kimi-coding/k2p5", "nvidia/qwen/qwen3.5-397b-a17b"],
    "SIMPLE": ["google/gemini-flash"]
  },
  "routerAI": {
    "mode": "ai",
    "model": "google/gemini-3-flash-preview",
    "timeoutMs": 8000
  },
  "dashboard": {
    "baselineModel": "openai-codex/gpt-5.4",
    "refreshSeconds": 5
  },
  "denylist": ["some-provider/bad-model"]
}

Field	Description
`modelTiers`	Explicit tier per model — overrides heuristic scoring. Set by setup wizard.
`tierPriority`	Preferred model order within each tier (explicit beats score). Setup wizard can write this too.
`routerAI`	Optional AI classifier for tier decisions. If it fails, routing falls back to heuristics automatically.
`dashboard`	Baseline model + refresh interval for `/dashboard` estimated spend/savings.
`denylist`	Models to exclude from routing

claw-auto-router setup also writes openClawIntegration metadata here so the router can remember your original OpenClaw primary/fallback chain without routing to itself.
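To show how `modelTiers`, `tierPriority`, and `denylist` might interact, here is a simplified sketch of candidate ordering for one tier. The function `candidatesForTier` and the `RouterConfig` interface are hypothetical, and the sketch leaves out heuristic scoring of unassigned models entirely; it only illustrates that explicit priority entries come first and denylisted models are dropped.

```typescript
type Tier = "SIMPLE" | "STANDARD" | "COMPLEX" | "CODE";

// Subset of router.config.json used by this sketch.
interface RouterConfig {
  modelTiers?: Record<string, Tier>;
  tierPriority?: Partial<Record<Tier, string[]>>;
  denylist?: string[];
}

// Order candidates for a tier: explicit tierPriority entries first,
// then the remaining models assigned to that tier, in pool order.
function candidatesForTier(tier: Tier, pool: string[], config: RouterConfig): string[] {
  const denied = new Set(config.denylist ?? []);
  const available = pool.filter((id) => !denied.has(id));
  const explicit = (config.tierPriority?.[tier] ?? []).filter((id) => available.includes(id));
  const assigned = available.filter(
    (id) => config.modelTiers?.[id] === tier && !explicit.includes(id),
  );
  return [...explicit, ...assigned];
}
```

With the example config above, the CODE tier would try `kimi-coding/k2p5` first and `nvidia/qwen/qwen3.5-397b-a17b` second, even though the latter is assigned to COMPLEX, because `tierPriority` lists it explicitly.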

Conversation controls

You can change routing in the middle of a conversation by sending a short user message. When claw-auto-router can identify a stable session (session_id, user, x-session-id, x-openclaw-thread-id, or a derived conversation fingerprint), the override is saved for that conversation until you clear it.

Examples:

use opus
use gpt-5.4
use auto again
prefer code
clear tier
thinking high
thinking off
reset routing

What these do:

use opus / use gpt-5.4 locks that conversation to a specific model
use auto again returns to normal auto-routing
prefer code forces the CODE tier for that conversation
thinking high enables a conversation-level thinking override
reset routing clears all conversation overrides at once
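A parser for these short commands might look like the following. This is a sketch under assumptions: the `Command` shape and `parseCommand` are hypothetical, only the exact phrasings listed above are recognized, and resolving an alias such as `opus` to a concrete model ID is left to a separate step.

```typescript
const TIERS = ["SIMPLE", "STANDARD", "COMPLEX", "CODE"] as const;
type Tier = (typeof TIERS)[number];

const THINKING_LEVELS = ["off", "low", "medium", "high"] as const;
type ThinkingLevel = (typeof THINKING_LEVELS)[number];

type Command =
  | { kind: "use-model"; model: string }
  | { kind: "use-auto" }
  | { kind: "prefer-tier"; tier: Tier }
  | { kind: "clear-tier" }
  | { kind: "thinking"; level: ThinkingLevel }
  | { kind: "reset" };

// Returns a command for short control messages, or undefined for normal prompts.
function parseCommand(message: string): Command | undefined {
  const text = message.trim().toLowerCase().replace(/\s+/g, " ");
  if (text === "reset routing") return { kind: "reset" };
  if (text === "clear tier") return { kind: "clear-tier" };
  if (text === "use auto" || text === "use auto again") return { kind: "use-auto" };
  if (text.startsWith("thinking ")) {
    const rest = text.slice("thinking ".length);
    const level = THINKING_LEVELS.find((l) => l === rest);
    return level ? { kind: "thinking", level } : undefined;
  }
  if (text.startsWith("prefer ")) {
    const rest = text.slice("prefer ".length);
    const tier = TIERS.find((t) => t.toLowerCase() === rest);
    return tier ? { kind: "prefer-tier", tier } : undefined;
  }
  if (text.startsWith("use ")) {
    const model = text.slice("use ".length);
    // Only a single token counts as a model name, so ordinary sentences
    // that happen to start with "use" are not treated as commands.
    return model.includes(" ") ? undefined : { kind: "use-model", model };
  }
  return undefined;
}
```

Requiring a single token after `use` is a deliberately conservative choice so that a normal prompt is never swallowed as a control message.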

Current thinking support:

thinking high / thinking medium / thinking low are forwarded to OpenClaw as gateway thinking-level overrides
reasoning_effort is mapped onto the same OpenClaw thinking levels
Budget and interleaved hints are normalized to the closest OpenClaw level before dispatch
Standard OpenAI-style generation controls such as temperature and max_tokens are forwarded through the OpenClaw Gateway HTTP path
Requests with thinking overrides currently use the Gateway agent bridge so OpenClaw can apply the requested thinking level

API reference

`POST /v1/chat/completions`

OpenAI-compatible chat completions.

# Auto-routing
curl -X POST http://localhost:43123/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"auto","messages":[{"role":"user","content":"Hello"}]}'

# Explicit model
curl -X POST http://localhost:43123/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/qwen/qwen3.5-397b-a17b","messages":[{"role":"user","content":"Explain neural networks"}]}'
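For programmatic clients, the same request can be assembled in TypeScript. The sketch below only builds the request so that it stays testable offline; `buildChatRequest` is a hypothetical helper, and sending `x-session-id` as an HTTP header is an assumption based on the session identifiers listed under Conversation controls.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the URL and fetch options for an auto-routed chat completion.
// The session ID lets the router keep conversation overrides across turns.
function buildChatRequest(
  baseUrl: string,
  messages: ChatMessage[],
  sessionId: string,
  model = "auto",
) {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-session-id": sessionId,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}
```

On Node.js 20+, the result can be passed straight to the built-in `fetch`, for example `const { url, init } = buildChatRequest("http://localhost:43123", messages, "thread-1"); const res = await fetch(url, init);`.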

`GET /v1/models`

Returns the models currently in the routing pool: those with resolvable credentials, and only while the OpenClaw Gateway is reachable.

`GET /health`

Liveness check with model counts.

`GET /stats`

Routing stats: requests, per-model counts, fallback rate, classifier modes, active session overrides, config status, and estimated spend/savings when model pricing is known.

`GET /dashboard`

Live HTML dashboard on top of /stats.

Request volume, success rate, fallback rate
Estimated spend and savings versus a baseline model
Tier and classifier distribution
Per-model usage and recent routing history
Active conversation overrides

`POST /reload-config`

Reload OpenClaw config without restart. Atomically replaces the routing pool.

curl -X POST http://localhost:43123/reload-config

# With admin token:
curl -X POST http://localhost:43123/reload-config \
  -H "Authorization: Bearer your-token"

Pointing OpenClaw at claw-auto-router

If you use claw-auto-router setup, you do not need to edit OpenClaw manually.

To configure OpenClaw manually instead, add this to your moltbot.json (or openclaw.json):

{
  "models": {
    "providers": {
      "claw-auto-router": {
        "baseUrl": "http://localhost:43123",
        "apiKey": "any-value",
        "api": "openai-completions",
        "models": [
          {
            "id": "auto",
            "name": "Auto Router",
            "api": "openai-completions",
            "contextWindow": 262144,
            "maxTokens": 32768
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "claw-auto-router/auto"
      }
    }
  }
}

Set your agent model to claw-auto-router/auto. OpenClaw sends chat completions to claw-auto-router, which routes to the actual best model internally.

Environment variables

Variable	Default	Description
`PORT`	`43123`	HTTP port
`HOST`	`0.0.0.0`	Bind address
`LOG_LEVEL`	`info`	`trace|debug|info|warn|error`
`OPENCLAW_CONFIG_PATH`	auto-detect	Override config path
`ROUTER_REQUEST_TIMEOUT_MS`	`30000`	Per-provider timeout (ms)
`ROUTER_ADMIN_TOKEN`	(none)	Token for `/reload-config`
`ZAI_API_KEY`	(none)	zai provider key
`KIMI_CODING_API_KEY`	(none)	kimi-coding provider key
`GOOGLE_API_KEY`	(none)	Google provider key
`OPENROUTER_API_KEY`	(none)	OpenRouter key
`NVIDIA_API_KEY`	(none)	NVIDIA key (if not in config)
`QWEN_PORTAL_TOKEN`	(none)	qwen-portal OAuth token
`OPENAI_CODEX_TOKEN`	(none)	Override token for openai-codex

Docker

# Start with docker compose
docker compose up

# Manual run (mounts your OpenClaw config read-only)
docker build -t claw-auto-router .
docker run -p 43123:43123 \
  -v ~/.openclaw:/root/.openclaw:ro \
  -e ZAI_API_KEY=your-key \
  claw-auto-router

Troubleshooting

"No resolvable candidates" → OpenClaw Gateway is unavailable or OpenClaw cannot resolve any enabled models. Check openclaw gateway status, then inspect GET /stats → configStatus.warnings.

Provider in fallbacks but not in routing pool → The fallback chain references a provider/model that OpenClaw cannot see (a phantom reference). Add that provider/model to your OpenClaw config so openclaw models list --json can see it.

"env_missing" but key is set → Check the provider auth inside OpenClaw itself. claw-auto-router now delegates auth/model execution back through OpenClaw Gateway.

502 All providers failed → All providers returned errors. Check GET /stats for per-model failure counts and server logs for specific HTTP errors.

Natural-language model switch did not stick → Send a stable session identifier (session_id, user, or x-session-id) so claw-auto-router can remember the override across turns.

Wizard doesn't appear → claw-auto-router only runs the wizard when stdin/stdout are TTYs. In Docker or CI, set modelTiers in router.config.json manually.

Release automation

npm publishing is handled by GitHub Actions trusted publishing in publish.yml.

Bump the version in package.json
Register yuga-hashimoto/claw-auto-router + .github/workflows/publish.yml once as an npm trusted publisher
Push to main or run the workflow manually from GitHub Actions
The workflow runs pnpm typecheck, pnpm test, and pnpm build
If that version is not already on npm, it publishes automatically without an npm token
The same workflow also creates a vX.Y.Z Git tag and GitHub Release with generated release notes
