
Conversation


1bcMax commented on Feb 9, 2026

Summary

Major update with intelligent routing improvements:

  • Agentic routing: Auto-detect tool requests and agentic tasks, route to models optimized for tool use (Claude, Kimi) instead of reasoning-focused models
  • Context-length-aware fallbacks: Filter out models that can't handle the estimated context size before trying them
  • Tool detection: When a tools array is present, automatically switch to agentic tiers
  • Session persistence: Prevent mid-task model switching for better agent continuity
  • New models: Added xAI Grok 4 family, NVIDIA GPT-OSS 120B (free)
  • SSE streaming fix: Properly forward tool_calls in streaming responses
  • Stats command: Terminal-based /stats with cost savings dashboard

Test Results

Test Suite              Result
Fallback tests          16/16 passed
Tool detection          4/4 passed
Context-aware routing   All passed
Stats command           Working

Breaking Changes

None - backwards compatible.

Commits

  • feat: context-length-aware routing
  • feat: auto-detect tool requests and force agentic routing
  • fix: forward tool_calls in SSE streaming response
  • feat: add new xAI/NVIDIA models with free fallback
  • feat: add session persistence
  • feat: add agentic mode for multi-step tasks
  • refactor: remove HTML dashboard, keep terminal /stats only

1bcMax added 15 commits on February 8, 2026 at 20:34

- Apply BlockRun design patterns: dark bg, subtle borders, JetBrains Mono
- Add footer links: blockrun.ai, X/Twitter, GitHub repo
- Change chart to benchmark comparison: ClawRouter vs Claude Opus 4
- Use uppercase labels with letter-spacing per BlockRun style

Users can now use short aliases instead of full model paths:
- /model claude → anthropic/claude-sonnet-4
- /model kimi → moonshot/kimi-k2.5
- /model gpt → openai/gpt-4o
- /model flash → google/gemini-2.5-flash

Aliases: claude, sonnet, opus, haiku, gpt, gpt4, gpt5, mini, o3,
deepseek, reasoner, kimi, gemini, flash, grok
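
The resolution logic itself isn't quoted in the PR, but a minimal sketch of what alias expansion along these lines could look like (the MODEL_ALIASES map and resolveModel name are illustrative, not the actual ClawRouter code):

```typescript
// Hypothetical sketch of alias resolution; names and mappings are
// illustrative, based only on the examples in the commit message.
const MODEL_ALIASES: Record<string, string> = {
  claude: "anthropic/claude-sonnet-4",
  kimi: "moonshot/kimi-k2.5",
  gpt: "openai/gpt-4o",
  flash: "google/gemini-2.5-flash",
};

function resolveModel(nameOrAlias: string): string {
  // Full model paths pass through untouched; bare aliases are expanded.
  return MODEL_ALIASES[nameOrAlias.toLowerCase()] ?? nameOrAlias;
}

// resolveModel("claude")        -> "anthropic/claude-sonnet-4"
// resolveModel("openai/gpt-4o") -> "openai/gpt-4o"
```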
When agenticMode is enabled, the router prefers models optimized for
agentic workflows (Claude, Kimi K2.5) that continue autonomously instead
of stopping and waiting for user input.

- Added 'agentic' flag to model definitions
- Marked Claude, Kimi K2.5, GPT-5.2, GPT-4o as agentic
- Added agenticTiers config for agentic model preferences
- Router uses agentic tiers when agenticMode: true
- Added isAgenticModel() and getAgenticModels() helpers
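
A rough sketch of the agentic flag and helpers described above; the ModelDef shape and pickTier helper are assumptions and may not match what models.ts actually defines:

```typescript
// Illustrative only: real model definitions and tier config may differ.
interface ModelDef {
  id: string;
  agentic?: boolean; // marked true for Claude, Kimi K2.5, GPT-5.2, GPT-4o
}

const MODELS: ModelDef[] = [
  { id: "anthropic/claude-sonnet-4", agentic: true },
  { id: "moonshot/kimi-k2.5", agentic: true },
  { id: "deepseek/deepseek-reasoner" },
];

function isAgenticModel(id: string): boolean {
  return MODELS.some((m) => m.id === id && m.agentic === true);
}

function getAgenticModels(): ModelDef[] {
  return MODELS.filter((m) => m.agentic === true);
}

// With agenticMode: true the router would select from the agentic tiers
// instead of the default tiers (pickTier is a hypothetical helper).
function pickTier(agenticMode: boolean, tiers: string[][], agenticTiers: string[][]): string[][] {
  return agenticMode ? agenticTiers : tiers;
}
```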
When session persistence is enabled, ClawRouter maintains the same model
selection across multiple requests within a session, preventing disruptive
model switches during multi-step agentic tasks.

- Added SessionStore class for tracking session -> model mappings
- Sessions identified via X-Session-ID header
- Configurable timeout (default: 30 minutes)
- Auto-cleanup of expired sessions
- Exported SessionStore, getSessionId, DEFAULT_SESSION_CONFIG
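
A minimal sketch of what a SessionStore of this kind could look like; only the header name, default timeout, and exported names come from the commit message, while the fields and method signatures here are assumptions:

```typescript
// Hypothetical sketch; the actual exported class may have a different shape.
interface SessionConfig {
  timeoutMs: number; // default: 30 minutes
}

const DEFAULT_SESSION_CONFIG: SessionConfig = { timeoutMs: 30 * 60 * 1000 };

class SessionStore {
  private sessions = new Map<string, { model: string; lastSeen: number }>();

  constructor(private config: SessionConfig = DEFAULT_SESSION_CONFIG) {}

  // Return the model pinned to this session, unless it has expired.
  get(sessionId: string): string | undefined {
    const entry = this.sessions.get(sessionId);
    if (!entry) return undefined;
    if (Date.now() - entry.lastSeen > this.config.timeoutMs) {
      this.sessions.delete(sessionId); // auto-cleanup of expired sessions
      return undefined;
    }
    entry.lastSeen = Date.now();
    return entry.model;
  }

  // Pin a model to a session so later requests reuse the same selection.
  set(sessionId: string, model: string): void {
    this.sessions.set(sessionId, { model, lastSeen: Date.now() });
  }
}

// Sessions are identified via the X-Session-ID request header.
function getSessionId(headers: Headers): string | undefined {
  return headers.get("x-session-id") ?? undefined;
}
```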
- Add 15th dimension: agenticTaskKeywords (67 multilingual keywords)
- Auto-switch to agentic tiers when 2+ agentic signals detected
- Supports file ops, execution, multi-step patterns, iterative work
- No config required - works automatically

Test results:
- "what is 2+2" → gemini-flash (standard)
- "build the project then run tests" → kimi-k2.5 (auto-agentic)
- "fix the bug and make sure it works" → kimi-k2.5 (auto-agentic)
- Remove /dashboard endpoint from proxy
- Remove generateDashboardHtml function
- Keep getStats() and formatStatsAscii() for terminal display
- Simpler architecture, no remote access issues
- Remove "Hello" and "Define photosynthesis" from SIMPLE tier tests
  (now route to MEDIUM with adjusted scoring weights)
- Update REASONING tier test to expect deepseek-reasoner instead of o3
- Add local /v1/models endpoint that returns model list without upstream API call
- Update fallback test to accept deepseek or gemini for SIMPLE tier
- Bump to v0.4.9

- Add 9 new models: grok-4-fast-*, grok-code-fast-1, nvidia/gpt-oss-120b
- Update tier config: REASONING uses grok-4-fast-reasoning ($0.20/$0.50)
- Add nvidia/gpt-oss-120b as FREE fallback in SIMPLE tier
- Empty wallet auto-fallback: use free model instead of throwing error
- Update test expectations for new model routing

When a model returns tool_calls instead of content (e.g., for function calling),
ClawRouter was silently dropping the tool_calls data during SSE conversion.
This caused OpenClaw to receive finish_reason: 'tool_calls' with empty content,
resulting in no reply being sent to users.

Now tool_calls are properly forwarded in the SSE stream.
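
A minimal sketch of the forwarding idea, assuming OpenAI-style streaming chunks; toSseData and the StreamDelta shape are illustrative, not the proxy's actual conversion code:

```typescript
// Sketch: when converting upstream chunks to SSE, forward tool_calls
// deltas alongside content instead of emitting content only.
interface StreamDelta {
  content?: string;
  tool_calls?: unknown[];
}

function toSseData(delta: StreamDelta, finishReason: string | null): string {
  const chunk = {
    choices: [
      {
        delta: {
          ...(delta.content !== undefined ? { content: delta.content } : {}),
          // Previously dropped: without this, clients saw
          // finish_reason "tool_calls" with an empty message.
          ...(delta.tool_calls ? { tool_calls: delta.tool_calls } : {}),
        },
        finish_reason: finishReason,
      },
    ],
  };
  return `data: ${JSON.stringify(chunk)}\n\n`;
}
```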
- Remove 'View detailed dashboard' URL from /stats output (dashboard removed)
- Skip content-encoding header when forwarding upstream response
  (fetch auto-decompresses the body, but the gzip header was still being forwarded with the now-plain body)
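
A sketch of that header fix, assuming a fetch-based proxy; forwardHeaders is a hypothetical helper, not code from this PR:

```typescript
// fetch() already decompresses the body, so copying content-encoding
// from the upstream response would mislabel the plain body as gzip.
function forwardHeaders(upstream: Response): Headers {
  const headers = new Headers();
  upstream.headers.forEach((value, key) => {
    if (key.toLowerCase() === "content-encoding") return; // skip
    headers.set(key, value);
  });
  return headers;
}
```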
When a tools array is present in the request, automatically switch to
agentic tiers which use models better at tool use (claude-haiku-4.5,
claude-sonnet-4, kimi-k2.5) instead of reasoning-focused models
(deepseek-reasoner) that may generate malformed tool calls.
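
A minimal sketch of the detection rule, assuming an OpenAI-style request body; wantsAgenticRouting is a hypothetical name:

```typescript
interface ChatRequest {
  messages: { role: string; content: string }[];
  tools?: unknown[];
}

function wantsAgenticRouting(req: ChatRequest): boolean {
  // Any non-empty tools array means the caller expects tool use,
  // so prefer models that emit well-formed tool calls.
  return Array.isArray(req.tools) && req.tools.length > 0;
}
```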
Add smart filtering that excludes models when estimated context
exceeds their max context window. This prevents wasted API calls
to models that will fail with context length errors.

Changes:
- Add getModelContextWindow() helper to models.ts
- Add getFallbackChainFiltered() to selector.ts
- Update proxy to filter fallback chain by context length
- Update fallback tests for agentic tier fallback order
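
A rough sketch of the filtering step; the signature of getFallbackChainFiltered and the token estimate below are assumptions, not the actual code in selector.ts:

```typescript
// Crude chars-per-token heuristic (assumption, for illustration only).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function getFallbackChainFiltered(
  chain: string[],
  estimatedTokens: number,
  getModelContextWindow: (id: string) => number,
): string[] {
  // Drop models whose max context window is smaller than the request,
  // so the proxy never burns an API call that is guaranteed to fail.
  return chain.filter((id) => getModelContextWindow(id) >= estimatedTokens);
}
```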
- Add tool detection section
- Add context-length-aware routing section
- Add session persistence section
- Add /stats command documentation
- Update tier model mapping (grok-code-fast, grok-4-fast-reasoning)
- Add new xAI Grok 4 family and NVIDIA free models
- Update roadmap with completed features
1bcMax merged commit 00227ed into main on Feb 9, 2026
1 check failed
1bcMax deleted the feat/dashboard branch on February 9, 2026 at 20:05