feat: model routing profiles with per-context model/provider selection #1168

Closed
virtaava wants to merge 7 commits into NousResearch:main from virtaava:feat/model-routing-profiles

Conversation

@virtaava
Contributor

Summary

Adds model routing profiles — optional per-context model and provider overrides that let users route different work types to different models. Includes an interactive configuration wizard, routing rules for delegate_task, and a config backup/restore system so configuration changes are always reversible.

Motivation

Hermes currently uses a single model for all operations. In practice, different contexts benefit from different models:

  • A strong reasoning model for planning and architecture
  • A fast code-focused model for terminal-heavy delegated work
  • A web-optimized model for research subtasks
  • A cost-effective model for routine delegation

Today this requires manually editing config.yaml with no discovery, validation, or undo.

What's included

Model routing profiles (config.yaml)

model_profiles:
  chat:
    model: anthropic/claude-sonnet-4
    provider: openrouter
  coding:
    model: openai/gpt-5-codex
    provider: openrouter
  planning:
    model: google/gemini-3-flash-preview
    provider: openrouter
  research:
    model: perplexity/sonar-deep-research
    provider: openrouter

Each profile supports model, provider, base_url, and api_key_env, so it works with any OpenAI-compatible endpoint, including local servers (Ollama, vLLM, llama.cpp). Empty profiles inherit from the primary model.
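The inheritance rule can be pictured as a small merge over the primary model settings. The sketch below is illustrative only and is not the PR's resolve_model_profile(): the helper name effective_profile is hypothetical, and it assumes the primary settings live under model.default with optional provider, base_url, and api_key_env siblings.

```python
from typing import Any, Dict

PROFILE_KEYS = ("model", "provider", "base_url", "api_key_env")


def effective_profile(config: Dict[str, Any], name: str) -> Dict[str, Any]:
    """Merge one routing profile over the primary model settings (hypothetical helper)."""
    primary = config.get("model") or {}
    # Assumption: the primary model is stored as model.default, with optional
    # provider/base_url/api_key_env keys in the same "model" section.
    resolved = {
        "model": primary.get("default"),
        "provider": primary.get("provider"),
        "base_url": primary.get("base_url"),
        "api_key_env": primary.get("api_key_env"),
    }
    profile = (config.get("model_profiles") or {}).get(name) or {}
    for key in PROFILE_KEYS:
        if profile.get(key):  # unset or empty values inherit from the primary model
            resolved[key] = profile[key]
    return resolved


EXAMPLE_CONFIG = {
    "model": {"default": "anthropic/claude-sonnet-4", "provider": "openrouter"},
    "model_profiles": {
        "coding": {"model": "openai/gpt-5-codex"},  # overrides the model only
        "research": {},                             # empty: inherits everything
    },
}
```

With this example config, the coding profile keeps the primary provider but swaps the model, while the empty research profile resolves to the primary model unchanged.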

Interactive configuration wizard

hermes configure-model-routing             # interactive setup (auto-backup)
hermes configure-model-routing --reset     # reset to defaults (auto-backup)
hermes configure-model-routing --restore   # roll back routing sections from backup
hermes configure-model-routing --restore-full  # roll back entire config from backup
hermes configure-model-routing --list-backups  # show available backups

  • Auto-discovers providers with working credentials
  • Fetches live model lists from each provider's /models endpoint
  • Per-profile confirmation — never overwrites without asking
  • Separate from hermes model (which remains the primary model selector)
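As a rough picture of the discovery step, a wizard could start by checking which providers have a credential set in the environment. This is a minimal sketch under stated assumptions: the function name and the provider-to-variable table are hypothetical, and it only checks that a key is present. It does not verify that the key works or call any /models endpoint, which the wizard described above does.

```python
import os
from typing import List, Mapping, Optional

# Hypothetical provider -> credential variable table; the wizard's real table may differ.
CREDENTIAL_ENV_VARS = {
    "openrouter": "OPENROUTER_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def providers_with_credentials(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return providers whose credential variable is set to a non-blank value."""
    env = os.environ if env is None else env
    return [
        provider
        for provider, var in CREDENTIAL_ENV_VARS.items()
        if (env.get(var) or "").strip()
    ]
```

Passing the environment as a parameter keeps the check easy to test without touching the real process environment.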

Config backup/restore

Every config-modifying routing operation auto-creates a timestamped backup in ~/.hermes/config-backups/ before writing. Restores can target just model_profiles + model_routing (leaving other config untouched) or the full config. A restore also creates a pre_restore backup first, so the restore itself can be undone. Backups auto-rotate at a limit of 10.
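The backup-then-rotate behaviour described above might look roughly like this. It is a sketch, not the contents of config_backup.py: the function names, the file naming scheme, and the choice to order backups by a nanosecond timestamp in the file name are all assumptions made for illustration.

```python
import shutil
import time
from pathlib import Path

MAX_BACKUPS = 10  # rotation limit stated in the PR description


def create_backup(config_path: Path, backup_dir: Path, reason: str = "") -> Path:
    """Copy the config into backup_dir under a timestamped name, then prune old copies."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    suffix = f"-{reason}" if reason else ""
    stamp = time.time_ns()
    target = backup_dir / f"config-{stamp:020d}{suffix}.yaml"
    while target.exists():  # coarse clocks can repeat a timestamp; bump until unique
        stamp += 1
        target = backup_dir / f"config-{stamp:020d}{suffix}.yaml"
    shutil.copy(config_path, target)
    prune_backups(backup_dir)
    return target


def prune_backups(backup_dir: Path, keep: int = MAX_BACKUPS) -> None:
    """Delete the oldest backups so that at most `keep` remain."""
    # Zero-padded timestamps make lexicographic order match creation order.
    backups = sorted(backup_dir.glob("config-*.yaml"), key=lambda p: p.name)
    for old in backups[: max(len(backups) - keep, 0)]:
        old.unlink()
```

A restore could call create_backup(config_path, backup_dir, reason="pre_restore") before overwriting the config, which is what makes the restore itself reversible.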

Routing rules for delegate_task

model_routing:
  rules:
    - if_toolsets_any: [terminal, file]
      profile: coding
    - if_toolsets_any: [web, browser]
      profile: research
    - if_goal_matches: [plan, spec, roadmap]
      profile: planning

The first matching rule wins. When no rule matches, routing falls back to toolset heuristics.
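First-match-wins evaluation over these rules can be sketched as a simple loop. The function name is hypothetical, and two details are assumptions because the description does not specify them: if_goal_matches is treated as a case-insensitive substring test, and each rule is assumed to carry a single condition, as in the example config.

```python
from typing import Any, Dict, Iterable, List, Optional


def match_routing_rule(
    rules: List[Dict[str, Any]], toolsets: Iterable[str], goal: str
) -> Optional[str]:
    """Return the profile of the first rule that matches, or None if none do."""
    active = set(toolsets)
    goal_lower = goal.lower()
    for rule in rules:
        wanted = rule.get("if_toolsets_any")
        if wanted and active & set(wanted):
            return rule["profile"]
        keywords = rule.get("if_goal_matches")
        # Assumption: keywords match as case-insensitive substrings of the goal.
        if keywords and any(word.lower() in goal_lower for word in keywords):
            return rule["profile"]
    return None  # caller falls back to the toolset heuristic


EXAMPLE_RULES = [
    {"if_toolsets_any": ["terminal", "file"], "profile": "coding"},
    {"if_toolsets_any": ["web", "browser"], "profile": "research"},
    {"if_goal_matches": ["plan", "spec", "roadmap"], "profile": "planning"},
]
```

Because order decides the outcome, a task that uses the terminal toolset and mentions "plan" in its goal resolves to coding here, not planning. Substring matching is also loose (for example "spec" matches "inspect"), so a real implementation may prefer word boundaries.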

delegate_task integration

Explicit profile per call:

{"goal": "Refactor the parser module", "model_profile": "coding"}

Or batch mode with mixed profiles:

{
  "tasks": [
    {"goal": "Refactor parser", "toolsets": ["terminal", "file"], "model_profile": "coding"},
    {"goal": "Research alternatives", "toolsets": ["web"], "model_profile": "research"}
  ]
}

Precedence rules

Chat model

  1. CLI --model argument
  2. model_profiles.chat.model
  3. model.default
  4. Built-in fallback
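The chat precedence list amounts to "take the first value that is set". A minimal sketch follows; the function name is hypothetical, and the built-in fallback is passed in as a parameter because its actual value is not stated here.

```python
from typing import Any, Dict, Optional


def resolve_chat_model(
    cli_model: Optional[str], config: Dict[str, Any], builtin_fallback: str
) -> str:
    """Pick the chat model using the four-step precedence from the list above."""
    chat_profile = (config.get("model_profiles") or {}).get("chat") or {}
    primary = config.get("model") or {}
    candidates = (
        cli_model,                  # 1. CLI --model argument
        chat_profile.get("model"),  # 2. model_profiles.chat.model
        primary.get("default"),     # 3. model.default
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return builtin_fallback         # 4. built-in fallback
```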

delegate_task routing

  1. Legacy delegation.model/delegation.provider (if set)
  2. Explicit model_profile on tool call
  3. delegation.model_profile default
  4. Routing rules (model_routing.rules)
  5. Toolset-based heuristic fallback
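The delegation order can be sketched the same way. This is an illustration under assumptions, not the code in tools/delegate_tool.py: the function name and return shape are hypothetical, and the rule matcher and heuristic are passed in as callables because their internals are not described here.

```python
from typing import Any, Callable, Dict, Optional

Task = Dict[str, Any]


def choose_delegation_route(
    config: Dict[str, Any],
    task: Task,
    match_rules: Callable[[Task], Optional[str]],
    heuristic: Callable[[Task], Optional[str]],
) -> Dict[str, Optional[str]]:
    """Return which precedence step decided the route and the chosen profile."""
    delegation = config.get("delegation") or {}
    # 1. Legacy settings bypass profiles entirely when either is set.
    if delegation.get("model") or delegation.get("provider"):
        return {"source": "legacy", "profile": None}
    # 2. Explicit model_profile on the tool call.
    if task.get("model_profile"):
        return {"source": "explicit", "profile": task["model_profile"]}
    # 3. Configured default profile for delegation.
    if delegation.get("model_profile"):
        return {"source": "default", "profile": delegation["model_profile"]}
    # 4. Routing rules, then 5. the toolset heuristic.
    matched = match_rules(task)
    if matched:
        return {"source": "rule", "profile": matched}
    return {"source": "heuristic", "profile": heuristic(task)}
```

One consequence of this ordering is worth noting: when delegation.model_profile is set, steps 4 and 5 are never reached, so routing rules only take effect for users who leave that default empty.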

Backward compatibility

  • Existing configs work unchanged — all new sections are optional
  • hermes model behavior is unchanged
  • delegate_task without model_profile uses existing behavior
  • save_config() API is backward compatible (new backup_reason kwarg defaults to empty)
  • Legacy delegation.model/delegation.provider still takes precedence

Test plan

  • test_config_backup.py — 15 tests: create, prune, list, selective restore, full restore, pre-restore safety, CLI integration
  • test_model_routing_command.py — reset with backup, list-backups empty state
  • test_delegate.py — profile resolution, routing rules, batch mode
  • test_setup.py — wizard flow, profile normalization, provider discovery
  • test_runtime_provider_resolution.py — profile inheritance, primary model fallback
  • Interactive wizard respects TTY detection — skips in CI/automation
  • Cross-platform: Path, shutil.copy2, no Unix-specific operations
  • 318 tests pass

Files changed

| File | Change |
| --- | --- |
| hermes_cli/config_backup.py | New: backup/restore/prune/list module |
| hermes_cli/config.py | model_profiles, model_routing, delegation.model_profile in defaults; backup_reason on save_config |
| hermes_cli/runtime_provider.py | resolve_model_profile(), get_primary_model(), resolve_model_for_profile() |
| hermes_cli/setup.py | Interactive wizard, reset, provider discovery |
| hermes_cli/main.py | cmd_configure_model_routing with --restore/--restore-full/--list-backups |
| cli.py | Chat profile resolution in HermesCLI.__init__ |
| gateway/run.py | Gateway chat model reads model_profiles.chat |
| cron/scheduler.py | Cron reads chat profile |
| tools/delegate_tool.py | Profile-aware delegation with routing rules |
| docs/model-profiles.md | End-user documentation |
| tests/ | 6 test files, 100+ new test cases |

🤖 Generated with Claude Code

virtaava and others added 7 commits March 13, 2026 16:33
Every config-modifying routing operation now auto-creates a timestamped
backup in ~/.hermes/config-backups/ before writing. Users can restore
from any backup via --restore (routing sections only) or --restore-full.

New CLI flags:
  hermes configure-model-routing --restore
  hermes configure-model-routing --restore-full
  hermes configure-model-routing --list-backups

Also adds get_primary_model() to runtime_provider for wizard display.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

shutil.copy2 preserves the source file's mtime, so multiple backups
of the same file get identical mtimes, making ordering unreliable.
shutil.copy lets the backup's mtime reflect when it was actually created.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
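The mtime difference this commit relies on is easy to reproduce with the standard library. The sketch below (the function name is hypothetical) backdates a source file by an hour, copies it both ways, and returns the three modification times for comparison.

```python
import os
import shutil
import time
from pathlib import Path
from typing import Dict


def compare_copy_mtimes(workdir: Path) -> Dict[str, float]:
    """Copy one file with shutil.copy2 and shutil.copy and report each mtime."""
    source = workdir / "config.yaml"
    source.write_text("model_profiles: {}\n")
    an_hour_ago = time.time() - 3600
    os.utime(source, (an_hour_ago, an_hour_ago))  # pretend the config is old

    preserved = Path(shutil.copy2(source, workdir / "backup_copy2.yaml"))
    fresh = Path(shutil.copy(source, workdir / "backup_copy.yaml"))
    return {
        "source": source.stat().st_mtime,
        "copy2": preserved.stat().st_mtime,  # copy2 carries over the source mtime
        "copy": fresh.stat().st_mtime,       # copy leaves the mtime at creation time
    }
```

The copy2 result stays at the source's old mtime, while the plain copy reflects when the backup was written, which is the property needed when ordering backups by mtime.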
@teknium1
Contributor

Closing this for now — but I want to be clear that this is genuinely impressive work, @virtaava. 1700+ lines across routing profiles, an interactive config wizard, config backup/restore, delegation routing, and full test coverage. The design is thoughtful and the implementation is thorough.

The reason I'm not merging this right now isn't quality — it's timing and direction. Model routing is something we're thinking about, but we want to land it in a way that fits with some other architectural changes we have in mind (auxiliary client unification, provider routing rework, etc.). Merging a large feature like this now would create friction with that work.

If/when we revisit model routing profiles, this PR will be the reference implementation. The config schema design, the wizard UX, and the delegation routing hooks are all things worth drawing from. I'd genuinely welcome your continued involvement — whether that's a fresh PR when the time comes or input on the design direction.

Thank you for the substantial effort here. This is the kind of contribution that makes open source great, even when the timing doesn't line up.

@teknium1 closed this Mar 17, 2026