feat: per-agent model override + live Ollama capability audit #9

Open
m-marinucci wants to merge 1 commit into disler:main from m-marinucci:feat/agent-model-audit


@m-marinucci

Summary

  • Adds per-agent model and thinking frontmatter fields to agent definitions — each agent can run on a different LLM provider/model
  • Adds live Ollama model capability auditing at team activation — queries /api/show for the capabilities array and compares digests against registry.ollama.com for updates
  • Adds dispatch-time blocking for models that lack tool-calling capability
  • Adds /agents-check command for manual re-audit

Why

When building agent teams with local Ollama models, it's easy to assign a model that doesn't support tool calling (e.g., deepseek-r1:8b has ["completion", "thinking"] but no "tools"). The agent silently fails. This feature catches that upfront.
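
A minimal sketch of the capability check described above, assuming the `/api/show` response carries the `capabilities` array as the summary states. The names `ShowResponse`, `PostJson`, `supportsTools`, and `modelSupportsTools` are hypothetical and not taken from agent-team.ts, and the base URL is Ollama's usual default rather than anything this PR specifies.

```typescript
// Subset of the /api/show response that this sketch reads.
interface ShowResponse {
  capabilities?: string[];
}

// Minimal POST helper signature (hypothetical) so the check can run without a network.
type PostJson = (url: string, body: unknown) => Promise<unknown>;

// Pure check: true only when the capabilities array contains "tools".
function supportsTools(show: ShowResponse): boolean {
  return (show.capabilities ?? []).includes("tools");
}

// Hypothetical helper: asks Ollama to describe a model, then checks for tool calling.
async function modelSupportsTools(
  model: string,
  postJson: PostJson,
  baseUrl = "http://localhost:11434", // assumed default Ollama address
): Promise<boolean> {
  const show = (await postJson(`${baseUrl}/api/show`, { model })) as ShowResponse;
  return supportsTools(show);
}
```

With the capabilities quoted above for deepseek-r1:8b, `supportsTools({ capabilities: ["completion", "thinking"] })` returns `false`. In real use `postJson` would wrap an HTTP POST with a JSON body; it is injected here only to keep the sketch testable.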

How it works

Three severity levels at team activation:

| Level | Condition | Action |
|-------|-----------|--------|
| BLOCK | Model lacks "tools" in Ollama capabilities | Dispatch blocked with error |
| WARN | Has tools but < 30B params | Warning notification |
| UPDATE | Newer version on registry | Suggests ollama pull |
Per-agent model override via frontmatter:

---
name: data-analyst
model: m3-ollama/qwen3-coder:latest
thinking: medium
tools: read,bash,edit,write,grep,find,ls
---

Agents without a model field inherit the dispatcher's model. Cloud providers are never checked.
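
A minimal sketch of the override and fallback, assuming flat `key: value` frontmatter like the example above. `parseFrontmatter` and `resolveModel` are hypothetical names, and the real parser in agent-team.ts may differ. The split happens on the first colon only, because model ids such as `qwen3-coder:latest` contain colons themselves.

```typescript
// Parses flat "key: value" frontmatter delimited by --- lines (sketch only).
function parseFrontmatter(md: string): Record<string, string> {
  const fields: Record<string, string> = {};
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(md);
  if (!match) return fields;
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const idx = line.indexOf(":"); // first colon separates key from value
    if (idx === -1) continue;
    fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return fields;
}

// Uses the agent's own model when set, otherwise the dispatcher's model.
function resolveModel(fields: Record<string, string>, dispatcherModel: string): string {
  return fields["model"] || dispatcherModel;
}
```

For the frontmatter shown above, `resolveModel` yields `m3-ollama/qwen3-coder:latest`; for an agent with no `model` line it yields whatever dispatcher model is passed in.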

Security hardening (from adversarial review)

  • Registry URL sanitized via SAFE_REGISTRY_NAME regex — rejects path traversal
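  • Provider check inverted to a LOCAL_PROVIDERS allowlist: only listed local providers are audited, which stays forward-compatible where a list of cloud providers would always be incomplete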
  • Failed Ollama checks not cached (transient failures don't poison future dispatches)
  • dispatchAgent runs live capability check when cache is cold (no silent bypass)
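
A sketch of the first two hardening points. The constant names `SAFE_REGISTRY_NAME` and `LOCAL_PROVIDERS` come from this PR, but their values below are assumptions made for illustration, as are the helper names `isLocalModel` and `isSafeRegistryName`.

```typescript
// Assumed contents: the PR names this allowlist but does not show its entries.
const LOCAL_PROVIDERS = new Set<string>(["ollama", "m3-ollama"]);

// Assumed pattern: "name" or "namespace/name", with no leading dot or extra slashes.
const SAFE_REGISTRY_NAME = /^[a-z0-9][a-z0-9._-]*(\/[a-z0-9][a-z0-9._-]*)?$/i;

// A model reference is audited only when its provider prefix is on the allowlist.
function isLocalModel(modelRef: string): boolean {
  const provider = modelRef.split("/")[0] ?? "";
  return LOCAL_PROVIDERS.has(provider);
}

// Rejects names that could escape the registry path, such as "../etc/passwd".
function isSafeRegistryName(name: string): boolean {
  return SAFE_REGISTRY_NAME.test(name) && !name.includes("..");
}
```

Under these assumptions `isLocalModel("anthropic/some-model")` is `false`, so cloud models skip the audit without needing a cloud provider list, and `isSafeRegistryName("../etc/passwd")` is `false`, so the name never reaches a registry URL.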

Test plan

  • Verify /agents-check queries Ollama and reports findings
  • Assign model: m3-ollama/deepseek-r1:8b to an agent — confirm dispatch is BLOCKED
  • Assign model: m3-ollama/llama3.2:3b — confirm WARN (< 30B)
  • Assign model: m3-ollama/qwen3-coder:latest — confirm no warnings (30B+, has tools)
  • Verify agents without model field inherit dispatcher model as before
  • Verify cloud models (anthropic/, openrouter/) skip all Ollama checks

🤖 Generated with Claude Code

Add three features to agent-team.ts:

1. Per-agent model/thinking override via frontmatter fields:
   Agents can now declare `model: provider/model-id` and
   `thinking: level` in their .md definition. Falls back to
   the dispatcher's model when not set.

2. Live Ollama model capability audit:
   On team activation, queries Ollama /api/show for each agent
   using a local model. Checks the `capabilities` array for
   tool-calling support, warns on sub-30B parameter models,
   and compares local digests against registry.ollama.com for
   available updates.

   Three severity levels:
   - BLOCK: model lacks "tools" capability — dispatch is blocked
   - WARN: has tools but < 30B params — unreliable for agentic use
   - UPDATE: newer version available on ollama.com

3. Dispatch-time gate:
   Before spawning a sub-agent with a local model, checks the
   capability cache (or runs a live check if cache is cold).
   Blocks dispatch with a clear error if the model cannot do
   tool calling.
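
A minimal sketch of the gate in item 3, assuming a cache keyed by model name. The names `Probe`, `capabilityCache`, `gateError`, `checkBeforeDispatch`, and `clearCapabilityCache` are hypothetical, and letting a failed probe reject to the caller is an assumption, since the commit does not say how a failed live check is handled.

```typescript
// Resolves to the model's capabilities array; rejects when Ollama cannot be reached.
type Probe = (model: string) => Promise<string[]>;

const capabilityCache = new Map<string, string[]>();

// Pure decision: null means dispatch may proceed, a string is the blocking error.
function gateError(model: string, capabilities: string[]): string | null {
  return capabilities.includes("tools")
    ? null
    : `BLOCK: ${model} lacks the "tools" capability`;
}

async function checkBeforeDispatch(model: string, probe: Probe): Promise<string | null> {
  let caps = capabilityCache.get(model);
  if (caps === undefined) {
    // Cold cache: run a live check. If probe rejects, nothing is stored,
    // so a transient failure cannot affect later dispatches.
    caps = await probe(model);
    capabilityCache.set(model, caps);
  }
  return gateError(model, caps);
}

// Rough equivalent of what /agents-check does first: drop cached results.
function clearCapabilityCache(): void {
  capabilityCache.clear();
}
```

Only successful probes reach `capabilityCache.set`, which is one way to get the "failed checks not cached" behavior listed below.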

New command: /agents-check — clears cache and re-audits.

Security hardening from adversarial review:
- Registry URL sanitized via SAFE_REGISTRY_NAME regex
- Inverted to LOCAL_PROVIDERS allowlist (forward-compatible)
- Failed Ollama checks not cached (transient failures don't poison)
- dispatchAgent is now async for live capability checks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>