Evolutionary self-improvement for Hermes Agent.
Hermes Agent Self-Evolution uses DSPy + GEPA (Genetic-Pareto Prompt Evolution) to automatically evolve and optimize Hermes Agent's skills, tool descriptions, system prompts, and code — producing measurably better versions through reflective evolutionary search.
No GPU training required. Everything operates via API calls: mutating text, evaluating results, and selecting the best variants. A typical optimization run costs roughly $2-10.
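When variants are scored on several eval examples, "best" need not mean one winner. The "Pareto" in GEPA's name suggests keeping every variant that no other variant beats across the board. The sketch below is a generic Pareto-front filter, not code from this repository; the variant names and scores are made up for illustration.

```python
from typing import Dict, List


def dominates(a: List[float], b: List[float]) -> bool:
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, List[float]]) -> List[str]:
    """Return the names of variants that no other variant dominates."""
    return [
        name
        for name, own in scores.items()
        if not any(
            dominates(other, own)
            for other_name, other in scores.items()
            if other_name != name
        )
    ]


# Hypothetical per-example scores (one column per eval example).
scores = {
    "baseline": [0.6, 0.4, 0.5],
    "variant_a": [0.9, 0.3, 0.5],
    "variant_b": [0.6, 0.8, 0.5],
    "variant_c": [0.5, 0.3, 0.5],
}

print(pareto_front(scores))
```

Here `variant_a` and `variant_b` each win on a different example, so both survive, while `baseline` and `variant_c` are dominated and dropped.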
Read current skill/prompt/tool ──► Generate eval dataset
                                          │
                                          ▼
                                   GEPA Optimizer ◄── Execution traces
                                          │                  ▲
                                          ▼                  │
                                   Candidate variants ──► Evaluate
                                          │
                                   Constraint gates (tests, size limits, benchmarks)
                                          │
                                          ▼
                                   Best variant ──► PR against hermes-agent
GEPA reads execution traces to understand why things fail (not just that they failed), then proposes targeted improvements. GEPA was accepted as an ICLR 2026 Oral and is MIT licensed.
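The loop in the diagram can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the project's implementation: `evolve`, `toy_propose`, `toy_evaluate`, and `toy_gates` are hypothetical names, and the keyword-matching "evaluation" stands in for what would really be LLM-driven scoring and reflection.

```python
from typing import Callable, List, Tuple

Trace = List[str]  # hypothetical: one note per failed eval example


def evolve(
    seed: str,
    propose: Callable[[str, Trace], str],
    evaluate: Callable[[str], Tuple[float, Trace]],
    passes_gates: Callable[[str], bool],
    iterations: int,
) -> Tuple[str, float]:
    """Greedy trace-guided search: propose, gate, evaluate, keep improvements."""
    best = seed
    best_score, trace = evaluate(best)
    for _ in range(iterations):
        candidate = propose(best, trace)  # mutation informed by the failure trace
        if not passes_gates(candidate):   # constraint gates reject bad candidates
            continue
        score, candidate_trace = evaluate(candidate)
        if score > best_score:
            best, best_score, trace = candidate, score, candidate_trace
    return best, best_score


# Toy stand-ins (assumptions): a skill is "good" if it mentions each topic.
REQUIRED = ["security", "tests", "style"]


def toy_evaluate(text: str) -> Tuple[float, Trace]:
    missing = [topic for topic in REQUIRED if topic not in text]
    score = 1 - len(missing) / len(REQUIRED)
    return score, [f"missing guidance on {topic}" for topic in missing]


def toy_propose(text: str, trace: Trace) -> str:
    if not trace:
        return text  # nothing failed, so nothing to fix
    topic = trace[0].rsplit(" ", 1)[-1]  # read *why* the eval failed
    return f"{text}\n- Check {topic}."


def toy_gates(text: str) -> bool:
    return len(text.encode("utf-8")) <= 15 * 1024  # skill size limit


best, score = evolve(
    "Review the pull request.", toy_propose, toy_evaluate, toy_gates, iterations=10
)
print(score)
print(best)
```

The point of the sketch is the data flow: the trace from the last evaluation feeds the next proposal, so each mutation targets a known failure rather than changing text at random.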
# Install
git clone https://github.com/NousResearch/hermes-agent-self-evolution.git
cd hermes-agent-self-evolution
pip install -e ".[dev]"
# Point at your hermes-agent repo
export HERMES_AGENT_REPO=~/.hermes/hermes-agent
# Evolve a skill
python -m evolution.skills.evolve_skill \
--skill github-code-review \
--iterations 10 \
--eval-source synthetic

| Phase | Target | Engine | Status |
|---|---|---|---|
| Phase 1 | Skill files (SKILL.md) | DSPy + GEPA | ✅ Implemented |
| Phase 2 | Tool descriptions | DSPy + GEPA | 🔲 Planned |
| Phase 3 | System prompt sections | DSPy + GEPA | 🔲 Planned |
| Phase 4 | Tool implementation code | Darwinian Evolver | 🔲 Planned |
| Phase 5 | Continuous improvement loop | Automated pipeline | 🔲 Planned |
| Engine | What It Does | License |
|---|---|---|
| DSPy + GEPA | Reflective prompt evolution — reads execution traces, proposes targeted mutations | MIT |
| Darwinian Evolver | Code evolution with Git-based organisms | AGPL v3 (external CLI only) |
Every evolved variant must pass:
- Full test suite: `pytest tests/ -q` must pass 100%
- Size limits: Skills ≤15KB, tool descriptions ≤500 chars
- Caching compatibility: No mid-conversation changes
- Semantic preservation: Must not drift from original purpose
- PR review: All changes go through human review, never direct commit
See PLAN.md for the complete architecture, evaluation data strategy, constraints, benchmarks integration, and phased timeline.
MIT — © 2026 Nous Research