Your agent works great on the tasks you planned for. ARISE handles the ones you didn't.
ARISE is a framework-agnostic middleware that gives LLM agents the ability to create their own tools at runtime. When your agent fails at a task, ARISE detects the capability gap, synthesizes a Python tool, validates it in a sandbox, and promotes it to the active library — no human intervention required.
Documentation | Quick Start | PyPI
pip install arise-aifrom arise import ARISE
from arise.rewards import task_success
arise = ARISE(
agent_fn=my_agent, # any (task, tools) -> str function
reward_fn=task_success,
model="gpt-4o-mini", # cheap model for tool synthesis
)
result = arise.run("Fetch all users from the paginated API")
# Agent fails → ARISE synthesizes fetch_all_paginated tool → agent succeedsEpisode 1 | FAIL | reward=0.00 | skills=2 Task: "Fetch paginated users with auth"
Episode 2 | FAIL | reward=0.00 | skills=2
Episode 3 | FAIL | reward=0.00 | skills=2
[Evolution triggered — 3 failures on API tasks]
→ Synthesizing 'parse_json_response'... 3/3 tests passed ✓
→ Synthesizing 'fetch_all_paginated'... sandbox fail → refine → 1/1 passed ✓
Episode 4 | OK | reward=1.00 | skills=4 Agent now has the tools it needs
- Self-evolving tool library — fail → detect gap → synthesize → sandbox test → promote
- Framework-agnostic — any
(task, tools) -> strfunction, Strands, LangGraph, CrewAI - Sandboxed validation — subprocess or Docker, adversarial testing, import restrictions
- Distributed mode — S3 + SQS for stateless deployments (Lambda, ECS, AgentCore)
- Skill registry — share evolved tools across projects
- Version control + rollback — SQLite checkpoints,
arise rollback <version> - A/B testing — refined skills tested against originals before promotion
- Web Console — create agents, watch evolution live, inspect evolved code (
arise console) - Dashboard — terminal TUI and web UI for monitoring
| Model | Condition | AcmeCorp (SRE) | DataCorp (Data Eng) |
|---|---|---|---|
| Claude Sonnet | ARISE | 78% | — |
| Claude Sonnet | No tools | 63% | — |
| GPT-4o-mini | ARISE | 57% | 92% |
| GPT-4o-mini | No tools | 48% | 50% |
ARISE improves task success by +9–42 percentage points across models and domains. See the full benchmark results.
A web UI for creating agents, watching evolution live, and inspecting evolved tools:
arise console
# Opens http://localhost:8080- Create agents — pick model, set system prompt, choose reward function
- Live terminal feed — watch episodes and evolution in real-time via WebSocket
- Skill inspector — syntax-highlighted code, test suite, performance metrics
- Editable config — change reward function, system prompt, failure threshold on the fly
- All Skills / Evolution Log — global views across all agents
Full documentation at arise-ai.dev:
- Installation — install and configure
- Quick Start — complete evolution loop walkthrough
- How It Works — the 5-step evolution pipeline
- Reward Functions — built-in and custom reward functions
- Safety & Validation — sandbox, adversarial testing, production recommendations
- Distributed Mode — S3 + SQS for stateless deployments
- Framework Adapters — Strands, LangGraph, CrewAI, raw OpenAI/Anthropic
- CLI Reference — all CLI commands
- API Reference — ARISE class, config, types
| Example | Description |
|---|---|
quickstart_evolution.py |
Full evolution loop: agent fails → ARISE evolves tool → agent succeeds |
quickstart.py |
Math agent evolves statistics tools |
api_agent.py |
HTTP agent evolves auth + pagination (mock server) |
devops_agent.py |
DevOps agent evolves log analysis tools |
strands_agent.py |
Strands integration with Bedrock |
demo/agentcore/ |
AgentCore deployment with A2A protocol |
pip install arise-ai # core (just pydantic)
pip install arise-ai[aws] # + boto3 for distributed mode
pip install arise-ai[litellm] # + litellm for multi-provider LLM
pip install arise-ai[docker] # + docker sandbox backend
pip install arise-ai[dashboard] # + rich, fastapi for dashboard
pip install arise-ai[otel] # + opentelemetry for tracing
pip install arise-ai[all] # everythingARISE builds on ideas from LATM, VOYAGER, CREATOR, ADAS, and CRAFT. ARISE adds the production layer: framework-agnostic integration, sandboxed validation, adversarial testing, version control, distributed deployment, and A/B testing.
MIT