ARISE — Adaptive Runtime Improvement through Self-Evolution

Your agent works great on the tasks you planned for. ARISE handles the ones you didn't.

ARISE is a framework-agnostic middleware that gives LLM agents the ability to create their own tools at runtime. When your agent fails at a task, ARISE detects the capability gap, synthesizes a Python tool, validates it in a sandbox, and promotes it to the active library — no human intervention required.

Documentation | Quick Start | PyPI

pip install arise-ai

from arise import ARISE
from arise.rewards import task_success

arise = ARISE(
    agent_fn=my_agent,           # any (task, tools) -> str function
    reward_fn=task_success,
    model="gpt-4o-mini",         # cheap model for tool synthesis
)

result = arise.run("Fetch all users from the paginated API")
# Agent fails → ARISE synthesizes fetch_all_paginated tool → agent succeeds

What It Looks Like

Episode 1  | FAIL  | reward=0.00 | skills=2   Task: "Fetch paginated users with auth"
Episode 2  | FAIL  | reward=0.00 | skills=2
Episode 3  | FAIL  | reward=0.00 | skills=2

[Evolution triggered — 3 failures on API tasks]
  → Synthesizing 'parse_json_response'... 3/3 tests passed ✓
  → Synthesizing 'fetch_all_paginated'... sandbox fail → refine → 1/1 passed ✓

Episode 4  | OK    | reward=1.00 | skills=4   Agent now has the tools it needs

Key Features

Self-evolving tool library — fail → detect gap → synthesize → sandbox test → promote
Framework-agnostic — any (task, tools) -> str function, Strands, LangGraph, CrewAI
Sandboxed validation — subprocess or Docker, adversarial testing, import restrictions
Distributed mode — S3 + SQS for stateless deployments (Lambda, ECS, AgentCore)
Skill registry — share evolved tools across projects
Version control + rollback — SQLite checkpoints, arise rollback <version>
A/B testing — refined skills tested against originals before promotion
Web Console — create agents, watch evolution live, inspect evolved code (arise console)
Dashboard — terminal TUI and web UI for monitoring

Benchmark Results

Model	Condition	AcmeCorp (SRE)	DataCorp (Data Eng)
Claude Sonnet	ARISE	78%	—
Claude Sonnet	No tools	63%	—
GPT-4o-mini	ARISE	57%	92%
GPT-4o-mini	No tools	48%	50%

ARISE improves task success by +9–42 percentage points across models and domains. See the full benchmark results.

ARISE Console

A web UI for creating agents, watching evolution live, and inspecting evolved tools:

arise console
# Opens http://localhost:8080

Create agents — pick model, set system prompt, choose reward function
Live terminal feed — watch episodes and evolution in real-time via WebSocket
Skill inspector — syntax-highlighted code, test suite, performance metrics
Editable config — change reward function, system prompt, failure threshold on the fly
All Skills / Evolution Log — global views across all agents

Documentation

Full documentation at arise-ai.dev:

Installation — install and configure
Quick Start — complete evolution loop walkthrough
How It Works — the 5-step evolution pipeline
Reward Functions — built-in and custom reward functions
Safety & Validation — sandbox, adversarial testing, production recommendations
Distributed Mode — S3 + SQS for stateless deployments
Framework Adapters — Strands, LangGraph, CrewAI, raw OpenAI/Anthropic
CLI Reference — all CLI commands
API Reference — ARISE class, config, types

Examples

Example	Description
`quickstart_evolution.py`	Full evolution loop: agent fails → ARISE evolves tool → agent succeeds
`quickstart.py`	Math agent evolves statistics tools
`api_agent.py`	HTTP agent evolves auth + pagination (mock server)
`devops_agent.py`	DevOps agent evolves log analysis tools
`strands_agent.py`	Strands integration with Bedrock
`demo/agentcore/`	AgentCore deployment with A2A protocol

Install

pip install arise-ai              # core (just pydantic)
pip install arise-ai[aws]         # + boto3 for distributed mode
pip install arise-ai[litellm]     # + litellm for multi-provider LLM
pip install arise-ai[docker]      # + docker sandbox backend
pip install arise-ai[dashboard]   # + rich, fastapi for dashboard
pip install arise-ai[otel]        # + opentelemetry for tracing
pip install arise-ai[all]         # everything

Related Work

ARISE builds on ideas from LATM, VOYAGER, CREATOR, ADAS, and CRAFT. ARISE adds the production layer: framework-agnostic integration, sandboxed validation, adversarial testing, version control, distributed deployment, and A/B testing.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 90 Commits
.github/workflows		.github/workflows
arise		arise
benchmarks		benchmarks
console		console
demo		demo
docs-site		docs-site
docs/superpowers		docs/superpowers
examples		examples
tests		tests
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ARISE — Adaptive Runtime Improvement through Self-Evolution

What It Looks Like

Key Features

Benchmark Results

ARISE Console

Documentation

Examples

Install

Related Work

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

ARISE — Adaptive Runtime Improvement through Self-Evolution

What It Looks Like

Key Features

Benchmark Results

ARISE Console

Documentation

Examples

Install

Related Work

License

About

Topics

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages