Docs · Getting Started · Demo Catalog · Discord · X
Perstack is a containerized harness for agentic apps.
- Harness = Runtime + Config — Instructions, agent topology, and tools are defined in TOML — not wired in code. The runtime executes what you declare in config.
- Dev-to-prod in one container — Same image, same sandbox, same behavior from local to production.
- Full observability — Trace every delegation, token, and reasoning step. Replay any run from checkpoints.
Perstack draws clear boundaries — between your app and the harness, between the harness and each agent — so you can keep building without fighting the mess.
Perstack keeps expert definition, orchestration, and application integration as separate concerns.
create-expert scaffolds experts, the harness handles orchestration, and deployment stays simple because Perstack runs on standard container and serverless infrastructure.
To get started, use the built-in create-expert expert to scaffold your first agent:
# Use `create-expert` to scaffold a micro-agent team named `bash-gaming`
docker run --pull always --rm -it \
--env-file .env \
-v ./bash-gaming:/workspace \
perstack/perstack start create-expert \
--provider <provider> \
--model <model> \
"Form a team named bash-gaming. They build indie CLI games with both AI-facing non-interactive mode and human-facing TUI mode built on Ink + React. Their games must be runnable via npx at any time. Games are polished, well-tested with full playthroughs — TUI mode included."

create-expert is a built-in expert. It defines a team of single-purpose micro-agents — called "experts" in Perstack.
create-expert : Thin coordinator that delegates to the experts
├── @create-expert/plan : Expands the user's request into a comprehensive plan
├── @create-expert/write : Produces perstack.toml from plan
└── @create-expert/verify : Runs the expert with a test query and checks the completion
The full definition is available at definitions/create-expert/perstack.toml.
While create-expert is running, the TUI shows real-time status — active delegation tree, token usage, reasoning streams, and per-agent progress:
2026/03/13 08:15:40.083, @bash-gaming/build, ⎇ bash-gaming
● Reasoning
....
───────────────────────────────────────────────────────────────────────────────
Query: Form a team named bash-gaming. They build indie CLI games with both…
1 running · 3 waiting · 4m 06s · fireworks
Tokens: In 2.9M (Cached 2.1M, Cache Hit 72.69%) · Out 46.2k
⏸ create-expert · accounts/fireworks/models/kimi-k2p5 · ○ 2.4% · Waiting for delegates
└ ⏸ @create-expert/verify · accounts/fireworks/models/kimi-k2p5 · ◔ 11.2% · Waiting for delegates
└ ⏸ bash-gaming · accounts/fireworks/models/kimi-k2p5 · ◔ 6.4% · Waiting for delegates
└ ⠇ @bash-gaming/build · accounts/fireworks/models/kimi-k2p5 · ◔ 3.2% · Streaming Reasoning...
To run your experts on an actual task, use the perstack start command:
# Let `bash-gaming` build a Wizardry-like dungeon crawler
docker run --pull always --rm -it \
--env-file .env \
-v ./<result-dir>:/workspace \
-v ./bash-gaming/perstack.toml:/definitions/perstack.toml:ro \
perstack/perstack start bash-gaming \
--config /definitions/perstack.toml \
--provider <provider> \
--model <model> \
"Create a Wizardry-like dungeon crawler in a fixed 10-floor labyrinth with complex layouts, traps, fixed room encounters, and random battles. Include special-effect gear drops, leveling, and a skill tree for one playable character. Balance difficulty around build optimization. Death in the dungeon causes loss of one random equipped item."

Here is an example game built with these commands: demo-catalog. The same experts and queries were used across 5 runs on 4 providers; 4 of the 5 runs produced a working dungeon crawler. Full run logs are included in the repository.
perstack log provides a TUI for browsing past runs and their delegation trees. Every delegation — who called whom, what succeeded, what failed — is visible at a glance:
$ npx perstack log --job <jobId>
Runs (create-expert) Enter:Select b:Back q:Quit
> ⎇ create-expert Form a team named bash-gaming. They build indie CLI games with both AI-faci…
| \
| ✓ @create-expert/plan Create a team named bash-gaming. They build indie CLI games with bo…
| /
⎇ create-expert (resumed)
| \
| ✓ @create-expert/write Create perstack.toml at /workspace/plan.md. This is a new team cre…
| /
⎇ create-expert (resumed)
| \
| ⎇ @create-expert/verify Verify perstack.toml at /workspace/perstack.toml against plan at …
| | \
| | ⎇ bash-gaming Create a CLI word guessing game called 'cryptoword' published as @bash-ga…
| | | \
| | | ✓ @bash-gaming/plan Create a CLI word guessing game 'cryptoword' published as @bash-g…
| | | /
| | ⎇ bash-gaming (resumed)
| | | \
| | | ✓ @bash-gaming/build Implement the complete cryptoword game package at /home/perstack…
| | | /
| | ⎇ bash-gaming (resumed)
| | | \
| | | ✓ @bash-gaming/verify Verify the cryptoword package at /home/perstack/cryptoword/: 1…
| | | /
| | ✓ bash-gaming (resumed)
| | /
| ✓ @create-expert/verify (resumed)
| /
✓ create-expert (resumed)

Perstack separates the agent harness from the application layer. Your app stays a normal web or terminal app, with no LLM dependencies in the client.
┌─────────────────┐ ┌──────────────────┐
│ Your app │ events │ perstack run │
│ (React, TUI…) │ ◄─────────── │ (@perstack/ │
│ │ SSE / WS / │ runtime) │
│ @perstack/ │ any stream │ │
│ react │ │ │
└─────────────────┘ └──────────────────┘
Frontend Server
Swap models, change agent topology, or scale the harness — without touching application code. @perstack/react provides hooks (useJobStream, useRun) that turn the event stream into React state. See the documentation for details.
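On the frontend side, hooks like useJobStream come down to folding runtime events into view state. Below is a minimal, self-contained sketch of that idea. The event shapes here are simplified assumptions for illustration, not the actual @perstack/runtime event schema:

```typescript
// Sketch: fold a stream of run events into UI state.
// NOTE: RunEvent is a simplified, assumed shape -- the real event
// catalog (21 run events, 6 streaming events, 5 runtime events)
// is documented in the Events Reference.
type RunEvent =
  | { kind: "tokenUsage"; input: number; output: number }
  | { kind: "delegationStarted"; expert: string }
  | { kind: "delegationFinished"; expert: string };

interface UiState {
  tokensIn: number;
  tokensOut: number;
  activeDelegations: string[];
}

function reduceEvent(state: UiState, event: RunEvent): UiState {
  switch (event.kind) {
    case "tokenUsage":
      // Accumulate token counts across steps and delegations.
      return {
        ...state,
        tokensIn: state.tokensIn + event.input,
        tokensOut: state.tokensOut + event.output,
      };
    case "delegationStarted":
      // Track which experts are currently active in the tree.
      return {
        ...state,
        activeDelegations: [...state.activeDelegations, event.expert],
      };
    case "delegationFinished":
      return {
        ...state,
        activeDelegations: state.activeDelegations.filter(
          (e) => e !== event.expert,
        ),
      };
  }
}
```

A hook can run this reducer over an SSE or WebSocket stream and hand the resulting state to React; the harness side never needs to know how the client renders it.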
FROM perstack/perstack:latest
# Install extra dependencies and configure a non-root user here if needed:
# RUN apt-get update && apt-get install -y --no-install-recommends git && rm -rf /var/lib/apt/lists/*
# RUN useradd -m agent
# USER agent
COPY perstack.toml .
RUN perstack install
ENTRYPOINT ["perstack", "run", "my-expert"]

The image is Ubuntu-based, multi-arch (linux/amd64, linux/arm64), and is ~74 MB. perstack install pre-resolves MCP servers and prewarms tool definitions for faster, reproducible startups. The runtime can also be imported directly as a TypeScript library (@perstack/runtime) for serverless environments. See the deployment guide for details.
- Docker
- An LLM provider API key (see Providers and Models)
There are two ways to provide API keys:
1. Pass host environment variables with -e
Export the key on the host and forward it to the container:
export FIREWORKS_API_KEY=fw_...
docker run --rm -it \
-e FIREWORKS_API_KEY \
-v ./workspace:/workspace \
perstack/perstack start my-expert "query" --provider fireworks

2. Store keys in a .env file in the workspace
Create a .env file in the workspace directory. Perstack loads .env and .env.local by default:
# ./workspace/.env
FIREWORKS_API_KEY=fw_...

docker run --rm -it \
-v ./workspace:/workspace \
perstack/perstack start my-expert "query"

You can also specify custom .env file paths with --env-path:
perstack start my-expert "query" --env-path .env.production

Three principles guide how Perstack approaches agentic app development:
- Quality is a system property, not a model property: Building agentic apps people actually use doesn't require an AI science degree—just a solid understanding of the problems you're solving.
- Keep your app simple and reliable: The harness is inevitably complex—Perstack absorbs that complexity so your agentic app doesn't have to.
- Do big things with small models: If a smaller model can do the job, there's no reason to use a bigger one.
Perstack introduces micro-agents — a multi-agent orchestration design built around purpose-specific agents, each with a single responsibility.
- Simple: A monolithic agent assembles its system prompt from hundreds of fragments. A multi-agent framework stacks abstraction layers and wires orchestration in code. A Perstack expert is one TOML section — instruction, delegates, done.
- Reliable: A plan agent that only plans, a build agent that only builds, a verify agent that only verifies — the pipeline structure itself prevents shortcuts and catches errors that a single generalist would miss.
- Reusable: Delegates are dependency management for agents — like npm packages or crates. Separate concerns through delegate chains, and compose purpose-built experts across different projects.
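To picture the "one TOML section" claim, here is a hypothetical sketch of a small team definition. The expert names and field names below are illustrative assumptions modeled on the feature matrix (instruction, description, delegates); consult the perstack.toml reference for the authoritative schema.

```toml
# Illustrative sketch only; not the actual perstack.toml schema.
[experts.review-team]
description = "Coordinates code review via purpose-specific delegates"
instruction = "Delegate review work to your experts and summarize the results."
delegates = ["@review-team/plan", "@review-team/check"]

[experts."@review-team/plan"]
description = "Expands a review request into a concrete checklist"
instruction = "Produce a review plan. Plan only; never edit files."

[experts."@review-team/check"]
description = "Executes the review checklist"
instruction = "Run the checklist against the diff and report findings."
```

Each delegate is a single section with a single responsibility; the coordinator composes them, and the runtime wires the delegation at execution time.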
Perstack ships a five-layer stack that gives micro-agents everything they need to run.
┌────────────────────────────────────────────────────────────────────┐
│                             Interface                              │
│              CLI · Event streaming · Programmatic API              │
├────────────────────────────────────────────────────────────────────┤
│                              Runtime                               │
│      Agentic loop · Event-sourcing · Checkpointing · Tool use      │
├────────────────────────────────────────────────────────────────────┤
│                              Context                               │
│  System prompts · Prompt caching · AgenticRAG · Extended thinking  │
├────────────────────────────────────────────────────────────────────┤
│                             Definition                             │
│      Multi-agent topology · MCP skills · Provider abstraction      │
├────────────────────────────────────────────────────────────────────┤
│                           Infrastructure                           │
│     Sandbox isolation · Workspace boundary · Secret management     │
└────────────────────────────────────────────────────────────────────┘
Full feature matrix
| Layer | Feature | Description |
|---|---|---|
| Definition | `perstack.toml` | Declarative project config with global defaults (model, reasoning budget, retries, timeout) |
| | Expert definitions | Instruction, description, delegates, tags, version, and minimum runtime version per expert |
| | Skill types | MCP stdio, MCP SSE, and interactive skills with tool pick/omit filtering and domain restrictions |
| | Provider config | 9 providers (Anthropic, OpenAI, Google, Fireworks, DeepSeek, Ollama, Azure OpenAI, Amazon Bedrock, Google Vertex) with per-provider settings |
| | Model tiers | Provider-aware model selection via `defaultModelTier` (low / middle / high) with fallback cascade |
| | Provider tools | Provider-native capabilities (web search, code execution, image generation, etc.) with per-tool options |
| | Lockfile | `perstack.lock` — resolved snapshot of experts and tool definitions for reproducible deployments |
| Context | Meta-prompts | Role-specific system prompts (coordinator vs. delegate) with environment injection (time, working directory, sandbox) |
| | Context window tracking | Per-model context window lookup with usage ratio monitoring |
| | Message types | Instruction, user, expert, and tool messages with text, image, file, thinking, and tool-call parts |
| | Prompt caching | Provider-specific cache control with cache-hit tracking |
| | Delegation | Parallel child runs with isolated context, parent history preservation, and result aggregation |
| | Extended thinking | Provider-specific reasoning budgets (Anthropic thinking, OpenAI reasoning effort, Google thinking config) |
| | Token usage | Input, output, reasoning, cached, and total token tracking accumulated across steps and delegations |
| | Resume / continue | Resume from any checkpoint, specific job, or delegation stop point |
| Runtime | State machine | 9-state machine (init → generate → call tools → resolve → finish, with delegation and interactive stops) |
| | Event-sourcing | 21 run events, 6 streaming events, and 5 runtime events for full execution observability |
| | Checkpoints | Immutable state snapshots with messages, usage, pending tool calls, and delegation metadata |
| | Skill manager | Dynamic skill lifecycle — connect, discover tools, execute, disconnect — with adapter pattern |
| | Tool execution | Parallel MCP tool calls with priority classification (MCP → delegate → interactive) |
| | Error handling | Configurable retries with provider-specific error normalization and retryability detection |
| | Job hierarchy | Job → run → checkpoint structure with step continuity across delegations |
| | Streaming | Real-time reasoning and result deltas via streaming callbacks |
| Infrastructure | Container isolation | Docker image (Ubuntu, multi-arch, ~74 MB) with `PERSTACK_SANDBOX=1` marker |
| | Workspace boundaries | Path validation with symlink resolution to prevent traversal and escape attacks |
| | Env / secrets | `.env` loading with `--env-path`, `requiredEnv` minimal-privilege filtering, and protected-variable blocklist |
| | Exec protection | Filtered environment for subprocesses blocking `LD_PRELOAD`, `NODE_OPTIONS`, and similar vectors |
| | Install & lockfile | `perstack install` pre-resolves tool definitions for faster, reproducible startup |
| Interface | `perstack` CLI | `start` (interactive TUI), `run` (JSON events), `log` (history query), `install`, and expert management commands |
| | TUI | React/Ink terminal UI with real-time activity log, token metrics, delegation tree, and job/checkpoint browser |
| | JSON event stream | Machine-readable event output via `perstack run` with `--filter` for programmatic integration |
| | `@perstack/runtime` | TypeScript library for serverless and custom apps — `run()` with event listener, checkpoint storage callbacks |
| | `@perstack/react` | React hooks (`useRun`, `useJobStream`) and event-to-activity processing utilities |
| | Studio | Expert lifecycle management — create, push, version, publish, yank — via Perstack API |
| | Log system | Query execution history by job, run, step, or event type with terminal and JSON formatters |
| Topic | Link |
|---|---|
| Getting started | Getting Started |
| Architecture and core concepts | Understanding Perstack |
| Skills | Skills |
| Base skill (built-in tools) | Base Skill |
| Adding tools via MCP | Extending with Tools |
| Deployment | Deployment |
| Providers and models | Providers and Models |
| CLI reference | CLI Reference |
| `perstack.toml` reference | perstack.toml Reference |
| Events reference | Events Reference |
| API reference | API Reference |
demo-catalog runs the same experts and queries across multiple providers and models. Every run includes raw checkpoints and event logs — fully traceable, replayable, and ready for your own analysis. New demos and provider results are added continuously.
See CONTRIBUTING.md.
Apache License 2.0 — see LICENSE for details.