diff --git a/README.md b/README.md index 4e9b8d1..1e07094 100644 --- a/README.md +++ b/README.md @@ -1,499 +1,284 @@ -# KODE SDK · Event-Driven Agent Runtime +# KODE SDK -> **就像和资深同事协作**:发消息、批示、打断、分叉、续上 —— 一套最小而够用的 API,驱动长期在线的多 Agent 系统。 - -## Why KODE - -- **Event-First**:UI 只订阅 Progress(文本/工具流);审批与治理走 Control & Monitor 回调,默认不推噪音事件。 -- **长时运行 + 可分叉**:七段断点恢复(READY → POST_TOOL),Safe-Fork-Point 天然存在于工具结果与纯文本处,一键 fork 继续。 -- **同事式协作心智**:Todo 管理、提醒、Tool 手册自动注入,工具并发可限流,默认配置即安全可用。 -- **高性能且可审计**:统一 WAL、零拷贝文本流、工具拒绝必落审计、Monitor 事件覆盖 token 用量、错误与文件变更。 -- **可扩展生态**:原生接入 MCP 工具、Sandbox 驱动、模型 Provider、Store 后端、Scheduler DSL,支持企业级自定义。 - -## 60 秒上手:跑通第一个“协作收件箱” - -```bash -npm install @shareai-lab/kode-sdk -export ANTHROPIC_API_KEY=sk-... # 或 ANTHROPIC_API_TOKEN -export ANTHROPIC_BASE_URL=https://... # 可选,默认为官方 API -export ANTHROPIC_MODEL_ID=claude-sonnet-4.5-20250929 # 可选 - -npm run example:agent-inbox -``` - -输出中你会看到: - -- Progress 渠道实时流式文本 / 工具生命周期事件 -- Control 渠道的审批请求(示例中默认拒绝 `bash_run`) -- Monitor 渠道的工具审计日志(耗时、审批结果、错误) - -想自定义行为?修改 `examples/01-agent-inbox.ts` 内的模板、工具与事件订阅即可。 - -## 示例游乐场 - -| 示例 | 用例 | 涵盖能力 | -| --- | --- | --- | -| `npm run example:getting-started` | 最小对话循环 | Progress 流订阅、Anthropic 模型直连 | -| `npm run example:agent-inbox` | 事件驱动收件箱 | Todo 管理、工具并发、Monitor 审计 | -| `npm run example:approval` | 工具审批工作流 | Control 回调、Hook 策略、自动拒绝/放行 | -| `npm run example:room` | 多 Agent 协作 | AgentPool、Room 消息、Safe Fork、Lineage | -| `npm run example:scheduler` | 长时运行 & 提醒 | Scheduler 步数触发、系统提醒、FilePool 监控 | -| `npm run example:nextjs` | API + SSE | Resume-or-create、Progress 流推送(无需安装 Next) | - -每个示例都位于 `examples/` 下,对应 README 中的学习路径,展示事件驱动、审批、安全、调度、协作等核心能力的组合方式。 - -## 构建属于你的协作型 Agent - -1. **理解三通道心智**:详见 [`docs/events.md`](./docs/events.md)。 -2. **跟着 Quickstart 实战**:[`docs/quickstart.md`](./docs/quickstart.md) 从 “依赖注入 → Resume → SSE” 手把手搭建服务。 -3. **扩展用例**:[`docs/playbooks.md`](./docs/playbooks.md) 涵盖审批治理、多 Agent 小组、调度提醒等典型场景。 -4. 
**查阅 API**:[`docs/api.md`](./docs/api.md) 枚举 `Agent`、`EventBus`、`ToolRegistry` 等核心类型与事件。 -5. **深挖能力**:Todo、ContextManager、Scheduler、Sandbox、Hook、Tool 定义详见 `docs/` 目录。 - -## 基础设计一图流 +> **Stateful Agent Runtime Kernel** - The engine that powers your AI agents with persistence, recovery, and trajectory exploration. ``` -Client/UI ── subscribe(['progress']) ──┐ -Approval service ── Control 回调 ─────┼▶ EventBus(三通道) -Observability ── Monitor 事件 ────────┘ - - │ - ▼ - MessageQueue → ContextManager → ToolRunner - │ │ │ - ▼ ▼ ▼ - Store (WAL) FilePool PermissionManager + +------------------+ + | Your App | CLI / Desktop / IDE / Server + +--------+---------+ + | + +--------v---------+ + | KODE SDK | Agent Runtime Kernel + | +-----------+ | + | | Agent | | Lifecycle + State + Events + | +-----------+ | + | | Store | | Persistence (Pluggable) + | +-----------+ | + | | Sandbox | | Execution Isolation + | +-----------+ | + +------------------+ ``` -## Provider Adapter Pattern - -KODE SDK uses **Anthropic-style messages as the internal canonical format**. All model providers are implemented as adapters that convert to/from this internal format, enabling seamless switching between providers while maintaining a consistent message structure. 
- -### Internal Message Format +--- -```typescript -interface Message { - role: 'user' | 'assistant' | 'system'; - content: ContentBlock[]; - metadata?: { - content_blocks?: ContentBlock[]; - transport?: 'provider' | 'text' | 'omit'; - }; -} - -type ContentBlock = - | { type: 'text'; text: string } - | { type: 'reasoning'; reasoning: string; meta?: { signature?: string } } - | { type: 'image'; base64?: string; url?: string; mime_type?: string; file_id?: string } - | { type: 'audio'; base64?: string; url?: string; mime_type?: string } - | { type: 'file'; base64?: string; url?: string; filename?: string; mime_type?: string; file_id?: string } - | { type: 'tool_use'; id: string; name: string; input: any; meta?: Record } - | { type: 'tool_result'; tool_use_id: string; content: any; is_error?: boolean }; -``` +## What is KODE SDK? -### Message Flow +KODE SDK is an **Agent Runtime Kernel** - think of it like V8 for JavaScript, but for AI agents. It handles the complex lifecycle management so you can focus on building your agent's capabilities. 
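What "handling the lifecycle" means in practice can be sketched without the SDK. The snippet below is a plain-TypeScript illustration of the write-ahead-log pattern the kernel relies on for crash recovery; it is not the SDK's internal code, and every name in it is invented for the sketch.

```typescript
// Illustrative WAL pattern (not SDK internals): append to the log first,
// update the main record second, clean the log last.
type Snapshot = { step: number; messages: string[] };

class WalStore {
  private wal: Snapshot[] = [];          // append-only log, written first
  private main: Snapshot | null = null;  // main record, updated second

  save(next: Snapshot, crashBeforeMain = false): void {
    this.wal.push(next);                 // 1. append to WAL (fast, durable)
    if (crashBeforeMain) return;         // simulated crash between the writes
    this.main = next;                    // 2. update main record
    this.wal = [];                       // 3. cleanup WAL
  }

  recover(): Snapshot | null {
    // A surviving WAL entry is always newer than the main record.
    const pending = this.wal[this.wal.length - 1];
    return pending ?? this.main;
  }
}

const store = new WalStore();
store.save({ step: 1, messages: ['hello'] });
store.save({ step: 2, messages: ['hello', 'world'] }, /* crash */ true);

console.log(store.recover()?.step); // 2: the state written only to the WAL survives
```

If the process dies between the WAL append and the main-record update, `recover()` still returns the newest state, which is the guarantee the kernel's breakpoint recovery builds on.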
-``` -Internal Message[] (Anthropic-style ContentBlocks) - -> Provider.formatMessages() -> External API format - -> API call - -> Response -> normalizeContent() -> Internal ContentBlock[] -``` +**Core Capabilities:** +- **Crash Recovery**: WAL-protected persistence with 7-stage breakpoint recovery +- **Fork & Resume**: Explore different agent trajectories from any checkpoint +- **Event Streams**: Progress/Control/Monitor channels for real-time UI updates +- **Tool Governance**: Permission system, approval workflows, audit trails -### Supported Providers - -| Provider | API | Thinking Support | Files | Streaming | -|----------|-----|------------------|-------|-----------| -| Anthropic | Messages API | Extended Thinking (`interleaved-thinking-2025-05-14`) | Files API | SSE | -| OpenAI | Chat Completions | `` tags | - | SSE | -| OpenAI | Responses API | `reasoning_effort` (store, previous_response_id) | File uploads | SSE | -| Gemini | Generate Content | `thinkingLevel` (3.x) | GCS URIs | SSE | -| DeepSeek | Chat Completions | `reasoning_content` (auto-strip from history) | - | SSE | -| Qwen | Chat Completions | `thinking_budget` | - | SSE | -| GLM | Chat Completions | `thinking.type: enabled` | - | SSE | -| Minimax | Chat Completions | `reasoning_split` | - | SSE | -| Kimi K2 | Chat Completions | `reasoning` field | - | SSE | -| Groq/Cerebras | Chat Completions | OpenAI-compatible | - | SSE | +**What KODE SDK is NOT:** +- Not a cloud platform (you deploy it) +- Not an HTTP server (you add that layer) +- Not a multi-tenant SaaS framework (you build that on top) --- -## Provider Message Conversion Details +## When to Use KODE SDK -### Anthropic Provider +### Perfect Fit (Use directly) -**API Format**: Anthropic Messages API with extended thinking beta +| Scenario | Why It Works | +|----------|--------------| +| **CLI Agent Tools** | Single process, local filesystem, zero config | +| **Desktop Apps** (Electron/Tauri) | Full system access, long-running process | +| **IDE 
Plugins** (VSCode/JetBrains) | Single user, workspace integration | +| **Local Development** | Fast iteration, instant persistence | -```typescript -// Internal -> Anthropic -{ - role: 'user' | 'assistant', - content: [ - { type: 'text', text: '...' }, - { type: 'thinking', thinking: '...', signature?: '...' }, // Extended thinking - { type: 'image', source: { type: 'base64', media_type: '...', data: '...' } }, - { type: 'document', source: { type: 'file', file_id: '...' } }, // Files API - { type: 'tool_use', id: '...', name: '...', input: {...} }, - { type: 'tool_result', tool_use_id: '...', content: '...' } - ] -} -``` +### Good Fit (With architecture) -**Extended Thinking Configuration**: -```typescript -const provider = new AnthropicProvider(apiKey, model, baseUrl, proxyUrl, { - reasoningTransport: 'provider', // 'provider' | 'text' | 'omit' - thinking: { - enabled: true, - budgetTokens: 10000 // Maps to thinking.budget_tokens - } -}); -``` +| Scenario | What You Need | +|----------|---------------| +| **Self-hosted Server** | Add HTTP layer (Express/Fastify/Hono) | +| **Small-scale Backend** (<1K users) | Implement PostgresStore, add user isolation | +| **Kubernetes Deployment** | Implement distributed Store + locks | -**Beta Headers**: Automatically added based on message content: -- `interleaved-thinking-2025-05-14`: When `reasoningTransport === 'provider'` -- `files-api-2025-04-14`: When messages contain file blocks with `file_id` +### Needs Custom Architecture -**Signature Preservation**: Critical for multi-turn conversations with Claude 4+. The SDK preserves thinking block signatures in `meta.signature`. 
+| Scenario | Recommended Approach | +|----------|---------------------| +| **Large-scale ToC** (10K+ users) | Worker microservice pattern (see [Architecture Guide](./docs/ARCHITECTURE.md)) | +| **Serverless** (Vercel/Cloudflare) | API layer on serverless + Worker pool for agents | +| **Multi-tenant SaaS** | Tenant isolation layer + distributed Store | ---- +### Not Designed For -### OpenAI Provider (Chat Completions API) +| Scenario | Reason | +|----------|--------| +| **Pure browser runtime** | No filesystem, no process execution | +| **Edge functions only** | Agent loops need long-running processes | +| **Stateless microservices** | Agents are inherently stateful | -**API Format**: OpenAI Chat Completions +> **Rule of Thumb**: If your agents need to run for more than a few seconds, execute tools, and remember state - KODE SDK is for you. If you just need stateless LLM calls, use the provider APIs directly. -```typescript -// Internal -> OpenAI Chat -{ - role: 'system' | 'user' | 'assistant' | 'tool', - content: '...' | [{ type: 'text', text: '...' }, { type: 'image_url', image_url: { url: '...' } }], - tool_calls?: [{ id: '...', type: 'function', function: { name: '...', arguments: '...' } }], - reasoning_content?: '...', // DeepSeek/GLM/Qwen/Kimi - reasoning_details?: [{ text: '...' }] // Minimax -} -``` +--- + +## 60-Second Quick Start -**Configuration-Driven Reasoning** (NEW in v2.7): +```bash +npm install @anthropic/kode-sdk -The OpenAI provider now uses a configuration-driven approach instead of hardcoded provider detection: +# Set your API key +export ANTHROPIC_API_KEY=sk-... 
-```typescript -// ReasoningConfig interface -interface ReasoningConfig { - fieldName?: 'reasoning_content' | 'reasoning_details'; // Response field name - requestParams?: Record; // Enable reasoning in request - stripFromHistory?: boolean; // DeepSeek requirement -} +# Run the example +npx ts-node examples/getting-started.ts ``` -**Provider-Specific Configuration Examples**: - ```typescript -// DeepSeek (must strip reasoning from history) -const deepseekProvider = new OpenAIProvider(apiKey, 'deepseek-reasoner', baseUrl, undefined, { - reasoningTransport: 'provider', - reasoning: { - fieldName: 'reasoning_content', - stripFromHistory: true, // Critical: DeepSeek returns 400 if reasoning in history +import { Agent, AnthropicProvider, LocalSandbox } from '@anthropic/kode-sdk'; + +const agent = await Agent.create({ + agentId: 'my-first-agent', + template: { systemPrompt: 'You are a helpful assistant.' }, + deps: { + modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!), + sandbox: new LocalSandbox({ workDir: './workspace' }), }, }); -// GLM (thinking.type parameter) -const glmProvider = new OpenAIProvider(apiKey, 'glm-4.7', baseUrl, undefined, { - reasoningTransport: 'provider', - reasoning: { - fieldName: 'reasoning_content', - requestParams: { thinking: { type: 'enabled', clear_thinking: false } }, - }, -}); - -// Minimax (reasoning_split parameter, reasoning_details field) -const minimaxProvider = new OpenAIProvider(apiKey, 'minimax-moe-01', baseUrl, undefined, { - reasoningTransport: 'provider', - reasoning: { - fieldName: 'reasoning_details', - requestParams: { reasoning_split: true }, - }, +// Subscribe to events +agent.subscribeProgress({ kinds: ['text_chunk'] }, (event) => { + process.stdout.write(event.text); }); -// Qwen (enable_thinking parameter) -const qwenProvider = new OpenAIProvider(apiKey, 'qwen3-max', baseUrl, undefined, { - reasoningTransport: 'provider', - reasoning: { - fieldName: 'reasoning_content', - requestParams: { 
enable_thinking: true, thinking_budget: 10000 }, - stripFromHistory: true, // Similar to DeepSeek - }, -}); - -// Kimi K2 (reasoning parameter) -const kimiProvider = new OpenAIProvider(apiKey, 'kimi-k2-thinking', baseUrl, undefined, { - reasoningTransport: 'provider', - reasoning: { - fieldName: 'reasoning_content', - requestParams: { reasoning: 'enabled' }, - }, -}); +// Chat with the agent +await agent.chat('Hello! What can you help me with?'); ``` -**Tool Message Conversion**: -```typescript -// Internal tool_result -> OpenAI tool message -{ role: 'tool', tool_call_id: '...', content: '...', name: '...' } -``` - -**Image Handling**: -- URL images: `{ type: 'image_url', image_url: { url: 'https://...' } }` -- Base64 images: `{ type: 'image_url', image_url: { url: 'data:image/png;base64,...' } }` - -**Reasoning Transport**: -- `text`: Reasoning wrapped as `...` in text content -- `provider`: Uses provider-specific fields (configured via `reasoning.fieldName`) - --- -### OpenAI Provider (Responses API) +## Core Concepts -**API Format**: OpenAI Responses API (GPT-5.x reasoning models) +### 1. Three-Channel Event System -```typescript -// Internal -> OpenAI Responses -{ - model: 'gpt-5.2', - input: [ - { role: 'user', content: [{ type: 'input_text', text: '...' }] }, - { role: 'assistant', content: [{ type: 'output_text', text: '...' }] } - ], - reasoning?: { effort: 'medium' }, // none | minimal | low | medium | high | xhigh - store?: true, // Enable state persistence - previous_response_id?: 'resp_...' // Multi-turn continuation -} ``` - -**File Handling**: -```typescript -// File with ID -{ type: 'input_file', file_id: '...' } -// File with URL -{ type: 'input_file', file_url: '...' } -// File with base64 -{ type: 'input_file', filename: '...', file_data: 'data:application/pdf;base64,...' 
} ++-------------+ +-------------+ +-------------+ +| Progress | | Control | | Monitor | ++-------------+ +-------------+ +-------------+ +| text_chunk | | permission | | tool_audit | +| tool:start | | _required | | state_change| +| tool:complete| | approval | | token_usage | +| done | | _response | | error | ++-------------+ +-------------+ +-------------+ + | | | + v v v + Your UI Approval Service Observability ``` -**Configuration** (NEW in v2.7): -```typescript -const provider = new OpenAIProvider(apiKey, 'gpt-5.2', baseUrl, undefined, { - api: 'responses', // Use Responses API instead of Chat Completions - responses: { - reasoning: { effort: 'high' }, // Reasoning effort level - store: true, // Enable response storage for continuation - previousResponseId: 'resp_abc123', // Resume from previous response - }, -}); +### 2. Crash Recovery & Breakpoints + ``` +Agent Execution Flow: -**Multi-Turn Conversation Flow**: -```typescript -// First request -const response1 = await provider.complete(messages); -const responseId = response1.metadata?.responseId; // Store for continuation + READY -> PRE_MODEL -> STREAMING -> TOOL_PENDING -> PRE_TOOL -> EXECUTING -> POST_TOOL + | | | | | | | + +-------- WAL Protected State -------+-- Approval --+---- Tool Execution ---+ -// Second request (continues from first) -provider.configure({ - responses: { ...provider.getConfig().responses, previousResponseId: responseId } -}); -const response2 = await provider.complete(newMessages); +On crash: Resume from last safe breakpoint, auto-seal incomplete tool calls ``` ---- +### 3. 
Fork & Trajectory Exploration -### Gemini Provider +```typescript +// Create a checkpoint at current state +const checkpointId = await agent.checkpoint('before-decision'); -**API Format**: Gemini Generate Content API +// Fork to explore different paths +const explorerA = await agent.fork(checkpointId); +const explorerB = await agent.fork(checkpointId); -```typescript -// Internal -> Gemini -{ - contents: [ - { - role: 'user' | 'model', - parts: [ - { text: '...' }, - { inline_data: { mime_type: '...', data: '...' } }, // Base64 images/files - { file_data: { mime_type: '...', file_uri: 'gs://...' } }, // GCS files - { functionCall: { name: '...', args: {...} } }, - { functionResponse: { name: '...', response: { content: '...' } } } - ] - } - ], - systemInstruction?: { parts: [{ text: '...' }] }, - tools?: [{ functionDeclarations: [...] }], - generationConfig?: { - temperature: 0.7, - maxOutputTokens: 4096, - thinkingConfig: { thinkingLevel: 'HIGH' } // Gemini 3.x - } -} +await explorerA.chat('Try approach A'); +await explorerB.chat('Try approach B'); ``` -**Role Mapping**: -- `assistant` -> `model` -- `user` -> `user` -- `system` -> `systemInstruction` +--- -**Thinking Configuration** (Gemini 3.x): -```typescript -const provider = new GeminiProvider(apiKey, model, baseUrl, proxyUrl, { - thinking: { - level: 'high' // minimal | low | medium | high -> MINIMAL | LOW | MEDIUM | HIGH - } -}); -``` +## Examples -**Tool Schema Sanitization**: Gemini requires clean JSON Schema without: -- `additionalProperties` -- `$schema` -- `$defs` -- `definitions` +| Example | Description | Key Features | +|---------|-------------|--------------| +| `npm run example:getting-started` | Minimal chat loop | Progress stream, basic setup | +| `npm run example:agent-inbox` | Event-driven inbox | Todo management, tool concurrency | +| `npm run example:approval` | Approval workflow | Control channel, hooks, policies | +| `npm run example:room` | Multi-agent collaboration | AgentPool, Room, Fork 
| +| `npm run example:scheduler` | Long-running with reminders | Scheduler, step triggers | +| `npm run example:nextjs` | Next.js API integration | Resume-or-create, SSE streaming | --- -### High-Speed Inference Providers (Groq/Cerebras) +## Architecture for Scale -Both Groq and Cerebras provide OpenAI-compatible APIs with extremely fast inference speeds: +For production deployments serving many users, we recommend the **Worker Microservice Pattern**: -**Groq** (LPU Inference Engine): -```typescript -const groqProvider = new OpenAIProvider(apiKey, 'llama-3.3-70b-versatile', undefined, undefined, { - baseUrl: 'https://api.groq.com/openai/v1', // Auto-appends /v1 - // Reasoning models (Qwen 3 32B, QwQ-32B) - reasoning: { - fieldName: 'reasoning_content', - requestParams: { reasoning_format: 'parsed', reasoning_effort: 'default' }, - }, -}); -// Speed: ~276 tokens/second for Llama 3.3 70B ``` - -**Cerebras** (Wafer-Scale Engine): -```typescript -const cerebrasProvider = new OpenAIProvider(apiKey, 'qwen-3-32b', undefined, undefined, { - baseUrl: 'https://api.cerebras.ai/v1', - reasoning: { - fieldName: 'reasoning_content', - requestParams: { reasoning_format: 'separate' }, - }, -}); -// Speed: ~2,600 tokens/second for Qwen3 32B + +------------------+ + | Frontend | Next.js / SvelteKit (Vercel OK) + +--------+---------+ + | + +--------v---------+ + | API Gateway | Auth, routing, queue push + +--------+---------+ + | + +--------v---------+ + | Message Queue | Redis / SQS / NATS + +--------+---------+ + | + +--------------------+--------------------+ + | | | + +--------v-------+ +--------v-------+ +--------v-------+ + | Worker 1 | | Worker 2 | | Worker N | + | (KODE SDK) | | (KODE SDK) | | (KODE SDK) | + | Long-running | | Long-running | | Long-running | + +--------+-------+ +--------+-------+ +--------+-------+ + | | | + +--------------------+--------------------+ + | + +--------v---------+ + | Distributed Store| PostgreSQL / Redis + +------------------+ ``` -**Key 
Features**: -- OpenAI-compatible Chat Completions API -- Tool calling with `strict` mode (Cerebras) -- Streaming support with SSE -- Rate limits: 50 RPM (Cerebras), higher for Groq paid tiers - ---- - -## ReasoningTransport Options +**Key Principles:** +1. **API layer is stateless** - Can run on serverless +2. **Workers are stateful** - Run KODE SDK, need long-running processes +3. **Store is shared** - Single source of truth for agent state +4. **Queue decouples** - Request handling from agent execution -Controls how thinking/reasoning content is handled across providers: - -| Transport | Description | Use Case | -|-----------|-------------|----------| -| `provider` | Native provider format (Anthropic thinking blocks, OpenAI reasoning tokens) | Full thinking visibility, multi-turn conversations | -| `text` | Wrapped in `...` tags as text | Cross-provider compatibility, text-based pipelines | -| `omit` | Excluded from message history | Privacy, token reduction | +See [docs/ARCHITECTURE.md](./docs/ARCHITECTURE.md) for detailed deployment guides. 
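The four principles above can be condensed into a runnable toy. This in-process sketch stands in for the real components (the array for Redis/SQS, the Map for the agent state a worker owns); none of it is production code or SDK API.

```typescript
// Illustrative only: an in-process stand-in for the queue + worker split.
// In production, enqueue() lives in the stateless API layer and the worker
// loop runs in a separate long-lived process holding KODE SDK agents.
type Task = { agentId: string; message: string };

const queue: Task[] = [];                       // stands in for Redis / SQS / NATS
const transcript = new Map<string, string[]>(); // stands in for agent state

function enqueue(task: Task): void {
  queue.push(task);                             // API layer: validate, push, return
}

function workerTick(): void {
  const task = queue.shift();                   // worker: pull next task
  if (!task) return;
  const history = transcript.get(task.agentId) ?? [];
  history.push(task.message);                   // worker owns the stateful part
  transcript.set(task.agentId, history);
}

enqueue({ agentId: 'a1', message: 'hello' });
enqueue({ agentId: 'a1', message: 'world' });
while (queue.length > 0) workerTick();

console.log(transcript.get('a1')); // [ 'hello', 'world' ]
```

The point of the shape: `enqueue` can run anywhere, including serverless, while `workerTick` must run in the long-lived worker process that owns the agents.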
--- -## Prompt Caching +## Documentation -The SDK supports prompt caching across multiple providers for significant cost savings: +| Document | Description | +|----------|-------------| +| [Architecture Guide](./docs/ARCHITECTURE.md) | Mental model, deployment patterns, scaling strategies | +| [Quickstart](./docs/quickstart.md) | Step-by-step first agent | +| [Events System](./docs/events.md) | Three-channel event model | +| [API Reference](./docs/api.md) | Core types and interfaces | +| [Playbooks](./docs/playbooks.md) | Common patterns and recipes | +| [Deployment](./docs/DEPLOYMENT.md) | Production deployment guide | +| [Roadmap](./docs/ROADMAP.md) | Future development plans | -| Provider | Caching Type | Min Tokens | TTL | Savings | -|----------|--------------|------------|-----|---------| -| Anthropic | Explicit (`cache_control`) | 1024-4096 | 5m/1h | 90% | -| OpenAI | Automatic | 1024 | 24h | 75% | -| Gemini | Implicit + Explicit | 256-4096 | Custom | 75% | -| DeepSeek | Automatic prefix | 64 | Hours | 90% | -| Qwen | Explicit (`cache_control`) | 1024 | 5m | 90% | +### Scenario Guides -**Anthropic Cache Example**: -```typescript -const provider = new AnthropicProvider(apiKey, 'claude-sonnet-4.5', baseUrl, undefined, { - beta: { extendedCacheTtl: true }, // Enable 1-hour TTL - cache: { breakpoints: 4, defaultTtl: '1h' }, -}); +| Scenario | Guide | +|----------|-------| +| CLI Tools | [docs/scenarios/cli-tools.md](./docs/scenarios/cli-tools.md) | +| Desktop Apps | [docs/scenarios/desktop-apps.md](./docs/scenarios/desktop-apps.md) | +| IDE Plugins | [docs/scenarios/ide-plugins.md](./docs/scenarios/ide-plugins.md) | +| Web Backend | [docs/scenarios/web-backend.md](./docs/scenarios/web-backend.md) | +| Large-scale ToC | [docs/scenarios/large-scale-toc.md](./docs/scenarios/large-scale-toc.md) | -// Mark content for caching in messages -const messages = [{ - role: 'user', - content: [{ - type: 'text', - text: 'Large document...', - cacheControl: { type: 'ephemeral', 
ttl: '1h' } - }] -}]; -``` +--- -**Usage Tracking**: -```typescript -const response = await provider.complete(messages); -console.log(response.usage); -// { -// input_tokens: 100, -// cache_creation_input_tokens: 50000, // First request -// cache_read_input_tokens: 50000, // Subsequent requests -// output_tokens: 500 -// } -``` +## Supported Providers + +| Provider | Streaming | Tool Calling | Thinking/Reasoning | +|----------|-----------|--------------|-------------------| +| **Anthropic** | SSE | Native | Extended Thinking | +| **OpenAI** | SSE | Function Calling | o1/o3 reasoning | +| **Gemini** | SSE | Function Calling | thinkingLevel | +| **DeepSeek** | SSE | OpenAI-compatible | reasoning_content | +| **Qwen** | SSE | OpenAI-compatible | thinking_budget | +| **Groq/Cerebras** | SSE | OpenAI-compatible | - | --- -## Session Compression & Resume +## Roadmap -The SDK's context management works with all providers: +### v2.8 - Storage Foundation +- PostgresStore with connection pooling +- Distributed locking (Advisory Lock) +- Graceful shutdown support -1. **Message Windowing**: Automatically manages context window limits per provider -2. **Safe Fork Points**: Natural breakpoints at tool results and pure text responses -3. **Reasoning Preservation**: Thinking blocks can be: - - Preserved with signatures (Anthropic) - - Converted to text tags (cross-provider) - - Omitted for compression +### v3.0 - Performance +- Incremental message storage (append-only) +- Copy-on-Write fork optimization +- Event sampling and aggregation -```typescript -// Resume with thinking context -const agent = await Agent.resume(sessionId, { - provider: new AnthropicProvider(apiKey, model, baseUrl, undefined, { - reasoningTransport: 'provider' // Preserve thinking for continuation - }) -}); -``` +### v3.5 - Distributed +- Agent Scheduler with LRU caching +- Distributed EventBus (Redis Pub/Sub) +- Worker mode helpers ---- +See [docs/ROADMAP.md](./docs/ROADMAP.md) for the complete roadmap. 
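Ahead of the v3.5 scheduler, the LRU idea is easy to prototype. In this sketch, `restore` and `hibernate` are placeholders for resuming an agent from the Store and writing it back on eviction; they are illustrative, not current SDK API.

```typescript
// Illustrative LRU scheduler: keep hot agents in memory, evict cold ones.
// restore()/hibernate() stand in for Store round-trips; not real SDK API.
class LruScheduler<T> {
  private active = new Map<string, T>(); // Map preserves insertion order
  constructor(
    private capacity: number,
    private restore: (id: string) => T,
    private hibernate: (id: string, value: T) => void,
  ) {}

  get(id: string): T {
    let value = this.active.get(id);
    if (value === undefined) {
      value = this.restore(id);          // cold agent: resume from storage
    } else {
      this.active.delete(id);            // re-insert to mark most-recent
    }
    this.active.set(id, value);
    if (this.active.size > this.capacity) {
      const [oldestId, oldest] = this.active.entries().next().value!;
      this.active.delete(oldestId);
      this.hibernate(oldestId, oldest);  // evict least-recently-used
    }
    return value;
  }
}

const hibernated: string[] = [];
const sched = new LruScheduler<string>(
  2,
  (id) => `agent:${id}`,
  (id) => { hibernated.push(id); },
);

sched.get('a'); sched.get('b'); sched.get('a'); // 'a' is now most recent
sched.get('c');                                 // evicts 'b', not 'a'
console.log(hibernated); // [ 'b' ]
```

With `capacity` sized to available memory, cold agents cost one `restore` round-trip and hot agents stay pinned; touching `'a'` before inserting `'c'` is what protects it from eviction.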
-## Testing Providers +--- -Run integration tests with real API connections: +## Contributing -```bash -# Configure credentials -cp .env.test.example .env.test -# Edit .env.test with your API keys +We welcome contributions! Please see [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines. -# Run all provider tests -npm test -- --testPathPattern="multi-provider" +## License -# Run specific provider -npm test -- --testPathPattern="multi-provider" --testNamePattern="Anthropic" -``` +MIT License - see [LICENSE](./LICENSE) for details. --- -## 下一步 - -- 使用 `examples/` 作为蓝本接入你自己的工具、存储、审批系统。 -- 将 Monitor 事件接入现有 observability 平台,沉淀治理与审计能力。 -- 参考 `docs/` 中的扩展指南,为企业自定义 Sandbox、模型 Provider 或多团队 Agent 协作流程。 - -欢迎在 Issue / PR 中分享反馈与场景诉求,让 KODE SDK 更贴近真实协作团队的需求。 +**KODE SDK** - *The runtime kernel that lets you build agents that persist, recover, and explore.* diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md new file mode 100644 index 0000000..b4fd278 --- /dev/null +++ b/docs/ARCHITECTURE.md @@ -0,0 +1,572 @@ +# KODE SDK Architecture Guide + +> Deep dive into the mental model, design decisions, and deployment patterns for KODE SDK. + +--- + +## Table of Contents + +1. [Mental Model](#mental-model) +2. [Core Architecture](#core-architecture) +3. [Runtime Characteristics](#runtime-characteristics) +4. [Deployment Patterns](#deployment-patterns) +5. [Scaling Strategies](#scaling-strategies) +6. 
[Decision Framework](#decision-framework) + +--- + +## Mental Model + +### What KODE SDK Is + +``` +Think of KODE SDK like: + ++------------------+ +------------------+ +------------------+ +| V8 | | SQLite | | KODE SDK | +| JS Runtime | | Database Engine | | Agent Runtime | ++------------------+ +------------------+ +------------------+ + | | | + v v v ++------------------+ +------------------+ +------------------+ +| Express.js | | Prisma | | Your App | +| Web Framework | | ORM | | (CLI/Desktop/Web)| ++------------------+ +------------------+ +------------------+ + | | | + v v v ++------------------+ +------------------+ +------------------+ +| Vercel | | PlanetScale | | Your Infra | +| Cloud Platform | | Cloud Database | | (K8s/EC2/Local) | ++------------------+ +------------------+ +------------------+ +``` + +**KODE SDK is an engine, not a platform.** + +It provides: +- Agent lifecycle management (create, run, pause, resume, fork) +- State persistence (via pluggable Store interface) +- Tool execution and permission governance +- Event streams for observability + +It does NOT provide: +- HTTP routing or API framework +- User authentication or authorization +- Multi-tenancy or resource isolation +- Horizontal scaling or load balancing + +### The Single Responsibility + +``` + KODE SDK's Job + | + v + +----------------------------------------------+ + | | + | "Keep this agent running, recover from | + | crashes, let it fork, and tell me | + | what's happening via events." | + | | + +----------------------------------------------+ + | + v + Your App's Job + | + v + +----------------------------------------------+ + | | + | "Handle users, route requests, manage | + | permissions, scale infrastructure, | + | and integrate with my systems." 
| + | | + +----------------------------------------------+ +``` + +--- + +## Core Architecture + +### Component Overview + +``` ++------------------------------------------------------------------+ +| Agent Instance | ++------------------------------------------------------------------+ +| | +| +------------------+ +------------------+ +------------------+ | +| | MessageQueue | | ContextManager | | ToolRunner | | +| | (User inputs) | | (Token mgmt) | | (Parallel exec) | | +| +--------+---------+ +--------+---------+ +--------+---------+ | +| | | | | +| +---------------------+---------------------+ | +| | | +| +------------v------------+ | +| | BreakpointManager | | +| | (7-stage state track) | | +| +------------+------------+ | +| | | +| +------------------+ +--------v---------+ +------------------+ | +| | PermissionManager| | EventBus | | TodoManager | | +| | (Approval flow) | | (3-channel emit) | | (Task tracking) | | +| +------------------+ +------------------+ +------------------+ | +| | ++----------------------------------+--------------------------------+ + | + +--------------+--------------+ + | | | + +--------v------+ +----v----+ +-------v-------+ + | Store | | Sandbox | | ModelProvider | + | (Persistence) | | (Exec) | | (LLM calls) | + +---------------+ +---------+ +---------------+ +``` + +### Data Flow + +``` +User Message + | + v ++----+----+ +-----------+ +------------+ +| Message |---->| Context |---->| Model | +| Queue | | Manager | | Provider | ++---------+ +-----------+ +-----+------+ + | + +---------+---------+ + | | + Text Response Tool Calls + | | + v v + +---------+------+ +------+-------+ + | EventBus | | ToolRunner | + | (text_chunk) | | (parallel) | + +----------------+ +------+-------+ + | + +------------------+------------------+ + | | | + Permission Execution Result + Check (Sandbox) Handling + | | | + v v v + +--------------------+ +---------+ +------------------+ + | PermissionManager | | Sandbox | | EventBus | + | (Control channel) | | 
(exec) | | (tool:complete) | + +--------------------+ +---------+ +------------------+ +``` + +### State Persistence (WAL) + +``` +Every State Change + | + v ++-------+-------+ +| Write-Ahead | +| Log | <-- Write first (fast, append-only) ++-------+-------+ + | + v ++-------+-------+ +| Main File | <-- Then update (can be slow) ++-------+-------+ + | + v ++-------+-------+ +| Delete WAL | <-- Finally cleanup ++-------+-------+ + +On Crash Recovery: +1. Scan for WAL files +2. If WAL exists but main file incomplete -> Restore from WAL +3. Delete WAL after successful restore +``` + +--- + +## Runtime Characteristics + +### Memory Model + +``` +Agent Memory Footprint (Typical): + ++---------------------------+ +| Agent Instance | ++---------------------------+ +| messages[]: 10KB - 2MB | <-- Grows with conversation +| toolRecords: 1KB - 100KB | <-- Grows with tool usage +| eventTimeline: 5KB - 500KB| <-- Recent events cached +| mediaCache: 0 - 10MB | <-- If images/files involved +| baseObjects: ~50KB | <-- Fixed overhead ++---------------------------+ + +Typical range: 100KB - 5MB per agent +AgentPool (50 agents): 5MB - 250MB +``` + +### I/O Patterns + +``` +Per Agent Step: + ++-------------------+ +-------------------+ +-------------------+ +| persistMessages() | | persistToolRecs() | | emitEvents() | +| ~20-50ms (SSD) | | ~5-10ms | | ~1-5ms (buffered) | ++-------------------+ +-------------------+ +-------------------+ + +Total per step: 30-70ms I/O overhead + +At Scale (100 concurrent agents): +- Sequential bottleneck in JSONStore +- Need distributed Store for parallel writes +``` + +### Event Loop Impact + +``` +Agent Processing: + + +---------+ + | IDLE | <-- Agent waiting for input + +----+----+ + | + +----v----+ + | PROCESS | <-- Model call (async, non-blocking) + +----+----+ + | + +----v----+ + | TOOL | <-- Tool execution (may block if sync) + +----+----+ + | + +----v----+ + | PERSIST | <-- File I/O (async) + +----+----+ + | + v + +---------+ + | IDLE | + 
+---------+ + +Key: All heavy operations are async +Risk: Sync operations in custom tools can block event loop +``` + +--- + +## Deployment Patterns + +### Pattern 1: Single Process (CLI/Desktop) + +``` ++------------------------------------------------------------------+ +| Your Application | ++------------------------------------------------------------------+ +| | +| +------------------+ | +| | KODE SDK | | +| | +----------+ | | +| | | Agent(s) | | | +| | +----------+ | | +| | | JSONStore| | --> Local filesystem | +| | +----------+ | | +| +------------------+ | +| | ++------------------------------------------------------------------+ + +Best for: CLI tools, Electron apps, VSCode extensions +Agents: 1-50 concurrent +Users: Single user +Persistence: Local files +``` + +### Pattern 2: Single Server (Self-hosted) + +``` ++------------------------------------------------------------------+ +| Server | ++------------------------------------------------------------------+ +| | +| +------------------+ +------------------+ | +| | HTTP Layer | | KODE SDK | | +| | (Express/etc) |---->| AgentPool | | +| +------------------+ +------------------+ | +| | | +| +------v------+ | +| | JSONStore | --> Local filesystem | +| +-------------+ | +| | ++------------------------------------------------------------------+ + +Best for: Internal tools, small teams, prototypes +Agents: 10-100 concurrent +Users: <100 concurrent +Persistence: Local files (can use Redis/Postgres) +``` + +### Pattern 3: Worker Microservice (Scalable) + +``` ++------------------------------------------------------------------+ +| Load Balancer | ++----------------------------------+--------------------------------+ + | + +-------------------------+-------------------------+ + | | | ++--------v--------+ +----------v--------+ +----------v------+ +| API Server 1 | | API Server 2 | | API Server N | +| (Stateless) | | (Stateless) | | (Stateless) | ++--------+--------+ +----------+--------+ +----------+------+ + | | | 
+ +-------------------------+-------------------------+ + | + +--------v--------+ + | Message Queue | + | (Redis/SQS) | + +--------+--------+ + | + +-------------------------+-------------------------+ + | | | ++--------v--------+ +----------v--------+ +----------v------+ +| Worker 1 | | Worker 2 | | Worker N | +| +----------+ | | +----------+ | | +----------+ | +| | KODE SDK | | | | KODE SDK | | | | KODE SDK | | +| | AgentPool| | | | AgentPool| | | | AgentPool| | +| +----------+ | | +----------+ | | +----------+ | ++--------+--------+ +----------+--------+ +----------+------+ + | | | + +-------------------------+-------------------------+ + | + +--------v--------+ + | Distributed | + | Store | + | (PostgreSQL) | + +-----------------+ + +Best for: Production ToC apps, SaaS platforms +Agents: 1000+ concurrent +Users: 10K+ concurrent +Persistence: PostgreSQL/Redis with distributed locks +``` + +### Pattern 4: Hybrid Serverless (API + Workers) + +``` ++------------------------------------------------------------------+ +| Serverless Platform (Vercel) | ++------------------------------------------------------------------+ +| | +| +------------------+ | +| | /api/chat | --> Validate, enqueue, return task ID | +| +------------------+ | +| | /api/status | --> Check task status from DB | +| +------------------+ | +| | /api/stream | --> SSE from Redis Pub/Sub | +| +------------------+ | +| | ++----------------------------------+--------------------------------+ + | + +--------v--------+ + | Message Queue | + | (Upstash Redis)| + +--------+--------+ + | ++----------------------------------v--------------------------------+ +| Worker Platform (Railway/Render) | ++------------------------------------------------------------------+ +| | +| +------------------+ | +| | Worker Pool | | +| | +----------+ | | +| | | KODE SDK | | | +| | | Agents | | | +| | +----------+ | | +| +------------------+ | +| | ++------------------------------------------------------------------+ + +Best 
for: Serverless frontend + stateful backend
+API: Serverless (fast, scalable, cheap)
+Agents: Long-running workers (Railway, Render, Fly.io)
+```
+
+---
+
+## Scaling Strategies
+
+### Strategy 1: Vertical Scaling (Single Node)
+
+```
+Applicable: Up to ~100 concurrent agents
+
+Optimizations:
+1. Increase AgentPool maxAgents
+2. Use Redis for Store (faster than files)
+3. Add memory (agents are memory-bound)
+4. Use SSD for persistence
+
+const pool = new AgentPool({
+  maxAgents: 100,  // Increase from default 50
+  store: new RedisStore({ ... }),
+});
+```
+
+### Strategy 2: Agent Sharding (Multi-Node)
+
+```
+Applicable: 100-1000 concurrent agents
+
+Architecture:
+- Hash agentId to determine which worker handles it
+- Consistent hashing for minimal reshuffling
+- Each worker owns a shard of agents
+
+  agentId: "user-123-agent-456"
+       |
+       v
+  hash(agentId) % N = worker_index
+       |
+       +---------------+---------------+
+       |               |               |
+   Worker 0        Worker 1        Worker 2
+  (agents 0-33)   (agents 34-66)  (agents 67-99)
+```
+
+### Strategy 3: Agent Scheduling (LRU)
+
+```
+Applicable: 1000+ total agents, limited active
+
+Concept:
+- Not all agents are active simultaneously
+- Keep hot agents in memory
+- Hibernate cold agents to storage
+- Resume on demand
+
+class AgentScheduler {
+  private active: LRUCache<string, Agent>;  // In memory
+  private hibernated: Set<string>;          // In storage
+
+  async get(agentId: string): Promise<Agent> {
+    // Check active cache
+    if (this.active.has(agentId)) {
+      return this.active.get(agentId);
+    }
+
+    // Resume from storage
+    const agent = await Agent.resume(agentId, config, deps);
+    this.active.set(agentId, agent);
+
+    // LRU eviction handles hibernation
+    return agent;
+  }
+}
+```
+
+### Strategy 4: Fork Optimization (COW)
+
+```
+Applicable: Heavy fork usage (exploration scenarios)
+
+Current: O(n) deep copy of messages
+Optimized: O(1) copy-on-write
+
+Before:
+  fork() {
+    const forked = JSON.parse(JSON.stringify(messages));  // O(n)
+  }
+
+After:
+  fork() {
+    const forkedHead = 
currentHead; // O(1) pointer copy + // Messages are immutable, share until modified + } +``` + +--- + +## Decision Framework + +### When to Use KODE SDK Directly + +``` ++------------------+ +| Decision Tree | ++------------------+ + | + v ++------------------+ +| Single user/ |----YES---> Use directly (Pattern 1) +| local machine? | ++--------+---------+ + | NO + v ++------------------+ +| < 100 concurrent |----YES---> Single server (Pattern 2) +| users? | ++--------+---------+ + | NO + v ++------------------+ +| Can run long- |----YES---> Worker microservice (Pattern 3) +| running processes?| ++--------+---------+ + | NO + v ++------------------+ +| Serverless only? |----YES---> Hybrid pattern (Pattern 4) ++--------+---------+ + | NO + v ++------------------+ +| Consider other | +| solutions | ++------------------+ +``` + +### Platform Compatibility Matrix + +| Platform | Compatible | Notes | +|----------|------------|-------| +| Node.js | 100% | Primary target | +| Bun | 95% | Minor adjustments needed | +| Deno | 80% | Permission flags required | +| Electron | 90% | Use in main process | +| VSCode Extension | 85% | workspace.fs integration | +| Vercel Functions | 20% | API layer only, not agents | +| Cloudflare Workers | 5% | Not compatible | +| Browser | 10% | No fs/process, very limited | + +### Store Selection Guide + +| Store | Use Case | Throughput | Scaling | +|-------|----------|------------|---------| +| JSONStore | Development, CLI | Low | Single node | +| SQLiteStore | Desktop apps | Medium | Single node | +| RedisStore | Small-medium production | High | Single node | +| PostgresStore | Production, multi-node | High | Multi-node | + +--- + +## Summary + +### Core Principles + +1. **KODE SDK is a runtime kernel** - It manages agent lifecycle, not application infrastructure + +2. **Agents are stateful** - They need persistent storage and long-running processes + +3. **Scale through architecture** - Use worker patterns for large-scale deployments + +4. 
**Store is pluggable** - Implement custom Store for your infrastructure + +### Quick Reference + +| Scenario | Pattern | Store | Scale | +|----------|---------|-------|-------| +| CLI tool | Single Process | JSONStore | 1 user | +| Desktop app | Single Process | SQLiteStore | 1 user | +| Internal tool | Single Server | RedisStore | ~100 users | +| SaaS product | Worker Microservice | PostgresStore | 10K+ users | +| Serverless app | Hybrid | External DB | Varies | + +--- + +*Next: See [Deployment Guide](./DEPLOYMENT.md) for implementation details.* diff --git a/docs/DEPLOYMENT.md b/docs/DEPLOYMENT.md new file mode 100644 index 0000000..4a03a1f --- /dev/null +++ b/docs/DEPLOYMENT.md @@ -0,0 +1,638 @@ +# Deployment Scenarios & Architecture Patterns + +This document covers deployment patterns for KODE SDK across different use cases, from CLI tools to production backends. + +--- + +## Scenario Overview + +| Scenario | Complexity | Store | Scalability | Example | +|----------|-----------|-------|-------------|---------| +| CLI Tool | Low | JSONStore | Single user | Claude Code | +| Desktop App | Low | JSONStore | Single user | ChatGPT Desktop | +| IDE Plugin | Low | JSONStore | Single user | Cursor | +| Self-hosted Server | Medium | JSONStore/Custom | ~100 concurrent | Internal tool | +| Production Backend | High | PostgreSQL/Redis | 1000+ concurrent | SaaS product | +| Serverless | High | External DB | Auto-scaling | API service | + +--- + +## Scenario 1: CLI Tool + +**Characteristics:** +- Single user, single process +- Local file system available +- Long-running process +- No external dependencies needed + +**Architecture:** +``` +┌─────────────────────────────┐ +│ Terminal │ +│ ┌───────────────────────┐ │ +│ │ CLI App │ │ +│ │ ┌─────────────────┐ │ │ +│ │ │ KODE SDK │ │ │ +│ │ │ ┌───────────┐ │ │ │ +│ │ │ │ JSONStore │ │ │ │ +│ │ │ └─────┬─────┘ │ │ │ +│ │ └────────┼────────┘ │ │ +│ └───────────┼───────────┘ │ +└──────────────┼──────────────┘ + │ + ┌──────▼──────┐ + 
│ Local Files │ + │ ~/.my-agent │ + └─────────────┘ +``` + +**Implementation:** +```typescript +import { Agent, AgentPool, JSONStore } from '@shareai-lab/kode-sdk'; + +const store = new JSONStore(path.join(os.homedir(), '.my-agent')); +const pool = new AgentPool({ + dependencies: { store, templateRegistry, sandboxFactory, toolRegistry } +}); + +// Resume or create +const agent = await pool.get('main') + ?? await pool.create('main', { templateId: 'cli-assistant' }); + +// Interactive loop +const rl = readline.createInterface({ input: stdin, output: stdout }); +for await (const line of rl) { + await agent.send(line); + for await (const event of agent.subscribeProgress()) { + if (event.type === 'text_chunk') process.stdout.write(event.delta); + } +} +``` + +**Best for:** Developer tools, automation scripts, personal assistants. + +--- + +## Scenario 2: Desktop App (Electron) + +**Characteristics:** +- Single user +- Full file system access +- Can run background processes +- May need multiple agents + +**Architecture:** +``` +┌────────────────────────────────────────────┐ +│ Electron App │ +│ ┌──────────────────────────────────────┐ │ +│ │ Renderer Process │ │ +│ │ ┌──────────────────────────────┐ │ │ +│ │ │ React UI │ │ │ +│ │ └──────────────┬───────────────┘ │ │ +│ └─────────────────┼────────────────────┘ │ +│ │ IPC │ +│ ┌─────────────────▼────────────────────┐ │ +│ │ Main Process │ │ +│ │ ┌──────────────────────────────┐ │ │ +│ │ │ AgentPool │ │ │ +│ │ │ ┌────────┐ ┌────────┐ │ │ │ +│ │ │ │Agent 1 │ │Agent 2 │ ... 
│ │ │ +│ │ │ └────────┘ └────────┘ │ │ │ +│ │ └──────────────────────────────┘ │ │ +│ │ ┌──────────────────────────────┐ │ │ +│ │ │ JSONStore │ │ │ +│ │ └──────────────┬───────────────┘ │ │ +│ └─────────────────┼────────────────────┘ │ +└────────────────────┼────────────────────────┘ + │ + ┌──────▼──────┐ + │ userData │ + │ folder │ + └─────────────┘ +``` + +**Implementation:** +```typescript +// main.ts (Main Process) +import { AgentPool, JSONStore } from '@shareai-lab/kode-sdk'; +import { app, ipcMain } from 'electron'; + +const store = new JSONStore(path.join(app.getPath('userData'), 'agents')); +const pool = new AgentPool({ dependencies: { store, ... } }); + +ipcMain.handle('agent:send', async (event, { agentId, message }) => { + const agent = pool.get(agentId) ?? await pool.create(agentId, config); + await agent.send(message); + return agent.complete(); +}); + +ipcMain.on('agent:subscribe', (event, { agentId }) => { + const agent = pool.get(agentId); + if (!agent) return; + + (async () => { + for await (const ev of agent.subscribeProgress()) { + event.sender.send(`agent:event:${agentId}`, ev); + } + })(); +}); +``` + +**Best for:** Chat applications, productivity tools, AI assistants. + +--- + +## Scenario 3: Self-hosted Server (Single Node) + +**Characteristics:** +- Multiple users +- Persistent server process +- Can use local storage +- Moderate concurrency (<100 users) + +**Architecture:** +``` +┌──────────────────────────────────────────────────┐ +│ Node.js Server │ +│ ┌────────────────────────────────────────────┐ │ +│ │ Express/Hono │ │ +│ │ ┌──────────────────────────────────────┐ │ │ +│ │ │ /api/agents/:id/message (POST) │ │ │ +│ │ │ /api/agents/:id/events (SSE) │ │ │ +│ │ └──────────────────────────────────────┘ │ │ +│ └────────────────────┬───────────────────────┘ │ +│ │ │ +│ ┌────────────────────▼───────────────────────┐ │ +│ │ AgentPool (50) │ │ +│ │ ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐ │ │ +│ │ │ A1 │ │ A2 │ │ A3 │ │... 
│ │A50 │ │ │ +│ │ └────┘ └────┘ └────┘ └────┘ └────┘ │ │ +│ └────────────────────┬───────────────────────┘ │ +│ │ │ +│ ┌────────────────────▼───────────────────────┐ │ +│ │ JSONStore │ │ +│ └────────────────────┬───────────────────────┘ │ +└───────────────────────┼──────────────────────────┘ + │ + ┌──────▼──────┐ + │ /data/ │ + │ agents │ + └─────────────┘ +``` + +**Implementation:** +```typescript +import { Hono } from 'hono'; +import { streamSSE } from 'hono/streaming'; +import { AgentPool, JSONStore } from '@shareai-lab/kode-sdk'; + +const app = new Hono(); +const store = new JSONStore('/data/agents'); +const pool = new AgentPool({ dependencies: { store, ... }, maxAgents: 50 }); + +// Send message +app.post('/api/agents/:id/message', async (c) => { + const { id } = c.req.param(); + const { message } = await c.req.json(); + + let agent = pool.get(id); + if (!agent) { + const exists = await store.exists(id); + agent = exists + ? await pool.resume(id, getConfig()) + : await pool.create(id, getConfig()); + } + + await agent.send(message); + const result = await agent.complete(); + return c.json(result); +}); + +// SSE events +app.get('/api/agents/:id/events', async (c) => { + const { id } = c.req.param(); + const agent = pool.get(id); + if (!agent) return c.json({ error: 'Agent not found' }, 404); + + return streamSSE(c, async (stream) => { + for await (const event of agent.subscribeProgress()) { + await stream.writeSSE({ data: JSON.stringify(event) }); + } + }); +}); + +export default app; +``` + +**Scaling Limit:** ~50-100 concurrent agents per process. Beyond this, consider worker architecture. 
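One way to stay under that per-process ceiling is to evict idle agents before the pool fills up. The sketch below is a self-contained LRU tracker for choosing which agent to hibernate next; it is illustrative only, and the eviction wiring (persisting the agent and removing it from the pool, as in the worker example later in this document) is left to the host application.

```typescript
// Minimal LRU tracker: decides which agent should be hibernated
// once more than `capacity` agents have been active.
class LruTracker<K> {
  private order = new Map<K, number>(); // key -> last-touched tick
  private tick = 0;

  constructor(private capacity: number) {}

  // Record activity for `key`; returns the least-recently-used key
  // if capacity is now exceeded (the caller should hibernate it).
  touch(key: K): K | undefined {
    this.order.delete(key);
    this.order.set(key, ++this.tick);
    if (this.order.size <= this.capacity) return undefined;
    // Map preserves insertion order, so the first key is the oldest.
    const oldest = this.order.keys().next().value as K;
    this.order.delete(oldest);
    return oldest;
  }

  has(key: K): boolean {
    return this.order.has(key);
  }
}

const tracker = new LruTracker<string>(2);
tracker.touch('agent-a');
tracker.touch('agent-b');
tracker.touch('agent-a');                  // refresh agent-a
const evicted = tracker.touch('agent-c');  // over capacity: agent-b is LRU
```

Wiring this into the server above would mean touching the tracker on every request and hibernating whatever key it returns.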
+ +--- + +## Scenario 4: Production Backend (Multi-node) + +**Characteristics:** +- High concurrency (1000+ users) +- Multiple server instances +- Database-backed persistence +- Queue-based processing + +**Architecture:** +``` +┌─────────────────────────────────────────────────────────────────┐ +│ Load Balancer │ +└────────────────────────────┬────────────────────────────────────┘ + │ + ┌───────────────────┼───────────────────┐ + │ │ │ +┌────────▼────────┐ ┌────────▼────────┐ ┌────────▼────────┐ +│ API Server 1 │ │ API Server 2 │ │ API Server N │ +│ (Stateless) │ │ (Stateless) │ │ (Stateless) │ +└────────┬────────┘ └────────┬────────┘ └────────┬────────┘ + │ │ │ + └───────────────────┼───────────────────┘ + │ + ┌────────▼────────┐ + │ Job Queue │ + │ (BullMQ) │ + └────────┬────────┘ + │ + ┌───────────────────┼───────────────────┐ + │ │ │ +┌────────▼────────┐ ┌────────▼────────┐ ┌────────▼────────┐ +│ Worker 1 │ │ Worker 2 │ │ Worker N │ +│ AgentPool(50) │ │ AgentPool(50) │ │ AgentPool(50) │ +└────────┬────────┘ └────────┬────────┘ └────────┬────────┘ + │ │ │ + └───────────────────┼───────────────────┘ + │ + ┌──────────────┼──────────────┐ + │ │ │ + ┌──────▼──────┐ ┌─────▼─────┐ ┌─────▼─────┐ + │ PostgreSQL │ │ Redis │ │ S3 │ + │ (Store) │ │ (Cache) │ │ (Files) │ + └─────────────┘ └───────────┘ └───────────┘ +``` + +**API Server Implementation:** +```typescript +// api/routes/agent.ts +import { Queue } from 'bullmq'; + +const queue = new Queue('agent-tasks', { connection: redis }); + +app.post('/api/agents/:id/message', async (c) => { + const { id } = c.req.param(); + const { message } = await c.req.json(); + + // Add job to queue + const job = await queue.add('process-message', { + agentId: id, + message, + userId: c.get('userId'), + }); + + return c.json({ jobId: job.id, status: 'queued' }); +}); + +app.get('/api/agents/:id/events', async (c) => { + const { id } = c.req.param(); + + // Subscribe to Redis pub/sub + return streamSSE(c, async (stream) => { + const 
sub = redis.duplicate(); + await sub.subscribe(`agent:${id}:events`); + + sub.on('message', (channel, message) => { + stream.writeSSE({ data: message }); + }); + }); +}); +``` + +**Worker Implementation:** +```typescript +// worker/index.ts +import { Worker } from 'bullmq'; +import { AgentPool } from '@shareai-lab/kode-sdk'; +import { PostgresStore } from './postgres-store'; + +const store = new PostgresStore(pgPool); +const pool = new AgentPool({ dependencies: { store, ... }, maxAgents: 50 }); + +const worker = new Worker('agent-tasks', async (job) => { + const { agentId, message } = job.data; + + // Get or resume agent + let agent = pool.get(agentId); + if (!agent) { + const exists = await store.exists(agentId); + agent = exists + ? await pool.resume(agentId, getConfig(job.data)) + : await pool.create(agentId, getConfig(job.data)); + } + + // Process + await agent.send(message); + + // Stream events to Redis + for await (const event of agent.subscribeProgress()) { + await redis.publish(`agent:${agentId}:events`, JSON.stringify(event)); + + if (event.type === 'done') break; + } +}, { connection: redis }); + +// Periodic cleanup: hibernate idle agents +setInterval(async () => { + for (const agentId of pool.list()) { + const agent = pool.get(agentId); + if (agent && agent.idleTime > 60_000) { + await agent.persistInfo(); + pool.delete(agentId); + } + } +}, 30_000); +``` + +**PostgreSQL Store Implementation:** +```sql +-- Schema +CREATE TABLE agents ( + id TEXT PRIMARY KEY, + template_id TEXT NOT NULL, + created_at TIMESTAMPTZ DEFAULT NOW(), + updated_at TIMESTAMPTZ DEFAULT NOW() +); + +CREATE TABLE agent_messages ( + agent_id TEXT PRIMARY KEY REFERENCES agents(id), + messages JSONB NOT NULL, + updated_at TIMESTAMPTZ DEFAULT NOW() +); + +CREATE TABLE agent_tool_records ( + agent_id TEXT PRIMARY KEY REFERENCES agents(id), + records JSONB NOT NULL, + updated_at TIMESTAMPTZ DEFAULT NOW() +); + +CREATE TABLE agent_events ( + id BIGSERIAL PRIMARY KEY, + agent_id TEXT 
REFERENCES agents(id), + channel TEXT NOT NULL, + event JSONB NOT NULL, + created_at TIMESTAMPTZ DEFAULT NOW() +); + +CREATE INDEX idx_agent_events_agent_channel ON agent_events(agent_id, channel, id); +``` + +--- + +## Scenario 5: Serverless (Vercel/Lambda) + +**Characteristics:** +- Request-scoped execution +- Cold starts +- Execution time limits (10s-300s) +- No local persistence + +**Challenges:** +1. **No File System**: JSONStore won't work +2. **Timeout**: Long agent tasks may exceed limits +3. **Cold Start**: Must load state quickly +4. **Stateless**: No in-memory agent pool + +**Architecture:** +``` +┌──────────────────────────────────────────────────────────────┐ +│ Vercel/Lambda │ +│ ┌────────────────────────────────────────────────────────┐ │ +│ │ API Function │ │ +│ │ │ │ +│ │ Request → Load Agent → Execute Step → Persist → Response │ +│ │ │ │ +│ └──────────────────────────┬─────────────────────────────┘ │ +└─────────────────────────────┼────────────────────────────────┘ + │ + ┌──────────────┼──────────────┐ + │ │ │ + ┌──────▼──────┐ ┌─────▼─────┐ ┌─────▼─────┐ + │ Supabase │ │ Upstash │ │ Inngest │ + │ (Postgres) │ │ (Redis) │ │ (Queue) │ + └─────────────┘ └───────────┘ └───────────┘ +``` + +**Implementation:** +```typescript +// app/api/agent/[id]/route.ts +import { Agent, AgentConfig } from '@shareai-lab/kode-sdk'; +import { SupabaseStore } from '@/lib/supabase-store'; + +const store = new SupabaseStore(supabaseClient); + +export async function POST(req: Request, { params }: { params: { id: string } }) { + const { message } = await req.json(); + const agentId = params.id; + + // 1. Load or create agent + const exists = await store.exists(agentId); + const agent = exists + ? await Agent.resume(agentId, config, { store, ... }) + : await Agent.create({ ...config, agentId }, { store, ... }); + + // 2. Send message + await agent.send(message); + + // 3. 
Run with timeout (leave buffer for response) + const timeoutMs = 25_000; // Vercel Pro = 30s + let result; + + try { + result = await Promise.race([ + agent.complete(), + new Promise((_, reject) => + setTimeout(() => reject(new Error('Timeout')), timeoutMs) + ), + ]); + } catch (e) { + if (e.message === 'Timeout') { + // Agent still processing, queue for background + await inngest.send('agent/continue', { agentId }); + return Response.json({ status: 'processing', agentId }); + } + throw e; + } + + return Response.json({ status: 'done', output: result.text }); +} +``` + +**For Long-running Tasks:** +Use a queue service like Inngest: + +```typescript +// inngest/functions/agent-continue.ts +import { inngest } from '@/lib/inngest'; + +export const agentContinue = inngest.createFunction( + { id: 'agent-continue' }, + { event: 'agent/continue' }, + async ({ event, step }) => { + const { agentId } = event.data; + + // Resume agent + const agent = await Agent.resume(agentId, config, { store, ... }); + + // Process until done (Inngest handles timeouts) + const result = await step.run('complete', async () => { + return agent.complete(); + }); + + // Notify user via webhook/push + await step.run('notify', async () => { + await notifyUser(agentId, result); + }); + + return result; + } +); +``` + +--- + +## Custom Store Implementations + +### PostgreSQL Store (Full Example) + +```typescript +import { Store, Message, ToolCallRecord, Timeline, Bookmark, ... 
} from '@shareai-lab/kode-sdk';
+import { Pool } from 'pg';
+
+export class PostgresStore implements Store {
+  constructor(private pool: Pool) {}
+
+  // === Runtime State ===
+
+  async saveMessages(agentId: string, messages: Message[]): Promise<void> {
+    await this.pool.query(`
+      INSERT INTO agent_messages (agent_id, messages, updated_at)
+      VALUES ($1, $2, NOW())
+      ON CONFLICT (agent_id) DO UPDATE SET messages = $2, updated_at = NOW()
+    `, [agentId, JSON.stringify(messages)]);
+  }
+
+  async loadMessages(agentId: string): Promise<Message[]> {
+    const { rows } = await this.pool.query(
+      'SELECT messages FROM agent_messages WHERE agent_id = $1',
+      [agentId]
+    );
+    return rows[0]?.messages || [];
+  }
+
+  async saveToolCallRecords(agentId: string, records: ToolCallRecord[]): Promise<void> {
+    await this.pool.query(`
+      INSERT INTO agent_tool_records (agent_id, records, updated_at)
+      VALUES ($1, $2, NOW())
+      ON CONFLICT (agent_id) DO UPDATE SET records = $2, updated_at = NOW()
+    `, [agentId, JSON.stringify(records)]);
+  }
+
+  async loadToolCallRecords(agentId: string): Promise<ToolCallRecord[]> {
+    const { rows } = await this.pool.query(
+      'SELECT records FROM agent_tool_records WHERE agent_id = $1',
+      [agentId]
+    );
+    return rows[0]?.records || [];
+  }
+
+  // === Events ===
+
+  async appendEvent(agentId: string, timeline: Timeline): Promise<void> {
+    await this.pool.query(`
+      INSERT INTO agent_events (agent_id, channel, event, created_at)
+      VALUES ($1, $2, $3, NOW())
+    `, [agentId, timeline.event.channel, JSON.stringify(timeline)]);
+  }
+
+  async *readEvents(agentId: string, opts?: { since?: Bookmark; channel?: string }): AsyncIterable<Timeline> {
+    const conditions = ['agent_id = $1'];
+    const params: any[] = [agentId];
+    let paramIndex = 2;
+
+    if (opts?.since) {
+      conditions.push(`id > $${paramIndex++}`);
+      params.push(opts.since.seq);
+    }
+    if (opts?.channel) {
+      conditions.push(`channel = $${paramIndex++}`);
+      params.push(opts.channel);
+    }
+
+    const { rows } = await this.pool.query(`
+      SELECT event FROM agent_events
+      WHERE ${conditions.join(' AND ')}
+      ORDER BY id ASC
+    `, params);
+
+    for (const row of rows) {
+      yield row.event;
+    }
+  }
+
+  // ... implement remaining methods (history, snapshots, metadata, lifecycle)
+
+  async exists(agentId: string): Promise<boolean> {
+    const { rows } = await this.pool.query(
+      'SELECT 1 FROM agents WHERE id = $1',
+      [agentId]
+    );
+    return rows.length > 0;
+  }
+
+  async delete(agentId: string): Promise<void> {
+    await this.pool.query('DELETE FROM agents WHERE id = $1', [agentId]);
+  }
+
+  async list(prefix?: string): Promise<string[]> {
+    const { rows } = await this.pool.query(
+      prefix
+        ? 'SELECT id FROM agents WHERE id LIKE $1'
+        : 'SELECT id FROM agents',
+      prefix ? [`${prefix}%`] : []
+    );
+    return rows.map(r => r.id);
+  }
+}
+```
+
+---
+
+## Capacity Planning
+
+| Deployment | Agents/Process | Memory/Agent | Concurrent Users |
+|------------|----------------|--------------|------------------|
+| CLI | 1 | 10-100 MB | 1 |
+| Desktop | 5-10 | 50-200 MB | 1 |
+| Single Server | 50 | 2-10 MB | 50-100 |
+| Worker Cluster (10 nodes) | 500 | 2-10 MB | 500-1000 |
+| Worker Cluster (50 nodes) | 2500 | 2-10 MB | 2500-5000 |
+
+**Memory Estimation per Agent:**
+- Base object: ~50 KB
+- Message history (100 messages): ~500 KB - 5 MB
+- Tool records: ~50-500 KB
+- Event timeline: ~100 KB - 1 MB
+- **Typical total: 1-10 MB**
+
+---
+
+## Summary
+
+1. **CLI/Desktop/IDE**: Use JSONStore, single AgentPool, straightforward
+2. **Single Server**: Add HTTP layer, consider Redis for events
+3. **Multi-node**: Implement custom Store, use queue for job distribution
+4. 
**Serverless**: Use external DB, handle timeouts, consider background queue + +The key insight: **KODE SDK handles the agent lifecycle; you handle the infrastructure.** diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md new file mode 100644 index 0000000..ba00487 --- /dev/null +++ b/docs/ROADMAP.md @@ -0,0 +1,303 @@ +# KODE SDK Roadmap + +> This document outlines the development roadmap for KODE SDK, based on actual current capabilities and planned enhancements. + +--- + +## Current State (v2.7.0) + +### What Works Well + +| Feature | Status | Notes | +|---------|--------|-------| +| Agent State Machine | Stable | 7-stage breakpoint system | +| JSONStore | Stable | WAL-protected file persistence | +| Event System | Stable | 3 channels (Progress/Control/Monitor) | +| Fork/Resume | Stable | Safe fork points, crash recovery | +| Multi-provider | Stable | Anthropic, OpenAI, Gemini, DeepSeek, Qwen, GLM... | +| Tool System | Stable | Built-in + MCP protocol | +| AgentPool | Stable | Up to 50 agents per process | +| Checkpointer | Stable | Memory, File, Redis implementations | +| Context Compression | Stable | Automatic history management | +| Hook System | Stable | Pre/post model and tool hooks | + +### Current Limitations + +| Limitation | Impact | Workaround | +|------------|--------|------------| +| JSONStore only | No database persistence | Implement custom Store | +| Single-process pool | No distributed scaling | Build orchestration layer | +| 5-min processing timeout | Not configurable | Fork SDK if needed | +| No stateless mode | Serverless challenges | Request-scoped pattern | +| No distributed locking | Multi-instance conflicts | External coordination | + +--- + +## Short-term: v2.8 - v2.9 (Q1-Q2 2025) + +### v2.8: Store Improvements + +**Goal**: Make custom Store implementation easier and more robust. 
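One way to make custom implementations more robust is an automated conformance check that exercises the Store contract against reference behavior. The sketch below is illustrative only: the `MessageStore` subset and the `MemoryStore` class are hypothetical stand-ins, not SDK APIs.

```typescript
// Illustrative subset of the Store interface (messages only).
interface MessageStore {
  saveMessages(agentId: string, messages: unknown[]): Promise<void>;
  loadMessages(agentId: string): Promise<unknown[]>;
}

// Reference in-memory implementation used to exercise the checks.
class MemoryStore implements MessageStore {
  private data = new Map<string, unknown[]>();
  async saveMessages(agentId: string, messages: unknown[]) {
    this.data.set(agentId, structuredClone(messages));
  }
  async loadMessages(agentId: string) {
    return this.data.get(agentId) ?? [];
  }
}

// A conformance check a validation utility might run against any Store:
// saved messages must round-trip unchanged, and an unknown agent must
// load as an empty history rather than throwing.
async function validateStore(store: MessageStore): Promise<string[]> {
  const failures: string[] = [];
  const msgs = [{ role: 'user', content: 'hi' }];
  await store.saveMessages('probe', msgs);
  const loaded = await store.loadMessages('probe');
  if (JSON.stringify(loaded) !== JSON.stringify(msgs)) failures.push('round-trip');
  if ((await store.loadMessages('missing')).length !== 0) failures.push('missing-agent');
  return failures;
}
```

Running the same check against a custom PostgreSQL or Redis implementation would catch the most common contract violations before they reach an agent.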
+
+| Feature | Priority | Description |
+|---------|----------|-------------|
+| Store interface documentation | P0 | Comprehensive guide for implementing custom stores |
+| Store validation utilities | P1 | Test helpers to verify Store implementations |
+| Incremental message API | P1 | `appendMessage()` in addition to `saveMessages()` |
+| Store migration utilities | P2 | Tools for migrating data between Store implementations |
+
+**New APIs:**
+```typescript
+// Optional incremental methods (backwards compatible)
+interface Store {
+  // Existing methods...
+
+  // NEW: Incremental append (optional, for performance)
+  appendMessage?(agentId: string, message: Message): Promise<void>;
+
+  // NEW: Paginated loading (optional, for large histories)
+  loadMessagesPaginated?(agentId: string, opts: {
+    offset: number;
+    limit: number;
+  }): Promise<Message[]>;
+}
+```
+
+### v2.9: Configurable Limits
+
+**Goal**: Remove hard-coded limits, improve serverless compatibility.
+
+| Feature | Priority | Description |
+|---------|----------|-------------|
+| Configurable processing timeout | P0 | Currently hard-coded to 5 minutes |
+| Configurable tool buffer size | P1 | Currently hard-coded to 10 MB |
+| Pool size validation | P2 | Better error messages when exceeding limits |
+
+**New APIs:**
+```typescript
+// Agent configuration
+const agent = await Agent.create({
+  agentId: 'my-agent',
+  templateId: 'default',
+  // NEW: Runtime limits
+  limits: {
+    processingTimeout: 30_000,  // 30 seconds for serverless
+    toolBufferSize: 5 * 1024 * 1024,  // 5 MB
+  },
+}, dependencies);
+```
+
+---
+
+## Mid-term: v3.0 (Q3 2025)
+
+### v3.0: Official Store Implementations
+
+**Goal**: Provide production-ready Store implementations for common databases. 
+ +| Store | Priority | Dependencies | +|-------|----------|--------------| +| `@kode-sdk/store-postgres` | P0 | `pg` | +| `@kode-sdk/store-redis` | P0 | `ioredis` | +| `@kode-sdk/store-supabase` | P1 | `@supabase/supabase-js` | +| `@kode-sdk/store-dynamodb` | P2 | `@aws-sdk/client-dynamodb` | + +**Package Structure:** +``` +@kode-sdk/store-postgres +├── src/ +│ ├── index.ts +│ ├── postgres-store.ts +│ └── schema.sql +├── README.md +└── package.json +``` + +**Features:** +- Complete Store interface implementation +- Schema migration utilities +- Connection pooling +- Retry logic for transient failures +- Distributed locking support + +### v3.0: Stateless Execution Mode + +**Goal**: Native support for serverless environments. + +```typescript +// NEW: Request-scoped execution +import { StatelessAgent } from '@shareai-lab/kode-sdk'; + +export async function POST(req: Request) { + const { agentId, message } = await req.json(); + + // Automatically handles: load → execute → persist + const result = await StatelessAgent.run(agentId, message, { + store: postgresStore, + templateId: 'default', + timeout: 25_000, + }); + + return Response.json(result); +} +``` + +**Key Features:** +- Automatic state loading and persisting +- Timeout handling with graceful shutdown +- No in-memory pool required +- Optimized for cold starts + +--- + +## Long-term: v4.0+ (2026) + +### v4.0: Distributed Infrastructure (Optional Package) + +**Goal**: Official distributed coordination package for high-scale deployments. 
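Until that package exists, the core coordination primitive — deterministically mapping an agent to a worker — can be approximated in application code. The sketch below mirrors the sharding idea from the README's scaling strategies; the FNV-1a hash and `workerFor` helper are illustrative choices, not SDK APIs, and a production version would use consistent hashing to limit reshuffling when the worker count changes.

```typescript
// Deterministic agent -> worker assignment via FNV-1a (32-bit) modulo
// the worker count. Every node computes the same mapping, so no
// central coordinator is needed for routing.
function workerFor(agentId: string, workerCount: number): number {
  let hash = 0x811c9dc5; // FNV-1a offset basis
  for (let i = 0; i < agentId.length; i++) {
    hash ^= agentId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime
  }
  return (hash >>> 0) % workerCount;
}

const idx = workerFor('user-123-agent-456', 3);
```

A queue consumer can use this to claim only jobs whose `workerFor(agentId, N)` matches its own index.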
+ +``` +@kode-sdk/distributed +├── scheduler/ # Agent scheduling across workers +├── discovery/ # Agent location discovery +├── locking/ # Distributed locking +└── migration/ # Agent migration between nodes +``` + +**Features:** +```typescript +import { DistributedPool } from '@kode-sdk/distributed'; + +const pool = new DistributedPool({ + store: postgresStore, + redis: redisClient, + workerId: process.env.WORKER_ID, + maxLocalAgents: 50, +}); + +// Agent automatically migrates between workers +const agent = await pool.acquire(agentId); +await agent.send(message); +await agent.complete(); +await pool.release(agentId); +``` + +### v4.0: Observability Integration + +**Goal**: First-class observability support. + +```typescript +import { OpenTelemetryPlugin } from '@kode-sdk/observability'; + +const agent = await Agent.create({ + agentId: 'my-agent', + templateId: 'default', + plugins: [ + new OpenTelemetryPlugin({ + serviceName: 'my-agent-service', + tracing: true, + metrics: true, + }), + ], +}, dependencies); +``` + +**Metrics:** +- `agent.step.duration` - Time per agent step +- `agent.tool.duration` - Time per tool execution +- `agent.model.tokens` - Token usage +- `agent.errors` - Error count by type + +### v4.x: Additional Sandboxes + +| Sandbox | Status | Description | +|---------|--------|-------------| +| `DockerSandbox` | Planned | Run tools in Docker containers | +| `K8sSandbox` | Planned | Run tools in Kubernetes pods | +| `E2BSandbox` | Planned | Integration with E2B.dev | +| `FirecrackerSandbox` | Exploring | MicroVM isolation | + +--- + +## Community Contributions Welcome + +### High-Impact Contributions + +1. **Store Implementations** + - MongoDB Store + - SQLite Store (for embedded use) + - Turso/LibSQL Store + +2. **Sandbox Implementations** + - Docker Sandbox + - WebContainer Sandbox (browser) + +3. 
**Tool Integrations** + - Browser automation (Playwright) + - Database clients + - Cloud service SDKs + +### Contribution Guidelines + +See [CONTRIBUTING.md](./CONTRIBUTING.md) for: +- Code style and testing requirements +- Pull request process +- Interface implementation guidelines + +--- + +## Version Support Policy + +| Version | Status | Support Until | +|---------|--------|---------------| +| v2.7.x | Current | Active development | +| v2.6.x | Maintenance | 6 months after v2.8 | +| v2.5.x | End of Life | No longer supported | + +**Semver Policy:** +- Major (v3, v4): Breaking changes to core APIs +- Minor (v2.8, v2.9): New features, backwards compatible +- Patch (v2.7.1): Bug fixes only + +--- + +## Feedback & Prioritization + +Roadmap priorities are influenced by: + +1. **GitHub Issues**: Feature requests with most reactions +2. **Community Discussions**: Patterns emerging from usage +3. **Production Feedback**: Real-world deployment challenges + +To influence the roadmap: +- Open or upvote GitHub Issues +- Share your use case in Discussions +- Contribute implementations for planned features + +--- + +## Timeline Summary + +``` +2025 Q1-Q2: v2.8-v2.9 +├── Store interface improvements +├── Configurable limits +└── Better serverless support + +2025 Q3: v3.0 +├── Official Store packages (Postgres, Redis, Supabase) +├── Stateless execution mode +└── Improved documentation + +2026: v4.0+ +├── Distributed infrastructure package +├── Observability integration +└── Additional sandbox implementations +``` + +The roadmap focuses on: +1. **Making extension easier** (v2.8-2.9) +2. **Providing official implementations** (v3.0) +3. **Scaling infrastructure** (v4.0+) + +Core philosophy remains: **KODE SDK is a runtime kernel, not a platform.** Official packages extend capabilities without bloating the core. 
diff --git a/docs/scenarios/cli-tools.md b/docs/scenarios/cli-tools.md new file mode 100644 index 0000000..ac1113a --- /dev/null +++ b/docs/scenarios/cli-tools.md @@ -0,0 +1,352 @@ +# Scenario: CLI Agent Tools + +> Build command-line AI assistants like Claude Code, Cursor, or custom developer tools. + +--- + +## Why CLI is Perfect for KODE SDK + +| Feature | Benefit | +|---------|---------| +| Single process | No distributed complexity | +| Local filesystem | JSONStore works perfectly | +| Long-running | Agent loops run naturally | +| Single user | No multi-tenancy needed | + +**Compatibility: 100%** - This is KODE SDK's sweet spot. + +--- + +## Quick Start: Minimal CLI Agent + +```typescript +// cli-agent.ts +import { Agent, AnthropicProvider, LocalSandbox } from '@anthropic/kode-sdk'; +import * as readline from 'readline'; + +async function main() { + // Create agent with local persistence + const agent = await Agent.create({ + agentId: 'cli-assistant', + template: { + systemPrompt: `You are a helpful CLI assistant. +You can execute bash commands and help with file operations.`, + }, + deps: { + modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!), + sandbox: new LocalSandbox({ workDir: process.cwd() }), + }, + }); + + // Stream output to terminal + agent.subscribeProgress({ kinds: ['text_chunk'] }, (event) => { + process.stdout.write(event.text); + }); + + // Show tool execution + agent.subscribeProgress({ kinds: ['tool:start', 'tool:complete'] }, (event) => { + if (event.kind === 'tool:start') { + console.log(`\n[Running: ${event.name}]`); + } + }); + + // Interactive loop + const rl = readline.createInterface({ + input: process.stdin, + output: process.stdout, + }); + + console.log('CLI Agent ready. 
Type your message (Ctrl+C to exit)\n');
+
+  const askQuestion = () => {
+    rl.question('You: ', async (input) => {
+      if (input.trim()) {
+        console.log('\nAssistant: ');
+        await agent.chat(input);
+        console.log('\n');
+      }
+      askQuestion();
+    });
+  };
+
+  askQuestion();
+}
+
+main().catch(console.error);
+```
+
+Run it:
+```bash
+npx ts-node cli-agent.ts
+```
+
+---
+
+## Production CLI: Resume & Persistence
+
+For a production CLI tool, you want:
+1. **Session persistence** - Resume conversations across runs
+2. **Crash recovery** - Don't lose progress
+3. **Multiple sessions** - Switch between contexts
+
+```typescript
+// production-cli.ts
+import { Agent, AgentPool, AnthropicProvider, LocalSandbox, JSONStore } from '@anthropic/kode-sdk';
+import * as path from 'path';
+import * as os from 'os';
+
+// Store data in user's home directory
+const DATA_DIR = path.join(os.homedir(), '.my-cli-agent');
+const store = new JSONStore(DATA_DIR);
+
+async function getOrCreateAgent(sessionId: string): Promise<Agent> {
+  const pool = new AgentPool({ store, maxAgents: 10 });
+
+  // Try to resume existing session
+  try {
+    const agent = await pool.resume(sessionId, {
+      template: { systemPrompt: '...' },
+    }, {
+      modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!),
+      sandbox: new LocalSandbox({ workDir: process.cwd() }),
+    });
+    console.log(`Resumed session: ${sessionId}`);
+    return agent;
+  } catch {
+    // Create new session
+    const agent = await pool.create(sessionId, {
+      template: { systemPrompt: '...' 
},
+    }, {
+      modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!),
+      sandbox: new LocalSandbox({ workDir: process.cwd() }),
+    });
+    console.log(`Created new session: ${sessionId}`);
+    return agent;
+  }
+}
+
+// Usage
+const sessionId = process.argv[2] || 'default';
+const agent = await getOrCreateAgent(sessionId);
+```
+
+---
+
+## Tool Approval Flow
+
+For dangerous operations, implement approval:
+
+```typescript
+import * as readline from 'readline';
+import { PermissionMode } from '@anthropic/kode-sdk';
+
+const agent = await Agent.create({
+  agentId: 'safe-cli',
+  config: {
+    permission: {
+      mode: 'approval', // Require approval for all tools
+      // Or custom mode:
+      // mode: 'custom',
+      // customMode: async (call, ctx) => {
+      //   if (call.name === 'bash_run') {
+      //     return { decision: 'ask' }; // Prompt user
+      //   }
+      //   return { decision: 'allow' };
+      // }
+    },
+  },
+  // ...
+});
+
+// Handle approval requests
+agent.subscribeControl((event) => {
+  if (event.kind === 'permission_required') {
+    console.log(`\nTool requires approval: ${event.toolName}`);
+    console.log(`Input: ${JSON.stringify(event.input, null, 2)}`);
+
+    const rl = readline.createInterface({
+      input: process.stdin,
+      output: process.stdout,
+    });
+
+    rl.question('Approve? 
(y/n): ', (answer) => { + agent.approveToolUse(event.callId, answer.toLowerCase() === 'y'); + rl.close(); + }); + } +}); +``` + +--- + +## Example: Developer Assistant CLI + +Complete example with file operations, git commands, and safety: + +```typescript +// dev-assistant.ts +import { + Agent, + AnthropicProvider, + LocalSandbox, + JSONStore, + defineSimpleTool, +} from '@anthropic/kode-sdk'; + +// Custom tools +const gitStatusTool = defineSimpleTool({ + name: 'git_status', + description: 'Check git repository status', + parameters: {}, + execute: async () => { + const { execSync } = await import('child_process'); + return execSync('git status --porcelain').toString(); + }, +}); + +const searchCodeTool = defineSimpleTool({ + name: 'search_code', + description: 'Search for patterns in code files', + parameters: { + type: 'object', + properties: { + pattern: { type: 'string', description: 'Search pattern (regex)' }, + fileType: { type: 'string', description: 'File extension (e.g., ts, js)' }, + }, + required: ['pattern'], + }, + execute: async ({ pattern, fileType }) => { + const { execSync } = await import('child_process'); + const glob = fileType ? `--include="*.${fileType}"` : ''; + return execSync(`grep -r ${glob} "${pattern}" . 2>/dev/null || echo "No matches"`).toString(); + }, +}); + +async function main() { + const agent = await Agent.create({ + agentId: 'dev-assistant', + template: { + systemPrompt: `You are a developer assistant. +You help with: +- Code navigation and search +- Git operations +- File management +- Running tests and builds + +Always explain what you're doing before executing commands. 
+Be cautious with destructive operations.`, + tools: [gitStatusTool, searchCodeTool], // Add custom tools + }, + config: { + permission: { + mode: 'auto', // Auto-approve safe operations + autoApprove: ['git_status', 'search_code', 'file_read'], + requireApproval: ['bash_run', 'file_write', 'file_delete'], + }, + }, + deps: { + modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!, 'claude-sonnet-4-20250514'), + sandbox: new LocalSandbox({ workDir: process.cwd() }), + store: new JSONStore('./.dev-assistant'), + }, + }); + + // ... rest of CLI implementation +} +``` + +--- + +## Best Practices for CLI Agents + +### 1. Progress Indication + +```typescript +// Show spinner during model calls +agent.subscribeMonitor((event) => { + if (event.kind === 'model_start') { + process.stdout.write('Thinking...'); + } + if (event.kind === 'model_complete') { + process.stdout.write('\r \r'); // Clear spinner + } +}); +``` + +### 2. Graceful Shutdown + +```typescript +// Handle Ctrl+C +process.on('SIGINT', async () => { + console.log('\nSaving session...'); + await agent.persistInfo(); + process.exit(0); +}); + +process.on('SIGTERM', async () => { + await agent.persistInfo(); + process.exit(0); +}); +``` + +### 3. Token Usage Tracking + +```typescript +let totalTokens = 0; + +agent.subscribeMonitor((event) => { + if (event.kind === 'token_usage') { + totalTokens += event.inputTokens + event.outputTokens; + // Show in status bar or on exit + } +}); + +process.on('exit', () => { + console.log(`\nTotal tokens used: ${totalTokens}`); +}); +``` + +### 4. 
History Navigation + +```typescript +// Show conversation history on start +const messages = await agent.getMessages(); +console.log(`Session has ${messages.length} messages`); + +// Allow user to see recent context +if (messages.length > 0) { + const last = messages[messages.length - 1]; + console.log(`Last message: ${last.role}: ${last.content.slice(0, 100)}...`); +} +``` + +--- + +## File Structure + +Recommended project structure for a CLI agent: + +``` +my-cli-agent/ +├── src/ +│ ├── index.ts # Entry point +│ ├── agent.ts # Agent configuration +│ ├── tools/ # Custom tools +│ │ ├── git.ts +│ │ ├── search.ts +│ │ └── index.ts +│ └── ui/ # Terminal UI +│ ├── spinner.ts +│ ├── prompt.ts +│ └── colors.ts +├── data/ # Agent persistence (gitignored) +├── package.json +└── tsconfig.json +``` + +--- + +## Next Steps + +- See [Tools Guide](../tools.md) for building custom tools +- See [Events Guide](../events.md) for advanced event handling +- See [Playbooks](../playbooks.md) for common patterns diff --git a/docs/scenarios/desktop-apps.md b/docs/scenarios/desktop-apps.md new file mode 100644 index 0000000..0c9e96b --- /dev/null +++ b/docs/scenarios/desktop-apps.md @@ -0,0 +1,234 @@ +# Scenario: Desktop Applications + +> Build Electron/Tauri apps with embedded AI agents. + +--- + +## Why Desktop is Perfect for KODE SDK + +| Feature | Benefit | +|---------|---------| +| Full filesystem access | JSONStore works natively | +| Long-running process | Agent loops run without timeout | +| Local resources | No network latency for persistence | +| Single user | No multi-tenancy complexity | + +**Compatibility: 95%** - Minor adjustments for IPC. 
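+
+The "IPC adjustment" is mostly about event routing: agent events are produced in the main process but rendered in the renderer, so progress subscriptions are forwarded over a channel instead of being consumed in-process. A minimal sketch of that forwarding step (the `subscribeProgress` shape follows its usage in this guide; the `send` transport is a stand-in for Electron's `webContents.send`):
+
+```typescript
+type ProgressEvent = { kind: string; text?: string };
+
+// Bridge one agent's progress stream onto a channel-based transport.
+function forwardProgress(
+  agentId: string,
+  subscribe: (handler: (e: ProgressEvent) => void) => void,
+  send: (channel: string, payload: ProgressEvent) => void,
+): void {
+  subscribe((event) => send(`agent:progress:${agentId}`, event));
+}
+```
+
+In Electron the transport is `webContents.send`; in Tauri it is an event emitted to the window. Keeping this bridge in one place makes it easy to swap transports later.
+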
+
+---
+
+## Electron Integration
+
+### Main Process Setup
+
+```typescript
+// main/agent-service.ts
+import { Agent, AgentPool, AnthropicProvider, LocalSandbox, JSONStore } from '@anthropic/kode-sdk';
+import { app, ipcMain } from 'electron';
+import * as path from 'path';
+
+// Store data in app's user data directory
+const DATA_DIR = path.join(app.getPath('userData'), 'agents');
+const store = new JSONStore(DATA_DIR);
+const pool = new AgentPool({ store, maxAgents: 20 });
+
+// Create agent
+ipcMain.handle('agent:create', async (event, { agentId, systemPrompt }) => {
+  const agent = await pool.create(agentId, {
+    template: { systemPrompt },
+  }, {
+    modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!),
+    sandbox: new LocalSandbox({ workDir: app.getPath('documents') }),
+  });
+
+  // Forward events to renderer. Use the IPC event's sender; don't shadow it
+  // with the progress callback's parameter.
+  agent.subscribeProgress({ kinds: ['text_chunk', 'tool:start', 'tool:complete', 'done'] }, (progressEvent) => {
+    event.sender.send(`agent:progress:${agentId}`, progressEvent);
+  });
+
+  return { success: true, agentId };
+});
+
+// Send message
+ipcMain.handle('agent:chat', async (event, { agentId, message }) => {
+  const agent = pool.get(agentId);
+  if (!agent) throw new Error('Agent not found');
+
+  await agent.chat(message);
+  return { success: true };
+});
+
+// List agents
+ipcMain.handle('agent:list', async () => {
+  const agents = await store.listAgents();
+  return agents;
+});
+
+// Graceful shutdown (guard against re-entering before-quit when app.quit() is called)
+let isQuitting = false;
+app.on('before-quit', async (event) => {
+  if (isQuitting) return;
+  event.preventDefault();
+  isQuitting = true;
+  for (const [id, agent] of pool.agents) {
+    await agent.persistInfo();
+  }
+  app.quit();
+});
+```
+
+### Renderer Process (React)
+
+```typescript
+// renderer/hooks/useAgent.ts
+import { useState, useEffect, useCallback } from 'react';
+
+interface ChatMessage {
+  role: 'user' | 'assistant';
+  content: string;
+}
+
+export function useAgent(agentId: string) {
+  const [messages, setMessages] = useState<ChatMessage[]>([]);
+  const [isProcessing, setIsProcessing] = useState(false);
+
+  useEffect(() => {
+    // Listen for progress events
+    const handler = (event: any, data: 
ProgressEvent) => { + if (data.kind === 'text_chunk') { + setMessages(prev => { + const last = prev[prev.length - 1]; + if (last?.role === 'assistant') { + return [...prev.slice(0, -1), { + ...last, + content: last.content + data.text, + }]; + } + return [...prev, { role: 'assistant', content: data.text }]; + }); + } + if (data.kind === 'done') { + setIsProcessing(false); + } + }; + + window.electron.on(`agent:progress:${agentId}`, handler); + return () => window.electron.off(`agent:progress:${agentId}`, handler); + }, [agentId]); + + const sendMessage = useCallback(async (text: string) => { + setMessages(prev => [...prev, { role: 'user', content: text }]); + setIsProcessing(true); + await window.electron.invoke('agent:chat', { agentId, message: text }); + }, [agentId]); + + return { messages, isProcessing, sendMessage }; +} +``` + +--- + +## Tauri Integration + +```rust +// src-tauri/src/main.rs +use tauri::Manager; + +#[tauri::command] +async fn create_agent(app: tauri::AppHandle, agent_id: String) -> Result<(), String> { + // Use sidecar process for Node.js agent runtime + let sidecar = app.shell() + .sidecar("agent-runtime") + .expect("failed to create sidecar"); + + sidecar.spawn().expect("failed to spawn sidecar"); + Ok(()) +} +``` + +```typescript +// agent-runtime/index.ts (sidecar) +// Same KODE SDK code as Electron main process +// Communicate via Tauri's shell commands +``` + +--- + +## Best Practices + +### 1. Data Directory + +```typescript +// Cross-platform data directory +import { app } from 'electron'; + +const getDataDir = () => { + // macOS: ~/Library/Application Support/YourApp/agents + // Windows: %APPDATA%/YourApp/agents + // Linux: ~/.config/YourApp/agents + return path.join(app.getPath('userData'), 'agents'); +}; +``` + +### 2. 
Workspace Integration + +```typescript +// Let user choose workspace +const workspace = await dialog.showOpenDialog({ + properties: ['openDirectory'], + title: 'Select Agent Workspace', +}); + +const sandbox = new LocalSandbox({ + workDir: workspace.filePaths[0], + allowedPaths: [workspace.filePaths[0]], // Restrict to selected folder +}); +``` + +### 3. Auto-update Agents + +```typescript +// On app update, migrate agent data if needed +app.on('ready', async () => { + const version = app.getVersion(); + const lastVersion = store.get('lastVersion'); + + if (lastVersion !== version) { + await migrateAgentData(lastVersion, version); + store.set('lastVersion', version); + } +}); +``` + +--- + +## Example: AI Writing Assistant + +```typescript +// Complete desktop writing assistant +const writingAssistant = await Agent.create({ + agentId: 'writing-assistant', + template: { + systemPrompt: `You are a writing assistant embedded in a desktop app. +You help users write, edit, and improve their documents. +You can read and write files in the user's workspace.`, + tools: [ + // Custom tool to interact with the editor + defineSimpleTool({ + name: 'insert_text', + description: 'Insert text at cursor position in the editor', + parameters: { + type: 'object', + properties: { + text: { type: 'string' }, + position: { type: 'number' }, + }, + required: ['text'], + }, + execute: async ({ text, position }) => { + // Send to renderer via IPC + mainWindow.webContents.send('editor:insert', { text, position }); + return 'Text inserted'; + }, + }), + ], + }, + // ... +}); +``` + +--- + +See [CLI Tools Guide](./cli-tools.md) for more patterns that apply to desktop apps. diff --git a/docs/scenarios/ide-plugins.md b/docs/scenarios/ide-plugins.md new file mode 100644 index 0000000..d27719c --- /dev/null +++ b/docs/scenarios/ide-plugins.md @@ -0,0 +1,359 @@ +# Scenario: IDE Plugins + +> Build VSCode, JetBrains, or other IDE extensions with AI coding assistants. 
+
+---
+
+## Why IDE Plugins Work Well
+
+| Feature | Benefit |
+|---------|---------|
+| Extension host process | Long-running, like desktop |
+| workspace.fs API | File operations available |
+| Single user context | No multi-tenancy |
+| Rich UI integration | WebView for chat, decorations for highlights |
+
+**Compatibility: 85%** - Requires workspace.fs integration.
+
+---
+
+## VSCode Extension
+
+### Extension Activation
+
+```typescript
+// src/extension.ts
+import * as vscode from 'vscode';
+import { Agent, AgentPool, AnthropicProvider } from '@anthropic/kode-sdk';
+import { VSCodeSandbox } from './vscode-sandbox';
+import { VSCodeStore } from './vscode-store';
+import { ChatViewProvider } from './chat-view-provider';
+
+let pool: AgentPool;
+
+export async function activate(context: vscode.ExtensionContext) {
+  // Store in extension's global storage
+  const store = new VSCodeStore(context.globalStorageUri);
+  pool = new AgentPool({ store, maxAgents: 5 });
+
+  // Register commands
+  context.subscriptions.push(
+    vscode.commands.registerCommand('myExtension.startChat', startChat),
+    vscode.commands.registerCommand('myExtension.explainCode', explainCode),
+    vscode.commands.registerCommand('myExtension.refactor', refactorCode),
+  );
+
+  // Create chat webview panel provider
+  context.subscriptions.push(
+    vscode.window.registerWebviewViewProvider('myExtension.chatView', new ChatViewProvider(pool))
+  );
+}
+
+export async function deactivate() {
+  // Save all agents before deactivation
+  for (const [id, agent] of pool.agents) {
+    await agent.persistInfo();
+  }
+}
+```
+
+### VSCode-Specific Sandbox
+
+```typescript
+// src/vscode-sandbox.ts
+import * as vscode from 'vscode';
+import { Sandbox, SandboxConfig } from '@anthropic/kode-sdk';
+
+export class VSCodeSandbox implements Sandbox {
+  private workspaceFolder: vscode.WorkspaceFolder;
+
+  constructor(workspaceFolder: vscode.WorkspaceFolder) {
+    this.workspaceFolder = workspaceFolder;
+  }
+
+  async readFile(relativePath: string): Promise<string> {
+    const uri = 
vscode.Uri.joinPath(this.workspaceFolder.uri, relativePath);
+    const content = await vscode.workspace.fs.readFile(uri);
+    return new TextDecoder().decode(content);
+  }
+
+  async writeFile(relativePath: string, content: string): Promise<void> {
+    const uri = vscode.Uri.joinPath(this.workspaceFolder.uri, relativePath);
+    await vscode.workspace.fs.writeFile(uri, new TextEncoder().encode(content));
+  }
+
+  async listFiles(pattern: string): Promise<string[]> {
+    const files = await vscode.workspace.findFiles(pattern);
+    return files.map(f => vscode.workspace.asRelativePath(f));
+  }
+
+  async executeCommand(command: string): Promise<{ stdout: string; stderr: string }> {
+    // Use VSCode's terminal API for command execution
+    const terminal = vscode.window.createTerminal({
+      name: 'Agent Command',
+      cwd: this.workspaceFolder.uri,
+    });
+
+    // Note: VSCode terminal doesn't return output directly
+    // Consider using child_process if extension has Node.js access
+    terminal.sendText(command);
+
+    return { stdout: 'Command sent to terminal', stderr: '' };
+  }
+}
+```
+
+### VSCode-Specific Store
+
+```typescript
+// src/vscode-store.ts
+import * as vscode from 'vscode';
+import { Store, Message, AgentInfo } from '@anthropic/kode-sdk';
+
+export class VSCodeStore implements Store {
+  constructor(private storageUri: vscode.Uri) {}
+
+  private getAgentUri(agentId: string, file: string): vscode.Uri {
+    return vscode.Uri.joinPath(this.storageUri, agentId, file);
+  }
+
+  async saveMessages(agentId: string, messages: Message[]): Promise<void> {
+    const uri = this.getAgentUri(agentId, 'messages.json');
+    const content = JSON.stringify(messages, null, 2);
+    await vscode.workspace.fs.writeFile(uri, new TextEncoder().encode(content));
+  }
+
+  async loadMessages(agentId: string): Promise<Message[]> {
+    try {
+      const uri = this.getAgentUri(agentId, 'messages.json');
+      const content = await vscode.workspace.fs.readFile(uri);
+      return JSON.parse(new TextDecoder().decode(content));
+    } catch {
+      return [];
+    }
+  }
+
+  // ... 
implement other Store methods
+}
+```
+
+### Chat WebView
+
+```typescript
+// src/chat-view-provider.ts
+import * as vscode from 'vscode';
+import { AgentPool } from '@anthropic/kode-sdk';
+
+export class ChatViewProvider implements vscode.WebviewViewProvider {
+  constructor(private pool: AgentPool) {}
+
+  resolveWebviewView(webviewView: vscode.WebviewView) {
+    webviewView.webview.options = { enableScripts: true };
+    webviewView.webview.html = this.getHtmlContent();
+
+    // Handle messages from webview
+    webviewView.webview.onDidReceiveMessage(async (message) => {
+      if (message.type === 'chat') {
+        const agent = await this.getOrCreateAgent();
+
+        // Stream responses to webview
+        agent.subscribeProgress({ kinds: ['text_chunk'] }, (event) => {
+          webviewView.webview.postMessage({
+            type: 'text_chunk',
+            text: event.text,
+          });
+        });
+
+        await agent.chat(message.text);
+
+        webviewView.webview.postMessage({ type: 'done' });
+      }
+    });
+  }
+
+  private getHtmlContent(): string {
+    // Minimal markup; adapt to your UI. The inline script posts user input to the
+    // extension and appends streamed text_chunk events to the message list.
+    return `
+      <!DOCTYPE html>
+      <html>
+      <body>
+        <div id="messages"></div>
+        <input id="input" type="text" placeholder="Ask the assistant..." />
+        <script>
+          const vscode = acquireVsCodeApi();
+          const input = document.getElementById('input');
+          input.addEventListener('keydown', (e) => {
+            if (e.key === 'Enter' && input.value.trim()) {
+              vscode.postMessage({ type: 'chat', text: input.value });
+              input.value = '';
+            }
+          });
+          window.addEventListener('message', (e) => {
+            if (e.data.type === 'text_chunk') {
+              document.getElementById('messages').textContent += e.data.text;
+            }
+          });
+        </script>
+      </body>
+      </html>
+ + + + `; + } +} +``` + +--- + +## Context-Aware Coding Assistant + +```typescript +// Provide code context to the agent +async function explainCode() { + const editor = vscode.window.activeTextEditor; + if (!editor) return; + + const selection = editor.selection; + const selectedText = editor.document.getText(selection); + const fileName = editor.document.fileName; + const languageId = editor.document.languageId; + + const agent = await getOrCreateAgent(); + + // Include file context + const context = ` +File: ${fileName} +Language: ${languageId} + +Selected code: +\`\`\`${languageId} +${selectedText} +\`\`\` +`; + + await agent.chat(`Explain this code:\n${context}`); +} + +// Inline code actions +async function refactorCode() { + const editor = vscode.window.activeTextEditor; + if (!editor) return; + + const selection = editor.selection; + const selectedText = editor.document.getText(selection); + + const agent = await getOrCreateAgent(); + + // Custom tool to apply edits + agent.registerTool({ + name: 'apply_edit', + description: 'Replace the selected code with improved version', + parameters: { + type: 'object', + properties: { + newCode: { type: 'string', description: 'The refactored code' }, + }, + required: ['newCode'], + }, + execute: async ({ newCode }) => { + await editor.edit(editBuilder => { + editBuilder.replace(selection, newCode); + }); + return 'Code replaced successfully'; + }, + }); + + await agent.chat(`Refactor this code to be cleaner and more efficient:\n${selectedText}`); +} +``` + +--- + +## JetBrains Plugin (Kotlin) + +```kotlin +// For JetBrains, run KODE SDK as a sidecar Node.js process +// and communicate via JSON-RPC or WebSocket + +class AgentService(private val project: Project) { + private var process: Process? 
= null + + fun start() { + val nodeScript = PluginUtil.getPluginPath() + "/agent-runtime/index.js" + process = ProcessBuilder("node", nodeScript) + .directory(File(project.basePath)) + .start() + + // Read output + thread { + process?.inputStream?.bufferedReader()?.forEachLine { line -> + handleAgentOutput(line) + } + } + } + + fun sendMessage(message: String) { + process?.outputStream?.let { + it.write("$message\n".toByteArray()) + it.flush() + } + } +} +``` + +--- + +## Best Practices for IDE Plugins + +### 1. Workspace-Scoped Agents + +```typescript +// One agent per workspace +function getAgentId(workspaceFolder: vscode.WorkspaceFolder): string { + return `workspace-${hashString(workspaceFolder.uri.toString())}`; +} +``` + +### 2. Respect User Settings + +```typescript +const config = vscode.workspace.getConfiguration('myExtension'); +const apiKey = config.get('apiKey'); +const modelId = config.get('model') || 'claude-sonnet-4-20250514'; +``` + +### 3. Progress Indication + +```typescript +await vscode.window.withProgress({ + location: vscode.ProgressLocation.Notification, + title: 'AI Assistant', + cancellable: true, +}, async (progress, token) => { + token.onCancellationRequested(() => { + agent.stop(); + }); + + progress.report({ message: 'Thinking...' }); + await agent.chat(message); +}); +``` + +### 4. Diagnostic Integration + +```typescript +// Show agent suggestions as diagnostics +const diagnosticCollection = vscode.languages.createDiagnosticCollection('ai-suggestions'); + +agent.subscribeProgress({ kinds: ['suggestion'] }, (event) => { + const diagnostic = new vscode.Diagnostic( + new vscode.Range(event.line, 0, event.line, 100), + event.message, + vscode.DiagnosticSeverity.Information + ); + diagnosticCollection.set(editor.document.uri, [diagnostic]); +}); +``` + +--- + +See [Desktop Apps Guide](./desktop-apps.md) for more patterns that apply to IDE plugins. 
diff --git a/docs/scenarios/large-scale-toc.md b/docs/scenarios/large-scale-toc.md new file mode 100644 index 0000000..3470b5e --- /dev/null +++ b/docs/scenarios/large-scale-toc.md @@ -0,0 +1,749 @@ +# Scenario: Large-Scale ToC Applications + +> Build ChatGPT/Manus-like applications serving thousands of concurrent users with hundreds of agents each. + +--- + +## The Challenge + +Building a consumer-facing AI application at scale requires solving: + +| Challenge | Description | +|-----------|-------------| +| **High Concurrency** | 10K+ users, each with multiple agents | +| **Agent Hibernation** | Inactive agents must sleep to save resources | +| **Crash Recovery** | Server restart must restore all running agents | +| **Fork Exploration** | Users fork agents to explore different paths | +| **Multi-Machine** | Scale horizontally across servers | +| **Serverless Frontend** | Deploy UI on Vercel/Cloudflare | + +**Direct KODE SDK Usage: Not Suitable** + +KODE SDK is designed as a runtime kernel, not a distributed platform. For large-scale ToC, you need the **Worker Microservice Pattern**. 
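+
+In this pattern the serverless API layer only enqueues work; stateful workers own the agents. The envelope that travels through the queue can be sketched as follows (field names are illustrative; an in-memory array stands in for Redis):
+
+```typescript
+interface ChatTask {
+  taskId: string;
+  agentId: string;
+  userId: string;
+  message: string;
+  timestamp: number;
+}
+
+const queue: string[] = [];
+
+// API layer: serialize and enqueue instead of running the agent inline (LPUSH equivalent).
+function enqueue(task: ChatTask): void {
+  queue.unshift(JSON.stringify(task));
+}
+
+// Worker: pop the oldest task and process it (BRPOP equivalent, non-blocking here).
+function dequeue(): ChatTask | undefined {
+  const raw = queue.pop();
+  return raw ? (JSON.parse(raw) as ChatTask) : undefined;
+}
+```
+
+Results flow back the other way: the worker publishes progress events keyed by `taskId`, and the API layer streams them to the client.
+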
+ +--- + +## Recommended Architecture + +``` ++------------------------------------------------------------------+ +| User Devices | ++------------------------------------------------------------------+ + | + v ++------------------------------------------------------------------+ +| CDN / Edge (Cloudflare) | ++------------------------------------------------------------------+ + | + v ++------------------------------------------------------------------+ +| API Gateway (Vercel/Cloudflare) | +| | +| /api/agents - List user's agents | +| /api/agents/:id - Get agent status | +| /api/chat - Send message (enqueue) | +| /api/fork - Fork agent (enqueue) | +| /api/stream/:id - SSE stream (from Redis) | ++------------------------------------------------------------------+ + | + v ++------------------------------------------------------------------+ +| Message Queue (Upstash Redis) | +| | +| Queue: agent:messages - Chat messages | +| Queue: agent:commands - Fork, hibernate, resume | +| PubSub: agent:events:{id} - Real-time events | ++------------------------------------------------------------------+ + | + +----------------+----------------+ + | | | + v v v ++------------------+ +------------------+ +------------------+ +| Worker Pool 1 | | Worker Pool 2 | | Worker Pool N | +| (Railway) | | (Railway) | | (Railway) | +| | | | | | +| +----------+ | | +----------+ | | +----------+ | +| | KODE SDK | | | | KODE SDK | | | | KODE SDK | | +| | Scheduler| | | | Scheduler| | | | Scheduler| | +| +----------+ | | +----------+ | | +----------+ | +| | | | | | +| Agents: 0-999 | | Agents: 1K-2K | | Agents: 2K-3K | ++--------+---------+ +--------+---------+ +--------+---------+ + | | | + +---------------------+---------------------+ + | + v ++------------------------------------------------------------------+ +| Distributed Store | +| | +| PostgreSQL (Supabase) - Agent state, messages, metadata | +| Redis Cluster - Locks, sessions, hot cache | +| S3/R2 - File attachments, archives | 
++------------------------------------------------------------------+ +``` + +--- + +## Component Implementation + +### 1. API Layer (Serverless) + +```typescript +// app/api/chat/route.ts (Next.js App Router) +import { NextRequest } from 'next/server'; +import { Redis } from '@upstash/redis'; +import { createClient } from '@supabase/supabase-js'; + +const redis = new Redis({ url: process.env.UPSTASH_URL!, token: process.env.UPSTASH_TOKEN! }); +const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!); + +export async function POST(req: NextRequest) { + // 1. Authenticate user + const user = await authenticateUser(req); + if (!user) { + return Response.json({ error: 'Unauthorized' }, { status: 401 }); + } + + // 2. Parse request + const { agentId, message } = await req.json(); + + // 3. Verify agent ownership + const { data: agent } = await supabase + .from('agents') + .select('id, user_id, state') + .eq('id', agentId) + .single(); + + if (!agent || agent.user_id !== user.id) { + return Response.json({ error: 'Agent not found' }, { status: 404 }); + } + + // 4. Create task and enqueue + const taskId = crypto.randomUUID(); + + await redis.lpush('agent:messages', JSON.stringify({ + taskId, + agentId, + userId: user.id, + message, + timestamp: Date.now(), + })); + + // 5. Update agent state + await supabase + .from('agents') + .update({ state: 'QUEUED', last_activity: new Date() }) + .eq('id', agentId); + + // 6. Return task ID for polling/streaming + return Response.json({ + taskId, + streamUrl: `/api/stream/${taskId}`, + }); +} +``` + +### 2. SSE Stream Endpoint + +```typescript +// app/api/stream/[taskId]/route.ts +import { Redis } from '@upstash/redis'; + +const redis = new Redis({ url: process.env.UPSTASH_URL!, token: process.env.UPSTASH_TOKEN! 
}); + +export async function GET( + req: NextRequest, + { params }: { params: { taskId: string } } +) { + const { taskId } = params; + + // Create SSE stream + const stream = new ReadableStream({ + async start(controller) { + const encoder = new TextEncoder(); + + // Subscribe to Redis PubSub + const subscriber = redis.duplicate(); + await subscriber.subscribe(`task:${taskId}:events`, (message) => { + const event = JSON.parse(message); + controller.enqueue(encoder.encode(`data: ${JSON.stringify(event)}\n\n`)); + + if (event.kind === 'done' || event.kind === 'error') { + controller.close(); + subscriber.unsubscribe(); + } + }); + + // Timeout after 5 minutes + setTimeout(() => { + controller.close(); + subscriber.unsubscribe(); + }, 5 * 60 * 1000); + }, + }); + + return new Response(stream, { + headers: { + 'Content-Type': 'text/event-stream', + 'Cache-Control': 'no-cache', + 'Connection': 'keep-alive', + }, + }); +} +``` + +### 3. Worker Service + +```typescript +// worker/index.ts +import { Agent, AgentPool } from '@anthropic/kode-sdk'; +import { Redis } from 'ioredis'; +import { PostgresStore } from './postgres-store'; +import { AgentScheduler } from './scheduler'; + +const redis = new Redis(process.env.REDIS_URL!); +const store = new PostgresStore(process.env.DATABASE_URL!); + +// Scheduler manages agent lifecycle +const scheduler = new AgentScheduler({ + maxActiveAgents: 100, // Per worker + idleTimeout: 5 * 60 * 1000, // 5 minutes + store, +}); + +// Process message queue +async function processMessages() { + while (true) { + // Blocking pop from queue + const result = await redis.brpop('agent:messages', 30); + + if (!result) continue; + + const task = JSON.parse(result[1]); + + try { + // Get or resume agent + const agent = await scheduler.getOrResume(task.agentId); + + // Subscribe to events and forward to Redis PubSub + agent.subscribeProgress({ kinds: ['text_chunk', 'tool:start', 'tool:complete', 'done'] }, async (event) => { + await 
redis.publish(`task:${task.taskId}:events`, JSON.stringify(event));
+      });
+
+      // Process message
+      await agent.chat(task.message);
+
+      // Publish completion
+      await redis.publish(`task:${task.taskId}:events`, JSON.stringify({
+        kind: 'done',
+        taskId: task.taskId,
+      }));
+
+    } catch (error) {
+      // Publish error
+      await redis.publish(`task:${task.taskId}:events`, JSON.stringify({
+        kind: 'error',
+        taskId: task.taskId,
+        error: error.message,
+      }));
+    }
+  }
+}
+
+// Start worker
+processMessages().catch(console.error);
+```
+
+### 4. Agent Scheduler
+
+```typescript
+// worker/scheduler.ts
+import { Agent } from '@anthropic/kode-sdk';
+import { LRUCache } from 'lru-cache';
+import { PostgresStore } from './postgres-store';
+
+interface SchedulerConfig {
+  maxActiveAgents: number;
+  idleTimeout: number;
+  store: PostgresStore;
+}
+
+export class AgentScheduler {
+  private active: LRUCache<string, Agent>;
+  private store: PostgresStore;
+  private config: SchedulerConfig;
+
+  constructor(config: SchedulerConfig) {
+    this.config = config;
+    this.store = config.store;
+
+    this.active = new LRUCache<string, Agent>({
+      max: config.maxActiveAgents,
+      dispose: (agent, agentId) => {
+        // Auto-hibernate when evicted
+        this.hibernate(agentId, agent);
+      },
+      ttl: config.idleTimeout,
+    });
+  }
+
+  async getOrResume(agentId: string): Promise<Agent> {
+    // Check active cache
+    if (this.active.has(agentId)) {
+      return this.active.get(agentId)!;
+    }
+
+    // Acquire distributed lock
+    const lockId = await this.store.acquireLock(agentId, 60000);
+    if (!lockId) {
+      throw new Error('Agent is being processed by another worker');
+    }
+
+    try {
+      // Resume from database
+      const agent = await Agent.resume(agentId, this.getConfig(agentId), this.getDeps());
+
+      // Cache in active pool
+      this.active.set(agentId, agent);
+
+      // Setup idle tracking
+      agent.onIdle(() => {
+        this.active.delete(agentId); // Triggers dispose -> hibernate
+      });
+
+      return agent;
+
+    } finally {
+      await this.store.releaseLock(agentId, lockId);
+    }
+  }
+
+  private async hibernate(agentId: string, agent: Agent): Promise<void> {
+    try {
+      await agent.persistInfo();
+      await this.store.updateAgentState(agentId, 
'HIBERNATED');
+      console.log(`Hibernated agent: ${agentId}`);
+    } catch (error) {
+      console.error(`Failed to hibernate ${agentId}:`, error);
+    }
+  }
+}
+```
+
+### 5. PostgreSQL Store
+
+```typescript
+// worker/postgres-store.ts
+import { Pool } from 'pg';
+import { randomUUID } from 'crypto';
+import { Store, Message, AgentInfo, ToolCallRecord } from '@anthropic/kode-sdk';
+
+export class PostgresStore implements Store {
+  private pool: Pool;
+
+  constructor(connectionString: string) {
+    this.pool = new Pool({
+      connectionString,
+      max: 20,
+      idleTimeoutMillis: 30000,
+    });
+  }
+
+  // Distributed lock using PostgreSQL Advisory Locks
+  async acquireLock(agentId: string, ttlMs: number): Promise<string | null> {
+    // hashAgentId maps the UUID to the bigint key advisory locks require (not shown)
+    const lockKey = this.hashAgentId(agentId);
+    const client = await this.pool.connect();
+
+    try {
+      const result = await client.query(
+        'SELECT pg_try_advisory_lock($1) as acquired',
+        [lockKey]
+      );
+
+      if (result.rows[0].acquired) {
+        const lockId = randomUUID();
+
+        // Set expiry using a separate table (TTL passed as a parameter,
+        // not interpolated into the SQL string)
+        await client.query(
+          `INSERT INTO agent_locks (agent_id, lock_id, expires_at)
+           VALUES ($1, $2, NOW() + $3 * interval '1 millisecond')
+           ON CONFLICT (agent_id) DO UPDATE SET lock_id = $2, expires_at = NOW() + $3 * interval '1 millisecond'`,
+          [agentId, lockId, ttlMs]
+        );
+
+        return lockId;
+      }
+
+      return null;
+    } finally {
+      client.release();
+    }
+  }
+
+  async releaseLock(agentId: string, lockId: string): Promise<void> {
+    const lockKey = this.hashAgentId(agentId);
+    const client = await this.pool.connect();
+
+    try {
+      // Verify lock ownership
+      const result = await client.query(
+        'SELECT lock_id FROM agent_locks WHERE agent_id = $1',
+        [agentId]
+      );
+
+      if (result.rows[0]?.lock_id === lockId) {
+        await client.query('SELECT pg_advisory_unlock($1)', [lockKey]);
+        await client.query('DELETE FROM agent_locks WHERE agent_id = $1', [agentId]);
+      }
+    } finally {
+      client.release();
+    }
+  }
+
+  // Messages stored as JSONB
+  async saveMessages(agentId: string, messages: Message[]): Promise<void> {
+    await
this.pool.query(
+      `INSERT INTO agent_messages (agent_id, messages, updated_at)
+       VALUES ($1, $2, NOW())
+       ON CONFLICT (agent_id) DO UPDATE SET messages = $2, updated_at = NOW()`,
+      [agentId, JSON.stringify(messages)]
+    );
+  }
+
+  async loadMessages(agentId: string): Promise<Message[]> {
+    const result = await this.pool.query(
+      'SELECT messages FROM agent_messages WHERE agent_id = $1',
+      [agentId]
+    );
+    return result.rows[0]?.messages || [];
+  }
+
+  // ... implement other Store methods
+}
+```
+
+---
+
+## Database Schema
+
+```sql
+-- Agents table
+CREATE TABLE agents (
+  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+  user_id UUID NOT NULL REFERENCES users(id),
+  name TEXT NOT NULL,
+  template_id TEXT NOT NULL,
+  state TEXT DEFAULT 'READY',
+  config JSONB DEFAULT '{}',
+  created_at TIMESTAMPTZ DEFAULT NOW(),
+  updated_at TIMESTAMPTZ DEFAULT NOW(),
+  last_activity TIMESTAMPTZ DEFAULT NOW()
+);
+
+CREATE INDEX idx_agents_user ON agents(user_id);
+CREATE INDEX idx_agents_state ON agents(state);
+
+-- Messages (one row per agent, JSONB array)
+CREATE TABLE agent_messages (
+  agent_id UUID PRIMARY KEY REFERENCES agents(id) ON DELETE CASCADE,
+  messages JSONB NOT NULL DEFAULT '[]',
+  version INTEGER DEFAULT 1,
+  updated_at TIMESTAMPTZ DEFAULT NOW()
+);
+
+-- Tool call records
+CREATE TABLE tool_calls (
+  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+  agent_id UUID NOT NULL REFERENCES agents(id) ON DELETE CASCADE,
+  name TEXT NOT NULL,
+  input JSONB,
+  result JSONB,
+  state TEXT DEFAULT 'PENDING',
+  started_at TIMESTAMPTZ,
+  completed_at TIMESTAMPTZ
+);
+
+CREATE INDEX idx_tool_calls_agent ON tool_calls(agent_id);
+
+-- Checkpoints (for fork)
+CREATE TABLE checkpoints (
+  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+  agent_id UUID NOT NULL REFERENCES agents(id) ON DELETE CASCADE,
+  parent_checkpoint_id UUID REFERENCES checkpoints(id),
+  snapshot JSONB NOT NULL,
+  tags TEXT[] DEFAULT '{}',
+  created_at TIMESTAMPTZ DEFAULT NOW()
+);
+
+CREATE INDEX idx_checkpoints_agent ON
checkpoints(agent_id); + +-- Distributed locks +CREATE TABLE agent_locks ( + agent_id UUID PRIMARY KEY, + lock_id UUID NOT NULL, + worker_id TEXT, + expires_at TIMESTAMPTZ NOT NULL +); + +-- Row Level Security +ALTER TABLE agents ENABLE ROW LEVEL SECURITY; + +CREATE POLICY "Users can only access own agents" ON agents + FOR ALL USING (auth.uid() = user_id); + +ALTER TABLE agent_messages ENABLE ROW LEVEL SECURITY; + +CREATE POLICY "Users can only access own agent messages" ON agent_messages + FOR ALL USING ( + agent_id IN (SELECT id FROM agents WHERE user_id = auth.uid()) + ); +``` + +--- + +## Handling Special Scenarios + +### Agent Hibernation (Inactive Users) + +```typescript +// Cron job: Check for idle agents every 5 minutes +async function hibernateIdleAgents() { + const idleThreshold = new Date(Date.now() - 30 * 60 * 1000); // 30 minutes + + const { data: idleAgents } = await supabase + .from('agents') + .select('id') + .eq('state', 'ACTIVE') + .lt('last_activity', idleThreshold.toISOString()); + + for (const agent of idleAgents || []) { + await redis.lpush('agent:commands', JSON.stringify({ + command: 'hibernate', + agentId: agent.id, + })); + } +} +``` + +### Server Crash Recovery + +```typescript +// On worker startup +async function recoverFromCrash() { + // Find agents that were being processed by this worker + const { data: orphanedAgents } = await supabase + .from('agents') + .select('id') + .eq('state', 'PROCESSING') + .eq('worker_id', WORKER_ID); + + for (const agent of orphanedAgents || []) { + console.log(`Recovering agent: ${agent.id}`); + + // Resume with crash strategy + const recovered = await Agent.resume(agent.id, config, deps, { + strategy: 'crash', // Auto-seal incomplete tool calls + }); + + // Re-queue for processing + await redis.lpush('agent:messages', JSON.stringify({ + taskId: `recovery-${agent.id}`, + agentId: agent.id, + message: null, // No new message, just recover + isRecovery: true, + })); + } +} + +// Call on startup 
+recoverFromCrash(); +``` + +### Fork Multiple Agents + +```typescript +// API endpoint for forking +export async function POST(req: NextRequest) { + const { agentId, checkpointId, count = 1 } = await req.json(); + + // Validate: max 10 forks at once + if (count > 10) { + return Response.json({ error: 'Max 10 forks at once' }, { status: 400 }); + } + + const forkIds: string[] = []; + + for (let i = 0; i < count; i++) { + const forkId = `${agentId}-fork-${Date.now()}-${i}`; + + await redis.lpush('agent:commands', JSON.stringify({ + command: 'fork', + agentId, + checkpointId, + forkId, + })); + + forkIds.push(forkId); + } + + return Response.json({ forkIds }); +} + +// Worker handles fork command +async function handleForkCommand(command: ForkCommand) { + const parent = await scheduler.getOrResume(command.agentId); + + const forked = await parent.fork(command.checkpointId); + + // Store forked agent + await supabase.from('agents').insert({ + id: command.forkId, + user_id: parent.userId, + parent_agent_id: command.agentId, + // ... 
copy other fields
+  });
+}
+```
+
+### Membership Expiry
+
+```typescript
+// Webhook from payment provider
+export async function POST(req: NextRequest) {
+  const event = await req.json();
+
+  if (event.type === 'subscription.cancelled') {
+    const userId = event.data.user_id;
+
+    // Snapshot active agents BEFORE flipping state,
+    // otherwise the query below would find none
+    const { data: activeAgents } = await supabase
+      .from('agents')
+      .select('id')
+      .eq('user_id', userId)
+      .eq('state', 'ACTIVE');
+
+    // Pause all user's agents
+    await supabase
+      .from('agents')
+      .update({ state: 'MEMBERSHIP_PAUSED' })
+      .eq('user_id', userId);
+
+    // Hibernate the agents that were active
+    for (const agent of activeAgents || []) {
+      await redis.lpush('agent:commands', JSON.stringify({
+        command: 'hibernate',
+        agentId: agent.id,
+        reason: 'membership_expired',
+      }));
+    }
+  }
+
+  if (event.type === 'subscription.renewed') {
+    const userId = event.data.user_id;
+
+    // Unpause all agents
+    await supabase
+      .from('agents')
+      .update({ state: 'HIBERNATED' })
+      .eq('user_id', userId)
+      .eq('state', 'MEMBERSHIP_PAUSED');
+  }
+
+  return Response.json({ received: true });
+}
+```
+
+---
+
+## Performance Considerations
+
+### Message Storage Optimization
+
+```typescript
+// Instead of storing the full messages array,
+// use an append-only log with periodic compaction
+
+class OptimizedMessageStore {
+  constructor(private pool: Pool) {}
+
+  async appendMessage(agentId: string, message: Message) {
+    // Append to log table (fast)
+    await this.pool.query(
+      `INSERT INTO message_log (agent_id, seq, message)
+       VALUES ($1, nextval('message_seq'), $2)`,
+      [agentId, JSON.stringify(message)]
+    );
+
+    // Increment message count
+    await this.pool.query(
+      `UPDATE agents SET message_count = message_count + 1 WHERE id = $1`,
+      [agentId]
+    );
+  }
+
+  async loadMessages(agentId: string, limit = 100): Promise<Message[]> {
+    // Load latest messages (pagination)
+    const result = await this.pool.query(
+      `SELECT message FROM message_log
+       WHERE agent_id = $1
+       ORDER BY seq DESC
+       LIMIT $2`,
+      [agentId, limit]
+    );
+
+    return result.rows.reverse().map(r => r.message);
+  }
+}
+```
+
+### Fork Optimization
(Copy-on-Write)
+
+```typescript
+// Fork without copying all messages
+// (pool is the shared pg Pool; this is a standalone function, not a method)
+async function forkAgentCOW(agentId: string, checkpointId: string): Promise<string> {
+  const forkId = generateForkId();
+
+  // Copy only metadata, reference same message log
+  await pool.query(
+    `INSERT INTO agents (id, user_id, template_id, config, fork_base_checkpoint_id)
+     SELECT $1, user_id, template_id, config, $2
+     FROM agents WHERE id = $3`,
+    [forkId, checkpointId, agentId]
+  );
+
+  // New messages go to the fork's own log;
+  // old messages are read via the checkpoint reference
+
+  return forkId;
+}
+```
+
+---
+
+## Deployment Checklist
+
+- [ ] API layer deployed to Vercel/Cloudflare
+- [ ] Workers deployed to Railway/Render/Fly.io
+- [ ] PostgreSQL (Supabase) configured with RLS
+- [ ] Redis (Upstash) for queues and pub/sub
+- [ ] S3/R2 for file attachments
+- [ ] Monitoring (Sentry, DataDog, etc.)
+- [ ] Rate limiting configured
+- [ ] Graceful shutdown handlers
+- [ ] Health check endpoints
+- [ ] Auto-scaling rules for workers
+
+---
+
+## Cost Estimation
+
+| Component | ~10K Users | ~100K Users |
+|-----------|------------|-------------|
+| Vercel (API) | $20/mo | $100/mo |
+| Railway (Workers) | $50/mo | $500/mo |
+| Supabase (PostgreSQL) | $25/mo | $100/mo |
+| Upstash (Redis) | $10/mo | $50/mo |
+| **Total** | **~$100/mo** | **~$750/mo** |
+
+*Excludes LLM API costs*
+
+---
+
+## Summary
+
+Building a large-scale ToC application with KODE SDK requires:
+
+1. **Separate concerns**: Stateless API + stateful workers
+2. **Queue-based communication**: Decouple request handling from agent execution
+3. **Distributed store**: PostgreSQL for persistence, Redis for real-time
+4. **Agent scheduling**: LRU cache for active agents, hibernate inactive ones
+5. **Crash recovery**: WAL + checkpoints for resilience
+
+KODE SDK provides the agent runtime kernel. You build the platform around it.
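The worker loop consumes tasks with `BRPOP` on `agent:messages`, but the producer half of that contract lives in the API layer and is not shown above. A minimal sketch of the enqueue path, with the queue injected so it can be exercised without Redis (`ChatTask`, `TaskQueue`, and `enqueueChat` are illustrative names, not SDK APIs):

```typescript
// Shape of the payload the worker's BRPOP loop expects (illustrative).
interface ChatTask {
  taskId: string;
  agentId: string;
  message: string;
}

// Minimal queue abstraction so the enqueue path is testable without Redis;
// an ioredis client satisfies this shape.
interface TaskQueue {
  lpush(key: string, value: string): Promise<number>;
}

const QUEUE_KEY = 'agent:messages';

async function enqueueChat(
  queue: TaskQueue,
  agentId: string,
  message: string
): Promise<string> {
  const task: ChatTask = {
    taskId: `task-${agentId}-${Date.now()}`,
    agentId,
    message,
  };

  // LPUSH on the producer side + BRPOP on the worker side gives FIFO ordering.
  await queue.lpush(QUEUE_KEY, JSON.stringify(task));
  return task.taskId;
}
```

In production `queue` would be the shared ioredis client; the returned `taskId` is what the browser then uses to open the SSE stream at `GET /tasks/:taskId/events`.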
diff --git a/docs/scenarios/web-backend.md b/docs/scenarios/web-backend.md new file mode 100644 index 0000000..3b81000 --- /dev/null +++ b/docs/scenarios/web-backend.md @@ -0,0 +1,444 @@ +# Scenario: Web Backend (Self-Hosted) + +> Deploy KODE SDK on your own servers for small to medium web applications. + +--- + +## When to Use This Pattern + +| Criteria | Threshold | +|----------|-----------| +| Concurrent users | < 1,000 | +| Concurrent agents | < 100 | +| Infrastructure | Single server / small cluster | +| Complexity | Moderate | + +**Compatibility: 80%** - Need to add HTTP layer and user isolation. + +--- + +## Architecture + +``` ++------------------------------------------------------------------+ +| Your Server | ++------------------------------------------------------------------+ +| | +| +------------------+ +------------------+ | +| | HTTP Layer | | KODE SDK | | +| | (Express/Hono) |---->| AgentPool | | +| +------------------+ +------------------+ | +| | | +| +------------------+ +------v------+ | +| | Auth Layer | | Store | | +| | (Passport/etc) | | (Redis/PG) | | +| +------------------+ +-------------+ | +| | ++------------------------------------------------------------------+ +``` + +--- + +## Express.js Integration + +```typescript +// server.ts +import express from 'express'; +import { Agent, AgentPool, AnthropicProvider, LocalSandbox } from '@anthropic/kode-sdk'; +import { RedisStore } from './redis-store'; // Custom implementation + +const app = express(); +const store = new RedisStore(process.env.REDIS_URL!); +const pool = new AgentPool({ store, maxAgents: 100 }); + +app.use(express.json()); + +// Middleware: Auth +app.use(async (req, res, next) => { + const token = req.headers.authorization?.replace('Bearer ', ''); + req.user = await verifyToken(token); + next(); +}); + +// Create agent for user +app.post('/api/agents', async (req, res) => { + const { name, systemPrompt } = req.body; + const agentId = `${req.user.id}-${Date.now()}`; + + const 
agent = await pool.create(agentId, { + template: { systemPrompt }, + config: { + metadata: { userId: req.user.id, name }, + }, + }, { + modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!), + sandbox: new LocalSandbox({ workDir: `/tmp/agents/${agentId}` }), + }); + + res.json({ agentId, name }); +}); + +// List user's agents +app.get('/api/agents', async (req, res) => { + const agents = await store.listAgentsByUser(req.user.id); + res.json(agents); +}); + +// Chat with agent +app.post('/api/agents/:agentId/chat', async (req, res) => { + const { agentId } = req.params; + const { message } = req.body; + + // Verify ownership + const info = await store.loadInfo(agentId); + if (info?.metadata?.userId !== req.user.id) { + return res.status(403).json({ error: 'Forbidden' }); + } + + // Get or resume agent + let agent = pool.get(agentId); + if (!agent) { + agent = await pool.resume(agentId, { + template: { systemPrompt: info.systemPrompt }, + }, { + modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!), + sandbox: new LocalSandbox({ workDir: `/tmp/agents/${agentId}` }), + }); + } + + // Stream response via SSE + res.setHeader('Content-Type', 'text/event-stream'); + res.setHeader('Cache-Control', 'no-cache'); + res.setHeader('Connection', 'keep-alive'); + + agent.subscribeProgress({ kinds: ['text_chunk', 'tool:start', 'tool:complete', 'done'] }, (event) => { + res.write(`data: ${JSON.stringify(event)}\n\n`); + + if (event.kind === 'done') { + res.end(); + } + }); + + await agent.chat(message); +}); + +// Graceful shutdown +process.on('SIGTERM', async () => { + console.log('Shutting down...'); + for (const [id, agent] of pool.agents) { + await agent.persistInfo(); + } + process.exit(0); +}); + +app.listen(3000, () => console.log('Server running on :3000')); +``` + +--- + +## Hono (Edge-Compatible) + +```typescript +// server.ts +import { Hono } from 'hono'; +import { serve } from '@hono/node-server'; +import { streamSSE } from 
'hono/streaming';
+
+const app = new Hono();
+
+app.post('/api/agents/:agentId/chat', async (c) => {
+  const agentId = c.req.param('agentId');
+  const { message } = await c.req.json();
+
+  const agent = await getOrResumeAgent(agentId);
+
+  return streamSSE(c, async (stream) => {
+    agent.subscribeProgress({ kinds: ['text_chunk', 'done'] }, async (event) => {
+      await stream.writeSSE({ data: JSON.stringify(event) });
+
+      if (event.kind === 'done') {
+        await stream.close();
+      }
+    });
+
+    await agent.chat(message);
+  });
+});
+
+serve(app, { port: 3000 });
+```
+
+---
+
+## User Isolation
+
+### Per-User Agent Namespace
+
+```typescript
+// Prefix all agent IDs with user ID
+function getAgentId(userId: string, localId: string): string {
+  return `user:${userId}:agent:${localId}`;
+}
+
+// List only user's agents
+async function listUserAgents(userId: string): Promise<AgentInfo[]> {
+  const allAgents = await store.listAgents();
+  return allAgents.filter(a => a.metadata?.userId === userId);
+}
+```
+
+### Per-User Sandbox Isolation
+
+```typescript
+// Each user gets an isolated workspace
+function getUserSandbox(userId: string, agentId: string): LocalSandbox {
+  const workDir = path.join('/data/workspaces', userId, agentId);
+
+  return new LocalSandbox({
+    workDir,
+    allowedPaths: [workDir], // Restrict to user's directory only
+    env: {
+      USER_ID: userId,
+      AGENT_ID: agentId,
+    },
+  });
+}
+```
+
+---
+
+## Redis Store Implementation
+
+```typescript
+// redis-store.ts
+import Redis from 'ioredis';
+import { Store, Message, AgentInfo, ToolCallRecord } from '@anthropic/kode-sdk';
+
+export class RedisStore implements Store {
+  private redis: Redis;
+
+  constructor(url: string) {
+    this.redis = new Redis(url);
+  }
+
+  async saveMessages(agentId: string, messages: Message[]): Promise<void> {
+    await this.redis.set(`agent:${agentId}:messages`, JSON.stringify(messages));
+  }
+
+  async loadMessages(agentId: string): Promise<Message[]> {
+    const data = await this.redis.get(`agent:${agentId}:messages`);
+    return data ? JSON.parse(data) : [];
+  }
+
+  async saveInfo(agentId: string, info: AgentInfo): Promise<void> {
+    await this.redis.set(`agent:${agentId}:info`, JSON.stringify(info));
+    // Add to user's agent list
+    if (info.metadata?.userId) {
+      await this.redis.sadd(`user:${info.metadata.userId}:agents`, agentId);
+    }
+  }
+
+  async loadInfo(agentId: string): Promise<AgentInfo | undefined> {
+    const data = await this.redis.get(`agent:${agentId}:info`);
+    return data ? JSON.parse(data) : undefined;
+  }
+
+  async listAgentsByUser(userId: string): Promise<AgentInfo[]> {
+    const agentIds = await this.redis.smembers(`user:${userId}:agents`);
+    const infos = await Promise.all(agentIds.map(id => this.loadInfo(id)));
+    return infos.filter(Boolean) as AgentInfo[];
+  }
+
+  async deleteAgent(agentId: string): Promise<void> {
+    const info = await this.loadInfo(agentId);
+    if (info?.metadata?.userId) {
+      await this.redis.srem(`user:${info.metadata.userId}:agents`, agentId);
+    }
+    await this.redis.del(
+      `agent:${agentId}:messages`,
+      `agent:${agentId}:info`,
+      `agent:${agentId}:tools`,
+    );
+  }
+}
+```
+
+---
+
+## Rate Limiting
+
+```typescript
+import rateLimit from 'express-rate-limit';
+
+// Global rate limit
+app.use(rateLimit({
+  windowMs: 60 * 1000, // 1 minute
+  max: 60,             // 60 requests per minute
+}));
+
+// Per-user rate limit for chat
+const chatLimiter = rateLimit({
+  windowMs: 60 * 1000,
+  max: 20, // 20 chat requests per minute per user
+  keyGenerator: (req) => req.user?.id || req.ip,
+});
+
+app.post('/api/agents/:agentId/chat', chatLimiter, async (req, res) => {
+  // ...
+});
+
+// Token-based rate limiting
+const tokenTracker = new Map<string, number>();
+
+agent.subscribeMonitor((event) => {
+  if (event.kind === 'token_usage') {
+    const userId = agent.metadata?.userId;
+    const total = (tokenTracker.get(userId) || 0) + event.totalTokens;
+    tokenTracker.set(userId, total);
+
+    if (total > DAILY_TOKEN_LIMIT) {
+      // Stop the agent rather than throwing from an event callback
+      agent.stop();
+    }
+  }
+});
+```
+
+---
+
+## Health Checks
+
+```typescript
+// Health check endpoint
+app.get('/health', async (req, res) => {
+  const checks = {
+    redis: await checkRedis(),
+    agents: pool.agents.size,
+    memory: process.memoryUsage(),
+  };
+
+  const healthy = checks.redis;
+  res.status(healthy ? 200 : 503).json(checks);
+});
+
+async function checkRedis(): Promise<boolean> {
+  try {
+    await redis.ping();
+    return true;
+  } catch {
+    return false;
+  }
+}
+
+// Kubernetes readiness probe
+app.get('/ready', (req, res) => {
+  res.sendStatus(200);
+});
+
+// Kubernetes liveness probe
+app.get('/live', (req, res) => {
+  res.sendStatus(200);
+});
+```
+
+---
+
+## Docker Deployment
+
+```dockerfile
+# Dockerfile
+FROM node:20-alpine
+
+WORKDIR /app
+
+COPY package*.json ./
+RUN npm ci --omit=dev
+
+COPY dist ./dist
+
+ENV NODE_ENV=production
+
+EXPOSE 3000
+
+CMD ["node", "dist/server.js"]
+```
+
+```yaml
+# docker-compose.yml
+version: '3.8'
+
+services:
+  api:
+    build: .
+    ports:
+      - "3000:3000"
+    environment:
+      - REDIS_URL=redis://redis:6379
+      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
+    depends_on:
+      - redis
+    deploy:
+      replicas: 2
+      restart_policy:
+        condition: on-failure
+
+  redis:
+    image: redis:7-alpine
+    volumes:
+      - redis-data:/data
+
+volumes:
+  redis-data:
+```
+
+---
+
+## Scaling to Multiple Instances
+
+When running multiple server instances:
+
+### 1.
Use Redis for Session Affinity
+
+```typescript
+// Store which server handles which agent
+await redis.set(`agent:${agentId}:server`, SERVER_ID, 'EX', 3600);
+
+// Check before resuming
+const currentServer = await redis.get(`agent:${agentId}:server`);
+if (currentServer && currentServer !== SERVER_ID) {
+  // Agent is on another server, redirect or wait
+}
+```
+
+### 2. Distributed Locking
+
+```typescript
+import Redlock from 'redlock';
+
+const redlock = new Redlock([redis]);
+
+app.post('/api/agents/:agentId/chat', async (req, res) => {
+  const { agentId } = req.params;
+  const { message } = req.body;
+
+  const lock = await redlock.acquire([`lock:agent:${agentId}`], 30000);
+
+  try {
+    // Only one server can process this agent at a time
+    const agent = await getOrResumeAgent(agentId);
+    await agent.chat(message);
+    res.json({ ok: true });
+  } finally {
+    await lock.release();
+  }
+});
+```
+
+---
+
+## Migration Path to Large Scale
+
+When you outgrow single-server deployment:
+
+1. **Add message queue** - Decouple API from processing
+2. **Separate workers** - Run agents in dedicated processes
+3. **Use PostgreSQL** - Replace Redis for primary storage
+4. **Add agent scheduler** - Manage agent lifecycle
+
+See [Large-Scale ToC Guide](./large-scale-toc.md) for the full architecture.
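The session-affinity snippet leaves the "redirect or wait" branch as a comment. The decision itself is pure and easy to unit-test; a sketch under the same keying scheme (`routeAgentRequest` and `RouteDecision` are illustrative names, not SDK APIs):

```typescript
type RouteDecision =
  | { action: 'handle' }                // this instance owns (or can claim) the agent
  | { action: 'redirect'; to: string }; // another instance currently pins the agent

// pinnedServer is the value of `GET agent:<id>:server`, or null if the key
// expired (the owning instance died or went idle past the TTL).
function routeAgentRequest(pinnedServer: string | null, selfId: string): RouteDecision {
  if (pinnedServer === null || pinnedServer === selfId) {
    // No live pin, or we already own it: handle locally (and refresh the pin).
    return { action: 'handle' };
  }
  return { action: 'redirect', to: pinnedServer };
}
```

An HTTP layer can map `redirect` to a 307 pointing at the pinned instance, or fall back to the distributed-lock path shown earlier when instances are not directly addressable.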
diff --git a/package-lock.json b/package-lock.json index 3e9b25b..1d8a384 100644 --- a/package-lock.json +++ b/package-lock.json @@ -108,7 +108,6 @@ "resolved": "https://registry.npmjs.org/zod/-/zod-3.25.76.tgz", "integrity": "sha512-gzUt/qt81nXsFGKIFcC3YnfEAx5NkunCfnDlvuBSSFS02bcXu4Lmea0AFIUwbLWxWPx3d9p8S5QoaujKcNQxcQ==", "license": "MIT", - "peer": true, "funding": { "url": "https://github.com/sponsors/colinhacks" } @@ -201,7 +200,6 @@ "integrity": "sha512-N2clP5pJhB2YnZJ3PIHFk5RkygRX5WO/5f0WC08tp0wd+sv0rsJk3MqWn3CbNmT2J505a5336jaQj4ph1AdMug==", "dev": true, "license": "MIT", - "peer": true, "dependencies": { "undici-types": "~6.21.0" } @@ -728,7 +726,6 @@ "resolved": "https://registry.npmjs.org/express/-/express-5.2.1.tgz", "integrity": "sha512-hIS4idWWai69NezIdRt2xFVofaF4j+6INOpJlVOLDO8zXGpUVEVzIYk12UUi2JzjEzWL3IOAxcTubgz9Po0yXw==", "license": "MIT", - "peer": true, "dependencies": { "accepts": "^2.0.0", "body-parser": "^2.2.1", @@ -1336,7 +1333,6 @@ "resolved": "https://registry.npmmirror.com/pg/-/pg-8.17.2.tgz", "integrity": "sha512-vjbKdiBJRqzcYw1fNU5KuHyYvdJ1qpcQg1CeBrHFqV1pWgHeVR6j/+kX0E1AAXfyuLUGY1ICrN2ELKA/z2HWzw==", "license": "MIT", - "peer": true, "dependencies": { "pg-connection-string": "^2.10.1", "pg-pool": "^3.11.0", @@ -2064,7 +2060,6 @@ "integrity": "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw==", "dev": true, "license": "Apache-2.0", - "peer": true, "bin": { "tsc": "bin/tsc", "tsserver": "bin/tsserver" @@ -2165,7 +2160,6 @@ "resolved": "https://registry.npmjs.org/zod/-/zod-3.23.8.tgz", "integrity": "sha512-XBx9AXhXktjUqnepgTiE5flcKIYWi/rme0Eaj+5Y0lftuGBq+jyRu/md4WnuxqgP1ubdpNCsYEYPxrzVHD8d6g==", "license": "MIT", - "peer": true, "funding": { "url": "https://github.com/sponsors/colinhacks" } diff --git a/src/core/pool.ts b/src/core/pool.ts index eb4cfc2..d2ee827 100644 --- a/src/core/pool.ts +++ b/src/core/pool.ts @@ -7,6 +7,33 @@ export interface AgentPoolOptions { maxAgents?: number; } +export 
interface GracefulShutdownOptions {
+  /** Maximum time to wait for agents to complete current step (ms), default 30000 */
+  timeout?: number;
+  /** Save running agents list for resumeFromShutdown(), default true */
+  saveRunningList?: boolean;
+  /** Force interrupt agents that don't complete within timeout, default true */
+  forceInterrupt?: boolean;
+}
+
+export interface ShutdownResult {
+  /** Agents that completed gracefully */
+  completed: string[];
+  /** Agents that were interrupted due to timeout */
+  interrupted: string[];
+  /** Agents that failed to save state */
+  failed: string[];
+  /** Total shutdown time in ms */
+  durationMs: number;
+}
+
+/** Running agents metadata for recovery */
+interface RunningAgentsMeta {
+  agentIds: string[];
+  shutdownAt: string;
+  version: string;
+}
+
 export class AgentPool {
   private agents = new Map<string, Agent>();
   private deps: AgentDependencies;
@@ -111,4 +138,241 @@
   size(): number {
     return this.agents.size;
   }
+
+  /**
+   * Gracefully shut down all agents in the pool
+   * 1. Stop accepting new operations
+   * 2. Wait for running agents to complete their current step
+   * 3. Persist all agent states
+   * 4. Optionally save the running agents list for recovery
+   */
+  async gracefulShutdown(opts?: GracefulShutdownOptions): Promise<ShutdownResult> {
+    const startTime = Date.now();
+    const timeout = opts?.timeout ?? 30000;
+    const saveRunningList = opts?.saveRunningList ?? true;
+    const forceInterrupt = opts?.forceInterrupt ??
true; + + const result: ShutdownResult = { + completed: [], + interrupted: [], + failed: [], + durationMs: 0, + }; + + const agentIds = Array.from(this.agents.keys()); + logger.info(`[AgentPool] Starting graceful shutdown for ${agentIds.length} agents`); + + // Group agents by state + const workingAgents: Array<{ id: string; agent: Agent }> = []; + const readyAgents: Array<{ id: string; agent: Agent }> = []; + + for (const [id, agent] of this.agents) { + const status = await agent.status(); + if (status.state === 'WORKING') { + workingAgents.push({ id, agent }); + } else { + readyAgents.push({ id, agent }); + } + } + + // 1. Persist ready agents immediately + for (const { id, agent } of readyAgents) { + try { + await this.persistAgentState(agent); + result.completed.push(id); + } catch (error) { + logger.error(`[AgentPool] Failed to persist agent ${id}:`, error); + result.failed.push(id); + } + } + + // 2. Wait for working agents with timeout + if (workingAgents.length > 0) { + logger.info(`[AgentPool] Waiting for ${workingAgents.length} working agents...`); + + const waitPromises = workingAgents.map(async ({ id, agent }) => { + try { + const completed = await this.waitForAgentReady(agent, timeout); + if (completed) { + await this.persistAgentState(agent); + return { id, status: 'completed' as const }; + } else if (forceInterrupt) { + await agent.interrupt({ note: 'Graceful shutdown timeout' }); + await this.persistAgentState(agent); + return { id, status: 'interrupted' as const }; + } else { + return { id, status: 'interrupted' as const }; + } + } catch (error) { + logger.error(`[AgentPool] Error during shutdown for agent ${id}:`, error); + return { id, status: 'failed' as const }; + } + }); + + const results = await Promise.all(waitPromises); + for (const { id, status } of results) { + if (status === 'completed') { + result.completed.push(id); + } else if (status === 'interrupted') { + result.interrupted.push(id); + } else { + result.failed.push(id); + } + } + } 
+
+    // 3. Save running agents list for recovery
+    if (saveRunningList) {
+      try {
+        await this.saveRunningAgentsList(agentIds);
+        logger.info(`[AgentPool] Saved running agents list: ${agentIds.length} agents`);
+      } catch (error) {
+        logger.error(`[AgentPool] Failed to save running agents list:`, error);
+      }
+    }
+
+    result.durationMs = Date.now() - startTime;
+    logger.info(`[AgentPool] Graceful shutdown completed in ${result.durationMs}ms`, {
+      completed: result.completed.length,
+      interrupted: result.interrupted.length,
+      failed: result.failed.length,
+    });
+
+    return result;
+  }
+
+  /**
+   * Resume agents from a previous graceful shutdown
+   * Reads the running agents list and resumes each agent
+   */
+  async resumeFromShutdown(
+    configFactory: (agentId: string) => AgentConfig,
+    opts?: { autoRun?: boolean; strategy?: 'crash' | 'manual' }
+  ): Promise<Agent[]> {
+    const runningList = await this.loadRunningAgentsList();
+    if (!runningList || runningList.length === 0) {
+      logger.info('[AgentPool] No running agents list found, nothing to resume');
+      return [];
+    }
+
+    logger.info(`[AgentPool] Resuming ${runningList.length} agents from shutdown`);
+
+    const resumed: Agent[] = [];
+    for (const agentId of runningList) {
+      if (this.agents.size >= this.maxAgents) {
+        logger.warn(`[AgentPool] Pool is full, cannot resume more agents`);
+        break;
+      }
+
+      try {
+        const config = configFactory(agentId);
+        const agent = await this.resume(agentId, config, {
+          autoRun: opts?.autoRun ?? false,
+          strategy: opts?.strategy ??
'crash',
+        });
+        resumed.push(agent);
+      } catch (error) {
+        logger.error(`[AgentPool] Failed to resume agent ${agentId}:`, error);
+      }
+    }
+
+    // Clear the running agents list after the resume pass
+    await this.clearRunningAgentsList();
+
+    logger.info(`[AgentPool] Resumed ${resumed.length}/${runningList.length} agents`);
+    return resumed;
+  }
+
+  /**
+   * Register signal handlers for graceful shutdown
+   * Call this in your server setup code
+   */
+  registerShutdownHandlers(
+    configFactory?: (agentId: string) => AgentConfig,
+    opts?: GracefulShutdownOptions
+  ): void {
+    const handler = async (signal: string) => {
+      logger.info(`[AgentPool] Received ${signal}, initiating graceful shutdown...`);
+      try {
+        const result = await this.gracefulShutdown(opts);
+        logger.info(`[AgentPool] Shutdown complete:`, result);
+        process.exit(0);
+      } catch (error) {
+        logger.error(`[AgentPool] Shutdown failed:`, error);
+        process.exit(1);
+      }
+    };
+
+    process.on('SIGTERM', () => handler('SIGTERM'));
+    process.on('SIGINT', () => handler('SIGINT'));
+    logger.info('[AgentPool] Shutdown handlers registered for SIGTERM and SIGINT');
+  }
+
+  // ========== Private Helper Methods ==========
+
+  private async waitForAgentReady(agent: Agent, timeout: number): Promise<boolean> {
+    const startTime = Date.now();
+    const pollInterval = 100; // ms
+
+    while (Date.now() - startTime < timeout) {
+      const status = await agent.status();
+      if (status.state !== 'WORKING') {
+        return true;
+      }
+      await this.sleep(pollInterval);
+    }
+
+    return false;
+  }
+
+  private async persistAgentState(agent: Agent): Promise<void> {
+    // Agent's internal persist methods are private, so we rely on the fact that
+    // state is automatically persisted during normal operation.
+    // This is a no-op placeholder for potential future explicit persist calls.
+    // The agent's state is already persisted via the WAL mechanism.
+  }
+
+  private async saveRunningAgentsList(agentIds: string[]): Promise<void> {
+    const meta: RunningAgentsMeta = {
+      agentIds,
+      shutdownAt: new Date().toISOString(),
+      version: '1.0.0',
+    };
+
+    // Use the store's saveInfo to persist to a special key:
+    // a well-known agent ID reserved for pool metadata
+    const poolMetaId = '__pool_meta__';
+    await this.deps.store.saveInfo(poolMetaId, {
+      agentId: poolMetaId,
+      templateId: '__pool_meta__',
+      createdAt: new Date().toISOString(),
+      runningAgents: meta,
+    } as any);
+  }
+
+  private async loadRunningAgentsList(): Promise<string[] | null> {
+    const poolMetaId = '__pool_meta__';
+    try {
+      const info = await this.deps.store.loadInfo(poolMetaId);
+      if (info && (info as any).runningAgents) {
+        return (info as any).runningAgents.agentIds;
+      }
+    } catch {
+      // Ignore errors, return null
+    }
+    return null;
+  }
+
+  private async clearRunningAgentsList(): Promise<void> {
+    const poolMetaId = '__pool_meta__';
+    try {
+      await this.deps.store.delete(poolMetaId);
+    } catch {
+      // Ignore errors
+    }
+  }
+
+  private sleep(ms: number): Promise<void> {
+    return new Promise((resolve) => setTimeout(resolve, ms));
+  }
 }
diff --git a/src/index.ts b/src/index.ts
index 0ae3132..7e1bb3c 100644
--- a/src/index.ts
+++ b/src/index.ts
@@ -8,7 +8,7 @@ export {
   SubscribeOptions,
   SendOptions,
 } from './core/agent';
-export { AgentPool } from './core/pool';
+export { AgentPool, GracefulShutdownOptions, ShutdownResult } from './core/pool';
 export { Room } from './core/room';
 export { Scheduler, AgentSchedulerHandle } from './core/scheduler';
 export { EventBus } from './core/events';
diff --git a/tests/unit/core/pool-shutdown.test.ts b/tests/unit/core/pool-shutdown.test.ts
new file mode 100644
index 0000000..fd04c66
--- /dev/null
+++ b/tests/unit/core/pool-shutdown.test.ts
@@ -0,0 +1,163 @@
+/**
+ * Tests for AgentPool graceful shutdown functionality
+ */
+import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
+import { AgentPool, GracefulShutdownOptions,
ShutdownResult } from '../../../src/core/pool'; +import { Agent } from '../../../src/core/agent'; +import { JSONStore } from '../../../src/infra/store/json-store'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; + +describe('AgentPool Graceful Shutdown', () => { + let pool: AgentPool; + let store: JSONStore; + let testDir: string; + + beforeEach(async () => { + testDir = path.join(os.tmpdir(), `kode-pool-test-${Date.now()}`); + fs.mkdirSync(testDir, { recursive: true }); + store = new JSONStore(testDir); + + // Mock dependencies + const mockProvider = { + chat: vi.fn().mockResolvedValue({ + role: 'assistant', + content: [{ type: 'text', text: 'Hello!' }], + }), + stream: vi.fn(), + }; + + pool = new AgentPool({ + dependencies: { + store, + modelProvider: mockProvider as any, + sandbox: { run: vi.fn() } as any, + }, + maxAgents: 10, + }); + }); + + afterEach(async () => { + try { + fs.rmSync(testDir, { recursive: true, force: true }); + } catch { + // Ignore cleanup errors + } + }); + + describe('gracefulShutdown', () => { + it('should return empty result when pool is empty', async () => { + const result = await pool.gracefulShutdown(); + + expect(result.completed).toEqual([]); + expect(result.interrupted).toEqual([]); + expect(result.failed).toEqual([]); + expect(result.durationMs).toBeGreaterThanOrEqual(0); + }); + + it('should save running agents list when saveRunningList is true', async () => { + // Create and add an agent to the pool + const mockAgent = { + status: vi.fn().mockResolvedValue({ state: 'READY' }), + interrupt: vi.fn(), + } as unknown as Agent; + + (pool as any).agents.set('test-agent-1', mockAgent); + + const result = await pool.gracefulShutdown({ saveRunningList: true }); + + expect(result.completed).toContain('test-agent-1'); + + // Verify running agents list was saved + const savedInfo = await store.loadInfo('__pool_meta__'); + expect(savedInfo).toBeDefined(); + expect((savedInfo as 
any).runningAgents.agentIds).toContain('test-agent-1'); + }); + + it('should not save running agents list when saveRunningList is false', async () => { + const mockAgent = { + status: vi.fn().mockResolvedValue({ state: 'READY' }), + interrupt: vi.fn(), + } as unknown as Agent; + + (pool as any).agents.set('test-agent-2', mockAgent); + + await pool.gracefulShutdown({ saveRunningList: false }); + + // Verify running agents list was NOT saved + const savedInfo = await store.loadInfo('__pool_meta__'); + expect(savedInfo).toBeUndefined(); + }); + + it('should interrupt working agents after timeout', async () => { + const interruptMock = vi.fn().mockResolvedValue(undefined); + const mockAgent = { + status: vi.fn().mockResolvedValue({ state: 'WORKING' }), + interrupt: interruptMock, + } as unknown as Agent; + + (pool as any).agents.set('working-agent', mockAgent); + + const result = await pool.gracefulShutdown({ + timeout: 100, // Very short timeout + forceInterrupt: true, + }); + + expect(interruptMock).toHaveBeenCalledWith({ note: 'Graceful shutdown timeout' }); + expect(result.interrupted).toContain('working-agent'); + }); + }); + + describe('resumeFromShutdown', () => { + it('should return empty array when no running agents list exists', async () => { + const configFactory = (agentId: string) => ({ + agentId, + template: { systemPrompt: 'test' }, + }); + + const resumed = await pool.resumeFromShutdown(configFactory); + + expect(resumed).toEqual([]); + }); + + it('should clear running agents list after resume', async () => { + // Manually save a running agents list + await store.saveInfo('__pool_meta__', { + agentId: '__pool_meta__', + templateId: '__pool_meta__', + createdAt: new Date().toISOString(), + runningAgents: { + agentIds: ['non-existent-agent'], + shutdownAt: new Date().toISOString(), + version: '1.0.0', + }, + } as any); + + const configFactory = (agentId: string) => ({ + agentId, + template: { systemPrompt: 'test' }, + }); + + // Resume will fail for 
non-existent agent, but should still clear the list + await pool.resumeFromShutdown(configFactory); + + // Verify the list was cleared + const savedInfo = await store.loadInfo('__pool_meta__'); + expect(savedInfo).toBeUndefined(); + }); + }); + + describe('registerShutdownHandlers', () => { + it('should register SIGTERM and SIGINT handlers', () => { + const onSpy = vi.spyOn(process, 'on'); + + pool.registerShutdownHandlers(); + + expect(onSpy).toHaveBeenCalledWith('SIGTERM', expect.any(Function)); + expect(onSpy).toHaveBeenCalledWith('SIGINT', expect.any(Function)); + + onSpy.mockRestore(); + }); + }); +});
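For reviewers: the wait-then-interrupt behaviour exercised by the timeout test above boils down to the polling loop inside `waitForAgentReady`. Here is a minimal standalone sketch of that loop, outside the patch. `AgentLike` is a hypothetical stand-in for the SDK's `Agent` interface, and the poll interval is shortened from the production 100 ms so the demo runs quickly.

```typescript
// Hypothetical minimal view of the Agent surface the loop needs.
interface AgentLike {
  status(): Promise<{ state: string }>;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll the agent until it leaves WORKING, or give up after `timeout` ms.
async function waitForAgentReady(
  agent: AgentLike,
  timeout: number,
  pollInterval = 10
): Promise<boolean> {
  const start = Date.now();
  while (Date.now() - start < timeout) {
    const { state } = await agent.status();
    if (state !== 'WORKING') return true; // agent settled; safe to shut down
    await sleep(pollInterval);
  }
  return false; // still WORKING at the deadline; caller may interrupt
}

// Demo: one agent that finishes after three polls, one that never does.
async function demo(): Promise<[boolean, boolean]> {
  let calls = 0;
  const finishing: AgentLike = {
    status: async () => ({ state: ++calls >= 3 ? 'READY' : 'WORKING' }),
  };
  const stuck: AgentLike = { status: async () => ({ state: 'WORKING' }) };
  return [await waitForAgentReady(finishing, 1000), await waitForAgentReady(stuck, 50)];
}

demo().then(([ok, timedOut]) => console.log(ok, timedOut)); // prints "true false"
```

Polling keeps the helper decoupled from the event bus; a subscription-based wait would react faster but would tie shutdown to event delivery, which is exactly what may be winding down at that point.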