diff --git a/README.md b/README.md index 4e9b8d1..1e07094 100644 --- a/README.md +++ b/README.md @@ -1,499 +1,284 @@ -# KODE SDK · Event-Driven Agent Runtime +# KODE SDK -> **就像和资深同事协作**:发消息、批示、打断、分叉、续上 —— 一套最小而够用的 API,驱动长期在线的多 Agent 系统。 - -## Why KODE - -- **Event-First**:UI 只订阅 Progress(文本/工具流);审批与治理走 Control & Monitor 回调,默认不推噪音事件。 -- **长时运行 + 可分叉**:七段断点恢复(READY → POST_TOOL),Safe-Fork-Point 天然存在于工具结果与纯文本处,一键 fork 继续。 -- **同事式协作心智**:Todo 管理、提醒、Tool 手册自动注入,工具并发可限流,默认配置即安全可用。 -- **高性能且可审计**:统一 WAL、零拷贝文本流、工具拒绝必落审计、Monitor 事件覆盖 token 用量、错误与文件变更。 -- **可扩展生态**:原生接入 MCP 工具、Sandbox 驱动、模型 Provider、Store 后端、Scheduler DSL,支持企业级自定义。 - -## 60 秒上手:跑通第一个“协作收件箱” - -```bash -npm install @shareai-lab/kode-sdk -export ANTHROPIC_API_KEY=sk-... # 或 ANTHROPIC_API_TOKEN -export ANTHROPIC_BASE_URL=https://... # 可选,默认为官方 API -export ANTHROPIC_MODEL_ID=claude-sonnet-4.5-20250929 # 可选 - -npm run example:agent-inbox -``` - -输出中你会看到: - -- Progress 渠道实时流式文本 / 工具生命周期事件 -- Control 渠道的审批请求(示例中默认拒绝 `bash_run`) -- Monitor 渠道的工具审计日志(耗时、审批结果、错误) - -想自定义行为?修改 `examples/01-agent-inbox.ts` 内的模板、工具与事件订阅即可。 - -## 示例游乐场 - -| 示例 | 用例 | 涵盖能力 | -| --- | --- | --- | -| `npm run example:getting-started` | 最小对话循环 | Progress 流订阅、Anthropic 模型直连 | -| `npm run example:agent-inbox` | 事件驱动收件箱 | Todo 管理、工具并发、Monitor 审计 | -| `npm run example:approval` | 工具审批工作流 | Control 回调、Hook 策略、自动拒绝/放行 | -| `npm run example:room` | 多 Agent 协作 | AgentPool、Room 消息、Safe Fork、Lineage | -| `npm run example:scheduler` | 长时运行 & 提醒 | Scheduler 步数触发、系统提醒、FilePool 监控 | -| `npm run example:nextjs` | API + SSE | Resume-or-create、Progress 流推送(无需安装 Next) | - -每个示例都位于 `examples/` 下,对应 README 中的学习路径,展示事件驱动、审批、安全、调度、协作等核心能力的组合方式。 - -## 构建属于你的协作型 Agent - -1. **理解三通道心智**:详见 [`docs/events.md`](./docs/events.md)。 -2. **跟着 Quickstart 实战**:[`docs/quickstart.md`](./docs/quickstart.md) 从 “依赖注入 → Resume → SSE” 手把手搭建服务。 -3. **扩展用例**:[`docs/playbooks.md`](./docs/playbooks.md) 涵盖审批治理、多 Agent 小组、调度提醒等典型场景。 -4. 
**查阅 API**:[`docs/api.md`](./docs/api.md) 枚举 `Agent`、`EventBus`、`ToolRegistry` 等核心类型与事件。 -5. **深挖能力**:Todo、ContextManager、Scheduler、Sandbox、Hook、Tool 定义详见 `docs/` 目录。 - -## 基础设计一图流 +> **Stateful Agent Runtime Kernel** - The engine that powers your AI agents with persistence, recovery, and trajectory exploration. ``` -Client/UI ── subscribe(['progress']) ──┐ -Approval service ── Control 回调 ─────┼▶ EventBus(三通道) -Observability ── Monitor 事件 ────────┘ - - │ - ▼ - MessageQueue → ContextManager → ToolRunner - │ │ │ - ▼ ▼ ▼ - Store (WAL) FilePool PermissionManager + +------------------+ + | Your App | CLI / Desktop / IDE / Server + +--------+---------+ + | + +--------v---------+ + | KODE SDK | Agent Runtime Kernel + | +-----------+ | + | | Agent | | Lifecycle + State + Events + | +-----------+ | + | | Store | | Persistence (Pluggable) + | +-----------+ | + | | Sandbox | | Execution Isolation + | +-----------+ | + +------------------+ ``` -## Provider Adapter Pattern - -KODE SDK uses **Anthropic-style messages as the internal canonical format**. All model providers are implemented as adapters that convert to/from this internal format, enabling seamless switching between providers while maintaining a consistent message structure. 
- -### Internal Message Format +--- -```typescript -interface Message { - role: 'user' | 'assistant' | 'system'; - content: ContentBlock[]; - metadata?: { - content_blocks?: ContentBlock[]; - transport?: 'provider' | 'text' | 'omit'; - }; -} - -type ContentBlock = - | { type: 'text'; text: string } - | { type: 'reasoning'; reasoning: string; meta?: { signature?: string } } - | { type: 'image'; base64?: string; url?: string; mime_type?: string; file_id?: string } - | { type: 'audio'; base64?: string; url?: string; mime_type?: string } - | { type: 'file'; base64?: string; url?: string; filename?: string; mime_type?: string; file_id?: string } - | { type: 'tool_use'; id: string; name: string; input: any; meta?: Record } - | { type: 'tool_result'; tool_use_id: string; content: any; is_error?: boolean }; -``` +## What is KODE SDK? -### Message Flow +KODE SDK is an **Agent Runtime Kernel** - think of it like V8 for JavaScript, but for AI agents. It handles the complex lifecycle management so you can focus on building your agent's capabilities. 
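What "handling the lifecycle" means in practice can be sketched without the SDK. The snippet below is a plain-TypeScript illustration of the write-ahead-log pattern the kernel relies on for crash recovery; it is not the SDK's internal code, and every name in it is invented for the sketch.

```typescript
// Illustrative WAL pattern (not SDK internals): append to the log first,
// update the main record second, clean the log last.
type Snapshot = { step: number; messages: string[] };

class WalStore {
  private wal: Snapshot[] = [];          // append-only log, written first
  private main: Snapshot | null = null;  // main record, updated second

  save(next: Snapshot, crashBeforeMain = false): void {
    this.wal.push(next);                 // 1. append to WAL (fast, durable)
    if (crashBeforeMain) return;         // simulated crash between the writes
    this.main = next;                    // 2. update main record
    this.wal = [];                       // 3. cleanup WAL
  }

  recover(): Snapshot | null {
    // A surviving WAL entry is always newer than the main record.
    const pending = this.wal[this.wal.length - 1];
    return pending ?? this.main;
  }
}

const store = new WalStore();
store.save({ step: 1, messages: ['hello'] });
store.save({ step: 2, messages: ['hello', 'world'] }, /* crash */ true);

console.log(store.recover()?.step); // 2: the state written only to the WAL survives
```

If the process dies between the WAL append and the main-record update, `recover()` still returns the newest state, which is the guarantee the kernel's breakpoint recovery builds on.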
-``` -Internal Message[] (Anthropic-style ContentBlocks) - -> Provider.formatMessages() -> External API format - -> API call - -> Response -> normalizeContent() -> Internal ContentBlock[] -``` +**Core Capabilities:** +- **Crash Recovery**: WAL-protected persistence with 7-stage breakpoint recovery +- **Fork & Resume**: Explore different agent trajectories from any checkpoint +- **Event Streams**: Progress/Control/Monitor channels for real-time UI updates +- **Tool Governance**: Permission system, approval workflows, audit trails -### Supported Providers - -| Provider | API | Thinking Support | Files | Streaming | -|----------|-----|------------------|-------|-----------| -| Anthropic | Messages API | Extended Thinking (`interleaved-thinking-2025-05-14`) | Files API | SSE | -| OpenAI | Chat Completions | `` tags | - | SSE | -| OpenAI | Responses API | `reasoning_effort` (store, previous_response_id) | File uploads | SSE | -| Gemini | Generate Content | `thinkingLevel` (3.x) | GCS URIs | SSE | -| DeepSeek | Chat Completions | `reasoning_content` (auto-strip from history) | - | SSE | -| Qwen | Chat Completions | `thinking_budget` | - | SSE | -| GLM | Chat Completions | `thinking.type: enabled` | - | SSE | -| Minimax | Chat Completions | `reasoning_split` | - | SSE | -| Kimi K2 | Chat Completions | `reasoning` field | - | SSE | -| Groq/Cerebras | Chat Completions | OpenAI-compatible | - | SSE | +**What KODE SDK is NOT:** +- Not a cloud platform (you deploy it) +- Not an HTTP server (you add that layer) +- Not a multi-tenant SaaS framework (you build that on top) --- -## Provider Message Conversion Details +## When to Use KODE SDK -### Anthropic Provider +### Perfect Fit (Use directly) -**API Format**: Anthropic Messages API with extended thinking beta +| Scenario | Why It Works | +|----------|--------------| +| **CLI Agent Tools** | Single process, local filesystem, zero config | +| **Desktop Apps** (Electron/Tauri) | Full system access, long-running process | +| **IDE 
Plugins** (VSCode/JetBrains) | Single user, workspace integration | +| **Local Development** | Fast iteration, instant persistence | -```typescript -// Internal -> Anthropic -{ - role: 'user' | 'assistant', - content: [ - { type: 'text', text: '...' }, - { type: 'thinking', thinking: '...', signature?: '...' }, // Extended thinking - { type: 'image', source: { type: 'base64', media_type: '...', data: '...' } }, - { type: 'document', source: { type: 'file', file_id: '...' } }, // Files API - { type: 'tool_use', id: '...', name: '...', input: {...} }, - { type: 'tool_result', tool_use_id: '...', content: '...' } - ] -} -``` +### Good Fit (With architecture) -**Extended Thinking Configuration**: -```typescript -const provider = new AnthropicProvider(apiKey, model, baseUrl, proxyUrl, { - reasoningTransport: 'provider', // 'provider' | 'text' | 'omit' - thinking: { - enabled: true, - budgetTokens: 10000 // Maps to thinking.budget_tokens - } -}); -``` +| Scenario | What You Need | +|----------|---------------| +| **Self-hosted Server** | Add HTTP layer (Express/Fastify/Hono) | +| **Small-scale Backend** (<1K users) | Implement PostgresStore, add user isolation | +| **Kubernetes Deployment** | Implement distributed Store + locks | -**Beta Headers**: Automatically added based on message content: -- `interleaved-thinking-2025-05-14`: When `reasoningTransport === 'provider'` -- `files-api-2025-04-14`: When messages contain file blocks with `file_id` +### Needs Custom Architecture -**Signature Preservation**: Critical for multi-turn conversations with Claude 4+. The SDK preserves thinking block signatures in `meta.signature`. 
+| Scenario | Recommended Approach | +|----------|---------------------| +| **Large-scale ToC** (10K+ users) | Worker microservice pattern (see [Architecture Guide](./docs/ARCHITECTURE.md)) | +| **Serverless** (Vercel/Cloudflare) | API layer on serverless + Worker pool for agents | +| **Multi-tenant SaaS** | Tenant isolation layer + distributed Store | ---- +### Not Designed For -### OpenAI Provider (Chat Completions API) +| Scenario | Reason | +|----------|--------| +| **Pure browser runtime** | No filesystem, no process execution | +| **Edge functions only** | Agent loops need long-running processes | +| **Stateless microservices** | Agents are inherently stateful | -**API Format**: OpenAI Chat Completions +> **Rule of Thumb**: If your agents need to run for more than a few seconds, execute tools, and remember state - KODE SDK is for you. If you just need stateless LLM calls, use the provider APIs directly. -```typescript -// Internal -> OpenAI Chat -{ - role: 'system' | 'user' | 'assistant' | 'tool', - content: '...' | [{ type: 'text', text: '...' }, { type: 'image_url', image_url: { url: '...' } }], - tool_calls?: [{ id: '...', type: 'function', function: { name: '...', arguments: '...' } }], - reasoning_content?: '...', // DeepSeek/GLM/Qwen/Kimi - reasoning_details?: [{ text: '...' }] // Minimax -} -``` +--- + +## 60-Second Quick Start -**Configuration-Driven Reasoning** (NEW in v2.7): +```bash +npm install @anthropic/kode-sdk -The OpenAI provider now uses a configuration-driven approach instead of hardcoded provider detection: +# Set your API key +export ANTHROPIC_API_KEY=sk-... 
-```typescript -// ReasoningConfig interface -interface ReasoningConfig { - fieldName?: 'reasoning_content' | 'reasoning_details'; // Response field name - requestParams?: Record; // Enable reasoning in request - stripFromHistory?: boolean; // DeepSeek requirement -} +# Run the example +npx ts-node examples/getting-started.ts ``` -**Provider-Specific Configuration Examples**: - ```typescript -// DeepSeek (must strip reasoning from history) -const deepseekProvider = new OpenAIProvider(apiKey, 'deepseek-reasoner', baseUrl, undefined, { - reasoningTransport: 'provider', - reasoning: { - fieldName: 'reasoning_content', - stripFromHistory: true, // Critical: DeepSeek returns 400 if reasoning in history +import { Agent, AnthropicProvider, LocalSandbox } from '@anthropic/kode-sdk'; + +const agent = await Agent.create({ + agentId: 'my-first-agent', + template: { systemPrompt: 'You are a helpful assistant.' }, + deps: { + modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!), + sandbox: new LocalSandbox({ workDir: './workspace' }), }, }); -// GLM (thinking.type parameter) -const glmProvider = new OpenAIProvider(apiKey, 'glm-4.7', baseUrl, undefined, { - reasoningTransport: 'provider', - reasoning: { - fieldName: 'reasoning_content', - requestParams: { thinking: { type: 'enabled', clear_thinking: false } }, - }, -}); - -// Minimax (reasoning_split parameter, reasoning_details field) -const minimaxProvider = new OpenAIProvider(apiKey, 'minimax-moe-01', baseUrl, undefined, { - reasoningTransport: 'provider', - reasoning: { - fieldName: 'reasoning_details', - requestParams: { reasoning_split: true }, - }, +// Subscribe to events +agent.subscribeProgress({ kinds: ['text_chunk'] }, (event) => { + process.stdout.write(event.text); }); -// Qwen (enable_thinking parameter) -const qwenProvider = new OpenAIProvider(apiKey, 'qwen3-max', baseUrl, undefined, { - reasoningTransport: 'provider', - reasoning: { - fieldName: 'reasoning_content', - requestParams: { 
enable_thinking: true, thinking_budget: 10000 }, - stripFromHistory: true, // Similar to DeepSeek - }, -}); - -// Kimi K2 (reasoning parameter) -const kimiProvider = new OpenAIProvider(apiKey, 'kimi-k2-thinking', baseUrl, undefined, { - reasoningTransport: 'provider', - reasoning: { - fieldName: 'reasoning_content', - requestParams: { reasoning: 'enabled' }, - }, -}); +// Chat with the agent +await agent.chat('Hello! What can you help me with?'); ``` -**Tool Message Conversion**: -```typescript -// Internal tool_result -> OpenAI tool message -{ role: 'tool', tool_call_id: '...', content: '...', name: '...' } -``` - -**Image Handling**: -- URL images: `{ type: 'image_url', image_url: { url: 'https://...' } }` -- Base64 images: `{ type: 'image_url', image_url: { url: 'data:image/png;base64,...' } }` - -**Reasoning Transport**: -- `text`: Reasoning wrapped as `...` in text content -- `provider`: Uses provider-specific fields (configured via `reasoning.fieldName`) - --- -### OpenAI Provider (Responses API) +## Core Concepts -**API Format**: OpenAI Responses API (GPT-5.x reasoning models) +### 1. Three-Channel Event System -```typescript -// Internal -> OpenAI Responses -{ - model: 'gpt-5.2', - input: [ - { role: 'user', content: [{ type: 'input_text', text: '...' }] }, - { role: 'assistant', content: [{ type: 'output_text', text: '...' }] } - ], - reasoning?: { effort: 'medium' }, // none | minimal | low | medium | high | xhigh - store?: true, // Enable state persistence - previous_response_id?: 'resp_...' // Multi-turn continuation -} ``` - -**File Handling**: -```typescript -// File with ID -{ type: 'input_file', file_id: '...' } -// File with URL -{ type: 'input_file', file_url: '...' } -// File with base64 -{ type: 'input_file', filename: '...', file_data: 'data:application/pdf;base64,...' 
} ++-------------+ +-------------+ +-------------+ +| Progress | | Control | | Monitor | ++-------------+ +-------------+ +-------------+ +| text_chunk | | permission | | tool_audit | +| tool:start | | _required | | state_change| +| tool:complete| | approval | | token_usage | +| done | | _response | | error | ++-------------+ +-------------+ +-------------+ + | | | + v v v + Your UI Approval Service Observability ``` -**Configuration** (NEW in v2.7): -```typescript -const provider = new OpenAIProvider(apiKey, 'gpt-5.2', baseUrl, undefined, { - api: 'responses', // Use Responses API instead of Chat Completions - responses: { - reasoning: { effort: 'high' }, // Reasoning effort level - store: true, // Enable response storage for continuation - previousResponseId: 'resp_abc123', // Resume from previous response - }, -}); +### 2. Crash Recovery & Breakpoints + ``` +Agent Execution Flow: -**Multi-Turn Conversation Flow**: -```typescript -// First request -const response1 = await provider.complete(messages); -const responseId = response1.metadata?.responseId; // Store for continuation + READY -> PRE_MODEL -> STREAMING -> TOOL_PENDING -> PRE_TOOL -> EXECUTING -> POST_TOOL + | | | | | | | + +-------- WAL Protected State -------+-- Approval --+---- Tool Execution ---+ -// Second request (continues from first) -provider.configure({ - responses: { ...provider.getConfig().responses, previousResponseId: responseId } -}); -const response2 = await provider.complete(newMessages); +On crash: Resume from last safe breakpoint, auto-seal incomplete tool calls ``` ---- +### 3. 
Fork & Trajectory Exploration -### Gemini Provider +```typescript +// Create a checkpoint at current state +const checkpointId = await agent.checkpoint('before-decision'); -**API Format**: Gemini Generate Content API +// Fork to explore different paths +const explorerA = await agent.fork(checkpointId); +const explorerB = await agent.fork(checkpointId); -```typescript -// Internal -> Gemini -{ - contents: [ - { - role: 'user' | 'model', - parts: [ - { text: '...' }, - { inline_data: { mime_type: '...', data: '...' } }, // Base64 images/files - { file_data: { mime_type: '...', file_uri: 'gs://...' } }, // GCS files - { functionCall: { name: '...', args: {...} } }, - { functionResponse: { name: '...', response: { content: '...' } } } - ] - } - ], - systemInstruction?: { parts: [{ text: '...' }] }, - tools?: [{ functionDeclarations: [...] }], - generationConfig?: { - temperature: 0.7, - maxOutputTokens: 4096, - thinkingConfig: { thinkingLevel: 'HIGH' } // Gemini 3.x - } -} +await explorerA.chat('Try approach A'); +await explorerB.chat('Try approach B'); ``` -**Role Mapping**: -- `assistant` -> `model` -- `user` -> `user` -- `system` -> `systemInstruction` +--- -**Thinking Configuration** (Gemini 3.x): -```typescript -const provider = new GeminiProvider(apiKey, model, baseUrl, proxyUrl, { - thinking: { - level: 'high' // minimal | low | medium | high -> MINIMAL | LOW | MEDIUM | HIGH - } -}); -``` +## Examples -**Tool Schema Sanitization**: Gemini requires clean JSON Schema without: -- `additionalProperties` -- `$schema` -- `$defs` -- `definitions` +| Example | Description | Key Features | +|---------|-------------|--------------| +| `npm run example:getting-started` | Minimal chat loop | Progress stream, basic setup | +| `npm run example:agent-inbox` | Event-driven inbox | Todo management, tool concurrency | +| `npm run example:approval` | Approval workflow | Control channel, hooks, policies | +| `npm run example:room` | Multi-agent collaboration | AgentPool, Room, Fork 
| +| `npm run example:scheduler` | Long-running with reminders | Scheduler, step triggers | +| `npm run example:nextjs` | Next.js API integration | Resume-or-create, SSE streaming | --- -### High-Speed Inference Providers (Groq/Cerebras) +## Architecture for Scale -Both Groq and Cerebras provide OpenAI-compatible APIs with extremely fast inference speeds: +For production deployments serving many users, we recommend the **Worker Microservice Pattern**: -**Groq** (LPU Inference Engine): -```typescript -const groqProvider = new OpenAIProvider(apiKey, 'llama-3.3-70b-versatile', undefined, undefined, { - baseUrl: 'https://api.groq.com/openai/v1', // Auto-appends /v1 - // Reasoning models (Qwen 3 32B, QwQ-32B) - reasoning: { - fieldName: 'reasoning_content', - requestParams: { reasoning_format: 'parsed', reasoning_effort: 'default' }, - }, -}); -// Speed: ~276 tokens/second for Llama 3.3 70B ``` - -**Cerebras** (Wafer-Scale Engine): -```typescript -const cerebrasProvider = new OpenAIProvider(apiKey, 'qwen-3-32b', undefined, undefined, { - baseUrl: 'https://api.cerebras.ai/v1', - reasoning: { - fieldName: 'reasoning_content', - requestParams: { reasoning_format: 'separate' }, - }, -}); -// Speed: ~2,600 tokens/second for Qwen3 32B + +------------------+ + | Frontend | Next.js / SvelteKit (Vercel OK) + +--------+---------+ + | + +--------v---------+ + | API Gateway | Auth, routing, queue push + +--------+---------+ + | + +--------v---------+ + | Message Queue | Redis / SQS / NATS + +--------+---------+ + | + +--------------------+--------------------+ + | | | + +--------v-------+ +--------v-------+ +--------v-------+ + | Worker 1 | | Worker 2 | | Worker N | + | (KODE SDK) | | (KODE SDK) | | (KODE SDK) | + | Long-running | | Long-running | | Long-running | + +--------+-------+ +--------+-------+ +--------+-------+ + | | | + +--------------------+--------------------+ + | + +--------v---------+ + | Distributed Store| PostgreSQL / Redis + +------------------+ ``` -**Key 
Features**: -- OpenAI-compatible Chat Completions API -- Tool calling with `strict` mode (Cerebras) -- Streaming support with SSE -- Rate limits: 50 RPM (Cerebras), higher for Groq paid tiers - ---- - -## ReasoningTransport Options +**Key Principles:** +1. **API layer is stateless** - Can run on serverless +2. **Workers are stateful** - Run KODE SDK, need long-running processes +3. **Store is shared** - Single source of truth for agent state +4. **Queue decouples** - Request handling from agent execution -Controls how thinking/reasoning content is handled across providers: - -| Transport | Description | Use Case | -|-----------|-------------|----------| -| `provider` | Native provider format (Anthropic thinking blocks, OpenAI reasoning tokens) | Full thinking visibility, multi-turn conversations | -| `text` | Wrapped in `...` tags as text | Cross-provider compatibility, text-based pipelines | -| `omit` | Excluded from message history | Privacy, token reduction | +See [docs/ARCHITECTURE.md](./docs/ARCHITECTURE.md) for detailed deployment guides. 
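The four principles above can be condensed into a runnable toy. This in-process sketch stands in for the real components (the array for Redis/SQS, the Map for the agent state a worker owns); none of it is production code or SDK API.

```typescript
// Illustrative only: an in-process stand-in for the queue + worker split.
// In production, enqueue() lives in the stateless API layer and the worker
// loop runs in a separate long-lived process holding KODE SDK agents.
type Task = { agentId: string; message: string };

const queue: Task[] = [];                       // stands in for Redis / SQS / NATS
const transcript = new Map<string, string[]>(); // stands in for agent state

function enqueue(task: Task): void {
  queue.push(task);                             // API layer: validate, push, return
}

function workerTick(): void {
  const task = queue.shift();                   // worker: pull next task
  if (!task) return;
  const history = transcript.get(task.agentId) ?? [];
  history.push(task.message);                   // worker owns the stateful part
  transcript.set(task.agentId, history);
}

enqueue({ agentId: 'a1', message: 'hello' });
enqueue({ agentId: 'a1', message: 'world' });
while (queue.length > 0) workerTick();

console.log(transcript.get('a1')); // [ 'hello', 'world' ]
```

The point of the shape: `enqueue` can run anywhere, including serverless, while `workerTick` must run in the long-lived worker process that owns the agents.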
--- -## Prompt Caching +## Documentation -The SDK supports prompt caching across multiple providers for significant cost savings: +| Document | Description | +|----------|-------------| +| [Architecture Guide](./docs/ARCHITECTURE.md) | Mental model, deployment patterns, scaling strategies | +| [Quickstart](./docs/quickstart.md) | Step-by-step first agent | +| [Events System](./docs/events.md) | Three-channel event model | +| [API Reference](./docs/api.md) | Core types and interfaces | +| [Playbooks](./docs/playbooks.md) | Common patterns and recipes | +| [Deployment](./docs/DEPLOYMENT.md) | Production deployment guide | +| [Roadmap](./docs/ROADMAP.md) | Future development plans | -| Provider | Caching Type | Min Tokens | TTL | Savings | -|----------|--------------|------------|-----|---------| -| Anthropic | Explicit (`cache_control`) | 1024-4096 | 5m/1h | 90% | -| OpenAI | Automatic | 1024 | 24h | 75% | -| Gemini | Implicit + Explicit | 256-4096 | Custom | 75% | -| DeepSeek | Automatic prefix | 64 | Hours | 90% | -| Qwen | Explicit (`cache_control`) | 1024 | 5m | 90% | +### Scenario Guides -**Anthropic Cache Example**: -```typescript -const provider = new AnthropicProvider(apiKey, 'claude-sonnet-4.5', baseUrl, undefined, { - beta: { extendedCacheTtl: true }, // Enable 1-hour TTL - cache: { breakpoints: 4, defaultTtl: '1h' }, -}); +| Scenario | Guide | +|----------|-------| +| CLI Tools | [docs/scenarios/cli-tools.md](./docs/scenarios/cli-tools.md) | +| Desktop Apps | [docs/scenarios/desktop-apps.md](./docs/scenarios/desktop-apps.md) | +| IDE Plugins | [docs/scenarios/ide-plugins.md](./docs/scenarios/ide-plugins.md) | +| Web Backend | [docs/scenarios/web-backend.md](./docs/scenarios/web-backend.md) | +| Large-scale ToC | [docs/scenarios/large-scale-toc.md](./docs/scenarios/large-scale-toc.md) | -// Mark content for caching in messages -const messages = [{ - role: 'user', - content: [{ - type: 'text', - text: 'Large document...', - cacheControl: { type: 'ephemeral', 
ttl: '1h' } - }] -}]; -``` +--- -**Usage Tracking**: -```typescript -const response = await provider.complete(messages); -console.log(response.usage); -// { -// input_tokens: 100, -// cache_creation_input_tokens: 50000, // First request -// cache_read_input_tokens: 50000, // Subsequent requests -// output_tokens: 500 -// } -``` +## Supported Providers + +| Provider | Streaming | Tool Calling | Thinking/Reasoning | +|----------|-----------|--------------|-------------------| +| **Anthropic** | SSE | Native | Extended Thinking | +| **OpenAI** | SSE | Function Calling | o1/o3 reasoning | +| **Gemini** | SSE | Function Calling | thinkingLevel | +| **DeepSeek** | SSE | OpenAI-compatible | reasoning_content | +| **Qwen** | SSE | OpenAI-compatible | thinking_budget | +| **Groq/Cerebras** | SSE | OpenAI-compatible | - | --- -## Session Compression & Resume +## Roadmap -The SDK's context management works with all providers: +### v2.8 - Storage Foundation +- PostgresStore with connection pooling +- Distributed locking (Advisory Lock) +- Graceful shutdown support -1. **Message Windowing**: Automatically manages context window limits per provider -2. **Safe Fork Points**: Natural breakpoints at tool results and pure text responses -3. **Reasoning Preservation**: Thinking blocks can be: - - Preserved with signatures (Anthropic) - - Converted to text tags (cross-provider) - - Omitted for compression +### v3.0 - Performance +- Incremental message storage (append-only) +- Copy-on-Write fork optimization +- Event sampling and aggregation -```typescript -// Resume with thinking context -const agent = await Agent.resume(sessionId, { - provider: new AnthropicProvider(apiKey, model, baseUrl, undefined, { - reasoningTransport: 'provider' // Preserve thinking for continuation - }) -}); -``` +### v3.5 - Distributed +- Agent Scheduler with LRU caching +- Distributed EventBus (Redis Pub/Sub) +- Worker mode helpers ---- +See [docs/ROADMAP.md](./docs/ROADMAP.md) for the complete roadmap. 
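Ahead of the v3.5 scheduler, the LRU idea is easy to prototype. In this sketch, `restore` and `hibernate` are placeholders for resuming an agent from the Store and writing it back on eviction; they are illustrative, not current SDK API.

```typescript
// Illustrative LRU scheduler: keep hot agents in memory, evict cold ones.
// restore()/hibernate() stand in for Store round-trips; not real SDK API.
class LruScheduler<T> {
  private active = new Map<string, T>(); // Map preserves insertion order
  constructor(
    private capacity: number,
    private restore: (id: string) => T,
    private hibernate: (id: string, value: T) => void,
  ) {}

  get(id: string): T {
    let value = this.active.get(id);
    if (value === undefined) {
      value = this.restore(id);          // cold agent: resume from storage
    } else {
      this.active.delete(id);            // re-insert to mark most-recent
    }
    this.active.set(id, value);
    if (this.active.size > this.capacity) {
      const [oldestId, oldest] = this.active.entries().next().value!;
      this.active.delete(oldestId);
      this.hibernate(oldestId, oldest);  // evict least-recently-used
    }
    return value;
  }
}

const hibernated: string[] = [];
const sched = new LruScheduler<string>(
  2,
  (id) => `agent:${id}`,
  (id) => { hibernated.push(id); },
);

sched.get('a'); sched.get('b'); sched.get('a'); // 'a' is now most recent
sched.get('c');                                 // evicts 'b', not 'a'
console.log(hibernated); // [ 'b' ]
```

With `capacity` sized to available memory, cold agents cost one `restore` round-trip and hot agents stay pinned; touching `'a'` before inserting `'c'` is what protects it from eviction.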
-## Testing Providers +--- -Run integration tests with real API connections: +## Contributing -```bash -# Configure credentials -cp .env.test.example .env.test -# Edit .env.test with your API keys +We welcome contributions! Please see [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines. -# Run all provider tests -npm test -- --testPathPattern="multi-provider" +## License -# Run specific provider -npm test -- --testPathPattern="multi-provider" --testNamePattern="Anthropic" -``` +MIT License - see [LICENSE](./LICENSE) for details. --- -## 下一步 - -- 使用 `examples/` 作为蓝本接入你自己的工具、存储、审批系统。 -- 将 Monitor 事件接入现有 observability 平台,沉淀治理与审计能力。 -- 参考 `docs/` 中的扩展指南,为企业自定义 Sandbox、模型 Provider 或多团队 Agent 协作流程。 - -欢迎在 Issue / PR 中分享反馈与场景诉求,让 KODE SDK 更贴近真实协作团队的需求。 +**KODE SDK** - *The runtime kernel that lets you build agents that persist, recover, and explore.* diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md new file mode 100644 index 0000000..b4fd278 --- /dev/null +++ b/docs/ARCHITECTURE.md @@ -0,0 +1,572 @@ +# KODE SDK Architecture Guide + +> Deep dive into the mental model, design decisions, and deployment patterns for KODE SDK. + +--- + +## Table of Contents + +1. [Mental Model](#mental-model) +2. [Core Architecture](#core-architecture) +3. [Runtime Characteristics](#runtime-characteristics) +4. [Deployment Patterns](#deployment-patterns) +5. [Scaling Strategies](#scaling-strategies) +6. 
[Decision Framework](#decision-framework) + +--- + +## Mental Model + +### What KODE SDK Is + +``` +Think of KODE SDK like: + ++------------------+ +------------------+ +------------------+ +| V8 | | SQLite | | KODE SDK | +| JS Runtime | | Database Engine | | Agent Runtime | ++------------------+ +------------------+ +------------------+ + | | | + v v v ++------------------+ +------------------+ +------------------+ +| Express.js | | Prisma | | Your App | +| Web Framework | | ORM | | (CLI/Desktop/Web)| ++------------------+ +------------------+ +------------------+ + | | | + v v v ++------------------+ +------------------+ +------------------+ +| Vercel | | PlanetScale | | Your Infra | +| Cloud Platform | | Cloud Database | | (K8s/EC2/Local) | ++------------------+ +------------------+ +------------------+ +``` + +**KODE SDK is an engine, not a platform.** + +It provides: +- Agent lifecycle management (create, run, pause, resume, fork) +- State persistence (via pluggable Store interface) +- Tool execution and permission governance +- Event streams for observability + +It does NOT provide: +- HTTP routing or API framework +- User authentication or authorization +- Multi-tenancy or resource isolation +- Horizontal scaling or load balancing + +### The Single Responsibility + +``` + KODE SDK's Job + | + v + +----------------------------------------------+ + | | + | "Keep this agent running, recover from | + | crashes, let it fork, and tell me | + | what's happening via events." | + | | + +----------------------------------------------+ + | + v + Your App's Job + | + v + +----------------------------------------------+ + | | + | "Handle users, route requests, manage | + | permissions, scale infrastructure, | + | and integrate with my systems." 
| + | | + +----------------------------------------------+ +``` + +--- + +## Core Architecture + +### Component Overview + +``` ++------------------------------------------------------------------+ +| Agent Instance | ++------------------------------------------------------------------+ +| | +| +------------------+ +------------------+ +------------------+ | +| | MessageQueue | | ContextManager | | ToolRunner | | +| | (User inputs) | | (Token mgmt) | | (Parallel exec) | | +| +--------+---------+ +--------+---------+ +--------+---------+ | +| | | | | +| +---------------------+---------------------+ | +| | | +| +------------v------------+ | +| | BreakpointManager | | +| | (7-stage state track) | | +| +------------+------------+ | +| | | +| +------------------+ +--------v---------+ +------------------+ | +| | PermissionManager| | EventBus | | TodoManager | | +| | (Approval flow) | | (3-channel emit) | | (Task tracking) | | +| +------------------+ +------------------+ +------------------+ | +| | ++----------------------------------+--------------------------------+ + | + +--------------+--------------+ + | | | + +--------v------+ +----v----+ +-------v-------+ + | Store | | Sandbox | | ModelProvider | + | (Persistence) | | (Exec) | | (LLM calls) | + +---------------+ +---------+ +---------------+ +``` + +### Data Flow + +``` +User Message + | + v ++----+----+ +-----------+ +------------+ +| Message |---->| Context |---->| Model | +| Queue | | Manager | | Provider | ++---------+ +-----------+ +-----+------+ + | + +---------+---------+ + | | + Text Response Tool Calls + | | + v v + +---------+------+ +------+-------+ + | EventBus | | ToolRunner | + | (text_chunk) | | (parallel) | + +----------------+ +------+-------+ + | + +------------------+------------------+ + | | | + Permission Execution Result + Check (Sandbox) Handling + | | | + v v v + +--------------------+ +---------+ +------------------+ + | PermissionManager | | Sandbox | | EventBus | + | (Control channel) | | 
(exec) | | (tool:complete) | + +--------------------+ +---------+ +------------------+ +``` + +### State Persistence (WAL) + +``` +Every State Change + | + v ++-------+-------+ +| Write-Ahead | +| Log | <-- Write first (fast, append-only) ++-------+-------+ + | + v ++-------+-------+ +| Main File | <-- Then update (can be slow) ++-------+-------+ + | + v ++-------+-------+ +| Delete WAL | <-- Finally cleanup ++-------+-------+ + +On Crash Recovery: +1. Scan for WAL files +2. If WAL exists but main file incomplete -> Restore from WAL +3. Delete WAL after successful restore +``` + +--- + +## Runtime Characteristics + +### Memory Model + +``` +Agent Memory Footprint (Typical): + ++---------------------------+ +| Agent Instance | ++---------------------------+ +| messages[]: 10KB - 2MB | <-- Grows with conversation +| toolRecords: 1KB - 100KB | <-- Grows with tool usage +| eventTimeline: 5KB - 500KB| <-- Recent events cached +| mediaCache: 0 - 10MB | <-- If images/files involved +| baseObjects: ~50KB | <-- Fixed overhead ++---------------------------+ + +Typical range: 100KB - 5MB per agent +AgentPool (50 agents): 5MB - 250MB +``` + +### I/O Patterns + +``` +Per Agent Step: + ++-------------------+ +-------------------+ +-------------------+ +| persistMessages() | | persistToolRecs() | | emitEvents() | +| ~20-50ms (SSD) | | ~5-10ms | | ~1-5ms (buffered) | ++-------------------+ +-------------------+ +-------------------+ + +Total per step: 30-70ms I/O overhead + +At Scale (100 concurrent agents): +- Sequential bottleneck in JSONStore +- Need distributed Store for parallel writes +``` + +### Event Loop Impact + +``` +Agent Processing: + + +---------+ + | IDLE | <-- Agent waiting for input + +----+----+ + | + +----v----+ + | PROCESS | <-- Model call (async, non-blocking) + +----+----+ + | + +----v----+ + | TOOL | <-- Tool execution (may block if sync) + +----+----+ + | + +----v----+ + | PERSIST | <-- File I/O (async) + +----+----+ + | + v + +---------+ + | IDLE | + 
+---------+ + +Key: All heavy operations are async +Risk: Sync operations in custom tools can block event loop +``` + +--- + +## Deployment Patterns + +### Pattern 1: Single Process (CLI/Desktop) + +``` ++------------------------------------------------------------------+ +| Your Application | ++------------------------------------------------------------------+ +| | +| +------------------+ | +| | KODE SDK | | +| | +----------+ | | +| | | Agent(s) | | | +| | +----------+ | | +| | | JSONStore| | --> Local filesystem | +| | +----------+ | | +| +------------------+ | +| | ++------------------------------------------------------------------+ + +Best for: CLI tools, Electron apps, VSCode extensions +Agents: 1-50 concurrent +Users: Single user +Persistence: Local files +``` + +### Pattern 2: Single Server (Self-hosted) + +``` ++------------------------------------------------------------------+ +| Server | ++------------------------------------------------------------------+ +| | +| +------------------+ +------------------+ | +| | HTTP Layer | | KODE SDK | | +| | (Express/etc) |---->| AgentPool | | +| +------------------+ +------------------+ | +| | | +| +------v------+ | +| | JSONStore | --> Local filesystem | +| +-------------+ | +| | ++------------------------------------------------------------------+ + +Best for: Internal tools, small teams, prototypes +Agents: 10-100 concurrent +Users: <100 concurrent +Persistence: Local files (can use Redis/Postgres) +``` + +### Pattern 3: Worker Microservice (Scalable) + +``` ++------------------------------------------------------------------+ +| Load Balancer | ++----------------------------------+--------------------------------+ + | + +-------------------------+-------------------------+ + | | | ++--------v--------+ +----------v--------+ +----------v------+ +| API Server 1 | | API Server 2 | | API Server N | +| (Stateless) | | (Stateless) | | (Stateless) | ++--------+--------+ +----------+--------+ +----------+------+ + | | | 
+ +-------------------------+-------------------------+ + | + +--------v--------+ + | Message Queue | + | (Redis/SQS) | + +--------+--------+ + | + +-------------------------+-------------------------+ + | | | ++--------v--------+ +----------v--------+ +----------v------+ +| Worker 1 | | Worker 2 | | Worker N | +| +----------+ | | +----------+ | | +----------+ | +| | KODE SDK | | | | KODE SDK | | | | KODE SDK | | +| | AgentPool| | | | AgentPool| | | | AgentPool| | +| +----------+ | | +----------+ | | +----------+ | ++--------+--------+ +----------+--------+ +----------+------+ + | | | + +-------------------------+-------------------------+ + | + +--------v--------+ + | Distributed | + | Store | + | (PostgreSQL) | + +-----------------+ + +Best for: Production ToC apps, SaaS platforms +Agents: 1000+ concurrent +Users: 10K+ concurrent +Persistence: PostgreSQL/Redis with distributed locks +``` + +### Pattern 4: Hybrid Serverless (API + Workers) + +``` ++------------------------------------------------------------------+ +| Serverless Platform (Vercel) | ++------------------------------------------------------------------+ +| | +| +------------------+ | +| | /api/chat | --> Validate, enqueue, return task ID | +| +------------------+ | +| | /api/status | --> Check task status from DB | +| +------------------+ | +| | /api/stream | --> SSE from Redis Pub/Sub | +| +------------------+ | +| | ++----------------------------------+--------------------------------+ + | + +--------v--------+ + | Message Queue | + | (Upstash Redis)| + +--------+--------+ + | ++----------------------------------v--------------------------------+ +| Worker Platform (Railway/Render) | ++------------------------------------------------------------------+ +| | +| +------------------+ | +| | Worker Pool | | +| | +----------+ | | +| | | KODE SDK | | | +| | | Agents | | | +| | +----------+ | | +| +------------------+ | +| | ++------------------------------------------------------------------+ + +Best 
for: Serverless frontend + stateful backend
+API: Serverless (fast, scalable, cheap)
+Agents: Long-running workers (Railway, Render, Fly.io)
+```
+
+---
+
+## Scaling Strategies
+
+### Strategy 1: Vertical Scaling (Single Node)
+
+```
+Applicable: Up to ~100 concurrent agents
+
+Optimizations:
+1. Increase AgentPool maxAgents
+2. Use Redis for Store (faster than files)
+3. Add memory (agents are memory-bound)
+4. Use SSD for persistence
+
+const pool = new AgentPool({
+  maxAgents: 100,  // Increase from default 50
+  store: new RedisStore({ ... }),
+});
+```
+
+### Strategy 2: Agent Sharding (Multi-Node)
+
+```
+Applicable: 100-1000 concurrent agents
+
+Architecture:
+- Hash agentId to determine which worker handles it
+- Consistent hashing for minimal reshuffling
+- Each worker owns a shard of agents
+
+  agentId: "user-123-agent-456"
+       |
+       v
+  hash(agentId) % N = worker_index
+       |
+       +---------------+---------------+
+       |               |               |
+   Worker 0        Worker 1        Worker 2
+  (agents 0-33)   (agents 34-66)  (agents 67-99)
+```
+
+### Strategy 3: Agent Scheduling (LRU)
+
+```
+Applicable: 1000+ total agents, limited active
+
+Concept:
+- Not all agents are active simultaneously
+- Keep hot agents in memory
+- Hibernate cold agents to storage
+- Resume on demand
+
+class AgentScheduler {
+  private active: LRUCache<string, Agent>;  // In memory
+  private hibernated: Set<string>;          // In storage
+
+  async get(agentId: string): Promise<Agent> {
+    // Check active cache
+    if (this.active.has(agentId)) {
+      return this.active.get(agentId);
+    }
+
+    // Resume from storage
+    const agent = await Agent.resume(agentId, config, deps);
+    this.active.set(agentId, agent);
+
+    // LRU eviction handles hibernation
+    return agent;
+  }
+}
+```
+
+### Strategy 4: Fork Optimization (COW)
+
+```
+Applicable: Heavy fork usage (exploration scenarios)
+
+Current: O(n) deep copy of messages
+Optimized: O(1) copy-on-write
+
+Before:
+  fork() {
+    const forked = JSON.parse(JSON.stringify(messages));  // O(n)
+  }
+
+After:
+  fork() {
+    const forkedHead = 
currentHead; // O(1) pointer copy + // Messages are immutable, share until modified + } +``` + +--- + +## Decision Framework + +### When to Use KODE SDK Directly + +``` ++------------------+ +| Decision Tree | ++------------------+ + | + v ++------------------+ +| Single user/ |----YES---> Use directly (Pattern 1) +| local machine? | ++--------+---------+ + | NO + v ++------------------+ +| < 100 concurrent |----YES---> Single server (Pattern 2) +| users? | ++--------+---------+ + | NO + v ++------------------+ +| Can run long- |----YES---> Worker microservice (Pattern 3) +| running processes?| ++--------+---------+ + | NO + v ++------------------+ +| Serverless only? |----YES---> Hybrid pattern (Pattern 4) ++--------+---------+ + | NO + v ++------------------+ +| Consider other | +| solutions | ++------------------+ +``` + +### Platform Compatibility Matrix + +| Platform | Compatible | Notes | +|----------|------------|-------| +| Node.js | 100% | Primary target | +| Bun | 95% | Minor adjustments needed | +| Deno | 80% | Permission flags required | +| Electron | 90% | Use in main process | +| VSCode Extension | 85% | workspace.fs integration | +| Vercel Functions | 20% | API layer only, not agents | +| Cloudflare Workers | 5% | Not compatible | +| Browser | 10% | No fs/process, very limited | + +### Store Selection Guide + +| Store | Use Case | Throughput | Scaling | +|-------|----------|------------|---------| +| JSONStore | Development, CLI | Low | Single node | +| SQLiteStore | Desktop apps | Medium | Single node | +| RedisStore | Small-medium production | High | Single node | +| PostgresStore | Production, multi-node | High | Multi-node | + +--- + +## Summary + +### Core Principles + +1. **KODE SDK is a runtime kernel** - It manages agent lifecycle, not application infrastructure + +2. **Agents are stateful** - They need persistent storage and long-running processes + +3. **Scale through architecture** - Use worker patterns for large-scale deployments + +4. 
**Store is pluggable** - Implement custom Store for your infrastructure + +### Quick Reference + +| Scenario | Pattern | Store | Scale | +|----------|---------|-------|-------| +| CLI tool | Single Process | JSONStore | 1 user | +| Desktop app | Single Process | SQLiteStore | 1 user | +| Internal tool | Single Server | RedisStore | ~100 users | +| SaaS product | Worker Microservice | PostgresStore | 10K+ users | +| Serverless app | Hybrid | External DB | Varies | + +--- + +*Next: See [Deployment Guide](./DEPLOYMENT.md) for implementation details.* diff --git a/docs/DEPLOYMENT.md b/docs/DEPLOYMENT.md new file mode 100644 index 0000000..4a03a1f --- /dev/null +++ b/docs/DEPLOYMENT.md @@ -0,0 +1,638 @@ +# Deployment Scenarios & Architecture Patterns + +This document covers deployment patterns for KODE SDK across different use cases, from CLI tools to production backends. + +--- + +## Scenario Overview + +| Scenario | Complexity | Store | Scalability | Example | +|----------|-----------|-------|-------------|---------| +| CLI Tool | Low | JSONStore | Single user | Claude Code | +| Desktop App | Low | JSONStore | Single user | ChatGPT Desktop | +| IDE Plugin | Low | JSONStore | Single user | Cursor | +| Self-hosted Server | Medium | JSONStore/Custom | ~100 concurrent | Internal tool | +| Production Backend | High | PostgreSQL/Redis | 1000+ concurrent | SaaS product | +| Serverless | High | External DB | Auto-scaling | API service | + +--- + +## Scenario 1: CLI Tool + +**Characteristics:** +- Single user, single process +- Local file system available +- Long-running process +- No external dependencies needed + +**Architecture:** +``` +┌─────────────────────────────┐ +│ Terminal │ +│ ┌───────────────────────┐ │ +│ │ CLI App │ │ +│ │ ┌─────────────────┐ │ │ +│ │ │ KODE SDK │ │ │ +│ │ │ ┌───────────┐ │ │ │ +│ │ │ │ JSONStore │ │ │ │ +│ │ │ └─────┬─────┘ │ │ │ +│ │ └────────┼────────┘ │ │ +│ └───────────┼───────────┘ │ +└──────────────┼──────────────┘ + │ + ┌──────▼──────┐ + 
│ Local Files │ + │ ~/.my-agent │ + └─────────────┘ +``` + +**Implementation:** +```typescript +import { Agent, AgentPool, JSONStore } from '@shareai-lab/kode-sdk'; + +const store = new JSONStore(path.join(os.homedir(), '.my-agent')); +const pool = new AgentPool({ + dependencies: { store, templateRegistry, sandboxFactory, toolRegistry } +}); + +// Resume or create +const agent = await pool.get('main') + ?? await pool.create('main', { templateId: 'cli-assistant' }); + +// Interactive loop +const rl = readline.createInterface({ input: stdin, output: stdout }); +for await (const line of rl) { + await agent.send(line); + for await (const event of agent.subscribeProgress()) { + if (event.type === 'text_chunk') process.stdout.write(event.delta); + } +} +``` + +**Best for:** Developer tools, automation scripts, personal assistants. + +--- + +## Scenario 2: Desktop App (Electron) + +**Characteristics:** +- Single user +- Full file system access +- Can run background processes +- May need multiple agents + +**Architecture:** +``` +┌────────────────────────────────────────────┐ +│ Electron App │ +│ ┌──────────────────────────────────────┐ │ +│ │ Renderer Process │ │ +│ │ ┌──────────────────────────────┐ │ │ +│ │ │ React UI │ │ │ +│ │ └──────────────┬───────────────┘ │ │ +│ └─────────────────┼────────────────────┘ │ +│ │ IPC │ +│ ┌─────────────────▼────────────────────┐ │ +│ │ Main Process │ │ +│ │ ┌──────────────────────────────┐ │ │ +│ │ │ AgentPool │ │ │ +│ │ │ ┌────────┐ ┌────────┐ │ │ │ +│ │ │ │Agent 1 │ │Agent 2 │ ... 
│ │ │ +│ │ │ └────────┘ └────────┘ │ │ │ +│ │ └──────────────────────────────┘ │ │ +│ │ ┌──────────────────────────────┐ │ │ +│ │ │ JSONStore │ │ │ +│ │ └──────────────┬───────────────┘ │ │ +│ └─────────────────┼────────────────────┘ │ +└────────────────────┼────────────────────────┘ + │ + ┌──────▼──────┐ + │ userData │ + │ folder │ + └─────────────┘ +``` + +**Implementation:** +```typescript +// main.ts (Main Process) +import { AgentPool, JSONStore } from '@shareai-lab/kode-sdk'; +import { app, ipcMain } from 'electron'; + +const store = new JSONStore(path.join(app.getPath('userData'), 'agents')); +const pool = new AgentPool({ dependencies: { store, ... } }); + +ipcMain.handle('agent:send', async (event, { agentId, message }) => { + const agent = pool.get(agentId) ?? await pool.create(agentId, config); + await agent.send(message); + return agent.complete(); +}); + +ipcMain.on('agent:subscribe', (event, { agentId }) => { + const agent = pool.get(agentId); + if (!agent) return; + + (async () => { + for await (const ev of agent.subscribeProgress()) { + event.sender.send(`agent:event:${agentId}`, ev); + } + })(); +}); +``` + +**Best for:** Chat applications, productivity tools, AI assistants. + +--- + +## Scenario 3: Self-hosted Server (Single Node) + +**Characteristics:** +- Multiple users +- Persistent server process +- Can use local storage +- Moderate concurrency (<100 users) + +**Architecture:** +``` +┌──────────────────────────────────────────────────┐ +│ Node.js Server │ +│ ┌────────────────────────────────────────────┐ │ +│ │ Express/Hono │ │ +│ │ ┌──────────────────────────────────────┐ │ │ +│ │ │ /api/agents/:id/message (POST) │ │ │ +│ │ │ /api/agents/:id/events (SSE) │ │ │ +│ │ └──────────────────────────────────────┘ │ │ +│ └────────────────────┬───────────────────────┘ │ +│ │ │ +│ ┌────────────────────▼───────────────────────┐ │ +│ │ AgentPool (50) │ │ +│ │ ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐ │ │ +│ │ │ A1 │ │ A2 │ │ A3 │ │... 
│ │A50 │ │ │ +│ │ └────┘ └────┘ └────┘ └────┘ └────┘ │ │ +│ └────────────────────┬───────────────────────┘ │ +│ │ │ +│ ┌────────────────────▼───────────────────────┐ │ +│ │ JSONStore │ │ +│ └────────────────────┬───────────────────────┘ │ +└───────────────────────┼──────────────────────────┘ + │ + ┌──────▼──────┐ + │ /data/ │ + │ agents │ + └─────────────┘ +``` + +**Implementation:** +```typescript +import { Hono } from 'hono'; +import { streamSSE } from 'hono/streaming'; +import { AgentPool, JSONStore } from '@shareai-lab/kode-sdk'; + +const app = new Hono(); +const store = new JSONStore('/data/agents'); +const pool = new AgentPool({ dependencies: { store, ... }, maxAgents: 50 }); + +// Send message +app.post('/api/agents/:id/message', async (c) => { + const { id } = c.req.param(); + const { message } = await c.req.json(); + + let agent = pool.get(id); + if (!agent) { + const exists = await store.exists(id); + agent = exists + ? await pool.resume(id, getConfig()) + : await pool.create(id, getConfig()); + } + + await agent.send(message); + const result = await agent.complete(); + return c.json(result); +}); + +// SSE events +app.get('/api/agents/:id/events', async (c) => { + const { id } = c.req.param(); + const agent = pool.get(id); + if (!agent) return c.json({ error: 'Agent not found' }, 404); + + return streamSSE(c, async (stream) => { + for await (const event of agent.subscribeProgress()) { + await stream.writeSSE({ data: JSON.stringify(event) }); + } + }); +}); + +export default app; +``` + +**Scaling Limit:** ~50-100 concurrent agents per process. Beyond this, consider worker architecture. 
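One way to stay under that per-process ceiling is to evict idle agents before the pool fills up. The sketch below is a self-contained LRU tracker for choosing which agent to hibernate next; it is illustrative only, and the eviction wiring (persisting the agent and removing it from the pool, as in the worker example later in this document) is left to the host application.

```typescript
// Minimal LRU tracker: decides which agent should be hibernated
// once more than `capacity` agents have been active.
class LruTracker<K> {
  private order = new Map<K, number>(); // key -> last-touched tick
  private tick = 0;

  constructor(private capacity: number) {}

  // Record activity for `key`; returns the least-recently-used key
  // if capacity is now exceeded (the caller should hibernate it).
  touch(key: K): K | undefined {
    this.order.delete(key);
    this.order.set(key, ++this.tick);
    if (this.order.size <= this.capacity) return undefined;
    // Map preserves insertion order, so the first key is the oldest.
    const oldest = this.order.keys().next().value as K;
    this.order.delete(oldest);
    return oldest;
  }

  has(key: K): boolean {
    return this.order.has(key);
  }
}

const tracker = new LruTracker<string>(2);
tracker.touch('agent-a');
tracker.touch('agent-b');
tracker.touch('agent-a');                  // refresh agent-a
const evicted = tracker.touch('agent-c');  // over capacity: agent-b is LRU
```

Wiring this into the server above would mean touching the tracker on every request and hibernating whatever key it returns.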
+ +--- + +## Scenario 4: Production Backend (Multi-node) + +**Characteristics:** +- High concurrency (1000+ users) +- Multiple server instances +- Database-backed persistence +- Queue-based processing + +**Architecture:** +``` +┌─────────────────────────────────────────────────────────────────┐ +│ Load Balancer │ +└────────────────────────────┬────────────────────────────────────┘ + │ + ┌───────────────────┼───────────────────┐ + │ │ │ +┌────────▼────────┐ ┌────────▼────────┐ ┌────────▼────────┐ +│ API Server 1 │ │ API Server 2 │ │ API Server N │ +│ (Stateless) │ │ (Stateless) │ │ (Stateless) │ +└────────┬────────┘ └────────┬────────┘ └────────┬────────┘ + │ │ │ + └───────────────────┼───────────────────┘ + │ + ┌────────▼────────┐ + │ Job Queue │ + │ (BullMQ) │ + └────────┬────────┘ + │ + ┌───────────────────┼───────────────────┐ + │ │ │ +┌────────▼────────┐ ┌────────▼────────┐ ┌────────▼────────┐ +│ Worker 1 │ │ Worker 2 │ │ Worker N │ +│ AgentPool(50) │ │ AgentPool(50) │ │ AgentPool(50) │ +└────────┬────────┘ └────────┬────────┘ └────────┬────────┘ + │ │ │ + └───────────────────┼───────────────────┘ + │ + ┌──────────────┼──────────────┐ + │ │ │ + ┌──────▼──────┐ ┌─────▼─────┐ ┌─────▼─────┐ + │ PostgreSQL │ │ Redis │ │ S3 │ + │ (Store) │ │ (Cache) │ │ (Files) │ + └─────────────┘ └───────────┘ └───────────┘ +``` + +**API Server Implementation:** +```typescript +// api/routes/agent.ts +import { Queue } from 'bullmq'; + +const queue = new Queue('agent-tasks', { connection: redis }); + +app.post('/api/agents/:id/message', async (c) => { + const { id } = c.req.param(); + const { message } = await c.req.json(); + + // Add job to queue + const job = await queue.add('process-message', { + agentId: id, + message, + userId: c.get('userId'), + }); + + return c.json({ jobId: job.id, status: 'queued' }); +}); + +app.get('/api/agents/:id/events', async (c) => { + const { id } = c.req.param(); + + // Subscribe to Redis pub/sub + return streamSSE(c, async (stream) => { + const 
sub = redis.duplicate(); + await sub.subscribe(`agent:${id}:events`); + + sub.on('message', (channel, message) => { + stream.writeSSE({ data: message }); + }); + }); +}); +``` + +**Worker Implementation:** +```typescript +// worker/index.ts +import { Worker } from 'bullmq'; +import { AgentPool } from '@shareai-lab/kode-sdk'; +import { PostgresStore } from './postgres-store'; + +const store = new PostgresStore(pgPool); +const pool = new AgentPool({ dependencies: { store, ... }, maxAgents: 50 }); + +const worker = new Worker('agent-tasks', async (job) => { + const { agentId, message } = job.data; + + // Get or resume agent + let agent = pool.get(agentId); + if (!agent) { + const exists = await store.exists(agentId); + agent = exists + ? await pool.resume(agentId, getConfig(job.data)) + : await pool.create(agentId, getConfig(job.data)); + } + + // Process + await agent.send(message); + + // Stream events to Redis + for await (const event of agent.subscribeProgress()) { + await redis.publish(`agent:${agentId}:events`, JSON.stringify(event)); + + if (event.type === 'done') break; + } +}, { connection: redis }); + +// Periodic cleanup: hibernate idle agents +setInterval(async () => { + for (const agentId of pool.list()) { + const agent = pool.get(agentId); + if (agent && agent.idleTime > 60_000) { + await agent.persistInfo(); + pool.delete(agentId); + } + } +}, 30_000); +``` + +**PostgreSQL Store Implementation:** +```sql +-- Schema +CREATE TABLE agents ( + id TEXT PRIMARY KEY, + template_id TEXT NOT NULL, + created_at TIMESTAMPTZ DEFAULT NOW(), + updated_at TIMESTAMPTZ DEFAULT NOW() +); + +CREATE TABLE agent_messages ( + agent_id TEXT PRIMARY KEY REFERENCES agents(id), + messages JSONB NOT NULL, + updated_at TIMESTAMPTZ DEFAULT NOW() +); + +CREATE TABLE agent_tool_records ( + agent_id TEXT PRIMARY KEY REFERENCES agents(id), + records JSONB NOT NULL, + updated_at TIMESTAMPTZ DEFAULT NOW() +); + +CREATE TABLE agent_events ( + id BIGSERIAL PRIMARY KEY, + agent_id TEXT 
REFERENCES agents(id), + channel TEXT NOT NULL, + event JSONB NOT NULL, + created_at TIMESTAMPTZ DEFAULT NOW() +); + +CREATE INDEX idx_agent_events_agent_channel ON agent_events(agent_id, channel, id); +``` + +--- + +## Scenario 5: Serverless (Vercel/Lambda) + +**Characteristics:** +- Request-scoped execution +- Cold starts +- Execution time limits (10s-300s) +- No local persistence + +**Challenges:** +1. **No File System**: JSONStore won't work +2. **Timeout**: Long agent tasks may exceed limits +3. **Cold Start**: Must load state quickly +4. **Stateless**: No in-memory agent pool + +**Architecture:** +``` +┌──────────────────────────────────────────────────────────────┐ +│ Vercel/Lambda │ +│ ┌────────────────────────────────────────────────────────┐ │ +│ │ API Function │ │ +│ │ │ │ +│ │ Request → Load Agent → Execute Step → Persist → Response │ +│ │ │ │ +│ └──────────────────────────┬─────────────────────────────┘ │ +└─────────────────────────────┼────────────────────────────────┘ + │ + ┌──────────────┼──────────────┐ + │ │ │ + ┌──────▼──────┐ ┌─────▼─────┐ ┌─────▼─────┐ + │ Supabase │ │ Upstash │ │ Inngest │ + │ (Postgres) │ │ (Redis) │ │ (Queue) │ + └─────────────┘ └───────────┘ └───────────┘ +``` + +**Implementation:** +```typescript +// app/api/agent/[id]/route.ts +import { Agent, AgentConfig } from '@shareai-lab/kode-sdk'; +import { SupabaseStore } from '@/lib/supabase-store'; + +const store = new SupabaseStore(supabaseClient); + +export async function POST(req: Request, { params }: { params: { id: string } }) { + const { message } = await req.json(); + const agentId = params.id; + + // 1. Load or create agent + const exists = await store.exists(agentId); + const agent = exists + ? await Agent.resume(agentId, config, { store, ... }) + : await Agent.create({ ...config, agentId }, { store, ... }); + + // 2. Send message + await agent.send(message); + + // 3. 
Run with timeout (leave buffer for response) + const timeoutMs = 25_000; // Vercel Pro = 30s + let result; + + try { + result = await Promise.race([ + agent.complete(), + new Promise((_, reject) => + setTimeout(() => reject(new Error('Timeout')), timeoutMs) + ), + ]); + } catch (e) { + if (e.message === 'Timeout') { + // Agent still processing, queue for background + await inngest.send('agent/continue', { agentId }); + return Response.json({ status: 'processing', agentId }); + } + throw e; + } + + return Response.json({ status: 'done', output: result.text }); +} +``` + +**For Long-running Tasks:** +Use a queue service like Inngest: + +```typescript +// inngest/functions/agent-continue.ts +import { inngest } from '@/lib/inngest'; + +export const agentContinue = inngest.createFunction( + { id: 'agent-continue' }, + { event: 'agent/continue' }, + async ({ event, step }) => { + const { agentId } = event.data; + + // Resume agent + const agent = await Agent.resume(agentId, config, { store, ... }); + + // Process until done (Inngest handles timeouts) + const result = await step.run('complete', async () => { + return agent.complete(); + }); + + // Notify user via webhook/push + await step.run('notify', async () => { + await notifyUser(agentId, result); + }); + + return result; + } +); +``` + +--- + +## Custom Store Implementations + +### PostgreSQL Store (Full Example) + +```typescript +import { Store, Message, ToolCallRecord, Timeline, Bookmark, ... 
} from '@shareai-lab/kode-sdk';
+import { Pool } from 'pg';
+
+export class PostgresStore implements Store {
+  constructor(private pool: Pool) {}
+
+  // === Runtime State ===
+
+  async saveMessages(agentId: string, messages: Message[]): Promise<void> {
+    await this.pool.query(`
+      INSERT INTO agent_messages (agent_id, messages, updated_at)
+      VALUES ($1, $2, NOW())
+      ON CONFLICT (agent_id) DO UPDATE SET messages = $2, updated_at = NOW()
+    `, [agentId, JSON.stringify(messages)]);
+  }
+
+  async loadMessages(agentId: string): Promise<Message[]> {
+    const { rows } = await this.pool.query(
+      'SELECT messages FROM agent_messages WHERE agent_id = $1',
+      [agentId]
+    );
+    return rows[0]?.messages || [];
+  }
+
+  async saveToolCallRecords(agentId: string, records: ToolCallRecord[]): Promise<void> {
+    await this.pool.query(`
+      INSERT INTO agent_tool_records (agent_id, records, updated_at)
+      VALUES ($1, $2, NOW())
+      ON CONFLICT (agent_id) DO UPDATE SET records = $2, updated_at = NOW()
+    `, [agentId, JSON.stringify(records)]);
+  }
+
+  async loadToolCallRecords(agentId: string): Promise<ToolCallRecord[]> {
+    const { rows } = await this.pool.query(
+      'SELECT records FROM agent_tool_records WHERE agent_id = $1',
+      [agentId]
+    );
+    return rows[0]?.records || [];
+  }
+
+  // === Events ===
+
+  async appendEvent(agentId: string, timeline: Timeline): Promise<void> {
+    await this.pool.query(`
+      INSERT INTO agent_events (agent_id, channel, event, created_at)
+      VALUES ($1, $2, $3, NOW())
+    `, [agentId, timeline.event.channel, JSON.stringify(timeline)]);
+  }
+
+  async *readEvents(agentId: string, opts?: { since?: Bookmark; channel?: string }): AsyncIterable<Timeline> {
+    const conditions = ['agent_id = $1'];
+    const params: any[] = [agentId];
+    let paramIndex = 2;
+
+    if (opts?.since) {
+      conditions.push(`id > $${paramIndex++}`);
+      params.push(opts.since.seq);
+    }
+    if (opts?.channel) {
+      conditions.push(`channel = $${paramIndex++}`);
+      params.push(opts.channel);
+    }
+
+    const { rows } = await this.pool.query(`
+      SELECT event FROM agent_events
+      WHERE ${conditions.join(' AND ')}
+      ORDER BY id ASC
+    `, params);
+
+    for (const row of rows) {
+      yield row.event;
+    }
+  }
+
+  // ... implement remaining methods (history, snapshots, metadata, lifecycle)
+
+  async exists(agentId: string): Promise<boolean> {
+    const { rows } = await this.pool.query(
+      'SELECT 1 FROM agents WHERE id = $1',
+      [agentId]
+    );
+    return rows.length > 0;
+  }
+
+  async delete(agentId: string): Promise<void> {
+    await this.pool.query('DELETE FROM agents WHERE id = $1', [agentId]);
+  }
+
+  async list(prefix?: string): Promise<string[]> {
+    const { rows } = await this.pool.query(
+      prefix
+        ? 'SELECT id FROM agents WHERE id LIKE $1'
+        : 'SELECT id FROM agents',
+      prefix ? [`${prefix}%`] : []
+    );
+    return rows.map(r => r.id);
+  }
+}
+```
+
+---
+
+## Capacity Planning
+
+| Deployment | Agents/Process | Memory/Agent | Concurrent Users |
+|------------|----------------|--------------|------------------|
+| CLI | 1 | 10-100 MB | 1 |
+| Desktop | 5-10 | 50-200 MB | 1 |
+| Single Server | 50 | 2-10 MB | 50-100 |
+| Worker Cluster (10 nodes) | 500 | 2-10 MB | 500-1000 |
+| Worker Cluster (50 nodes) | 2500 | 2-10 MB | 2500-5000 |
+
+**Memory Estimation per Agent:**
+- Base object: ~50 KB
+- Message history (100 messages): ~500 KB - 5 MB
+- Tool records: ~50-500 KB
+- Event timeline: ~100 KB - 1 MB
+- **Typical total: 1-10 MB**
+
+---
+
+## Summary
+
+1. **CLI/Desktop/IDE**: Use JSONStore, single AgentPool, straightforward
+2. **Single Server**: Add HTTP layer, consider Redis for events
+3. **Multi-node**: Implement custom Store, use queue for job distribution
+4. 
**Serverless**: Use external DB, handle timeouts, consider background queue + +The key insight: **KODE SDK handles the agent lifecycle; you handle the infrastructure.** diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md new file mode 100644 index 0000000..ba00487 --- /dev/null +++ b/docs/ROADMAP.md @@ -0,0 +1,303 @@ +# KODE SDK Roadmap + +> This document outlines the development roadmap for KODE SDK, based on actual current capabilities and planned enhancements. + +--- + +## Current State (v2.7.0) + +### What Works Well + +| Feature | Status | Notes | +|---------|--------|-------| +| Agent State Machine | Stable | 7-stage breakpoint system | +| JSONStore | Stable | WAL-protected file persistence | +| Event System | Stable | 3 channels (Progress/Control/Monitor) | +| Fork/Resume | Stable | Safe fork points, crash recovery | +| Multi-provider | Stable | Anthropic, OpenAI, Gemini, DeepSeek, Qwen, GLM... | +| Tool System | Stable | Built-in + MCP protocol | +| AgentPool | Stable | Up to 50 agents per process | +| Checkpointer | Stable | Memory, File, Redis implementations | +| Context Compression | Stable | Automatic history management | +| Hook System | Stable | Pre/post model and tool hooks | + +### Current Limitations + +| Limitation | Impact | Workaround | +|------------|--------|------------| +| JSONStore only | No database persistence | Implement custom Store | +| Single-process pool | No distributed scaling | Build orchestration layer | +| 5-min processing timeout | Not configurable | Fork SDK if needed | +| No stateless mode | Serverless challenges | Request-scoped pattern | +| No distributed locking | Multi-instance conflicts | External coordination | + +--- + +## Short-term: v2.8 - v2.9 (Q1-Q2 2025) + +### v2.8: Store Improvements + +**Goal**: Make custom Store implementation easier and more robust. 
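One way to make custom implementations more robust is an automated conformance check that exercises the Store contract against reference behavior. The sketch below is illustrative only: the `MessageStore` subset and the `MemoryStore` class are hypothetical stand-ins, not SDK APIs.

```typescript
// Illustrative subset of the Store interface (messages only).
interface MessageStore {
  saveMessages(agentId: string, messages: unknown[]): Promise<void>;
  loadMessages(agentId: string): Promise<unknown[]>;
}

// Reference in-memory implementation used to exercise the checks.
class MemoryStore implements MessageStore {
  private data = new Map<string, unknown[]>();
  async saveMessages(agentId: string, messages: unknown[]) {
    this.data.set(agentId, structuredClone(messages));
  }
  async loadMessages(agentId: string) {
    return this.data.get(agentId) ?? [];
  }
}

// A conformance check a validation utility might run against any Store:
// saved messages must round-trip unchanged, and an unknown agent must
// load as an empty history rather than throwing.
async function validateStore(store: MessageStore): Promise<string[]> {
  const failures: string[] = [];
  const msgs = [{ role: 'user', content: 'hi' }];
  await store.saveMessages('probe', msgs);
  const loaded = await store.loadMessages('probe');
  if (JSON.stringify(loaded) !== JSON.stringify(msgs)) failures.push('round-trip');
  if ((await store.loadMessages('missing')).length !== 0) failures.push('missing-agent');
  return failures;
}
```

Running the same check against a custom PostgreSQL or Redis implementation would catch the most common contract violations before they reach an agent.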
+
+| Feature | Priority | Description |
+|---------|----------|-------------|
+| Store interface documentation | P0 | Comprehensive guide for implementing custom stores |
+| Store validation utilities | P1 | Test helpers to verify Store implementations |
+| Incremental message API | P1 | `appendMessage()` in addition to `saveMessages()` |
+| Store migration utilities | P2 | Tools for migrating data between Store implementations |
+
+**New APIs:**
+```typescript
+// Optional incremental methods (backwards compatible)
+interface Store {
+  // Existing methods...
+
+  // NEW: Incremental append (optional, for performance)
+  appendMessage?(agentId: string, message: Message): Promise<void>;
+
+  // NEW: Paginated loading (optional, for large histories)
+  loadMessagesPaginated?(agentId: string, opts: {
+    offset: number;
+    limit: number;
+  }): Promise<Message[]>;
+}
+```
+
+### v2.9: Configurable Limits
+
+**Goal**: Remove hard-coded limits, improve serverless compatibility.
+
+| Feature | Priority | Description |
+|---------|----------|-------------|
+| Configurable processing timeout | P0 | Currently hard-coded to 5 minutes |
+| Configurable tool buffer size | P1 | Currently hard-coded to 10 MB |
+| Pool size validation | P2 | Better error messages when exceeding limits |
+
+**New APIs:**
+```typescript
+// Agent configuration
+const agent = await Agent.create({
+  agentId: 'my-agent',
+  templateId: 'default',
+  // NEW: Runtime limits
+  limits: {
+    processingTimeout: 30_000,  // 30 seconds for serverless
+    toolBufferSize: 5 * 1024 * 1024,  // 5 MB
+  },
+}, dependencies);
+```
+
+---
+
+## Mid-term: v3.0 (Q3 2025)
+
+### v3.0: Official Store Implementations
+
+**Goal**: Provide production-ready Store implementations for common databases. 
+ +| Store | Priority | Dependencies | +|-------|----------|--------------| +| `@kode-sdk/store-postgres` | P0 | `pg` | +| `@kode-sdk/store-redis` | P0 | `ioredis` | +| `@kode-sdk/store-supabase` | P1 | `@supabase/supabase-js` | +| `@kode-sdk/store-dynamodb` | P2 | `@aws-sdk/client-dynamodb` | + +**Package Structure:** +``` +@kode-sdk/store-postgres +├── src/ +│ ├── index.ts +│ ├── postgres-store.ts +│ └── schema.sql +├── README.md +└── package.json +``` + +**Features:** +- Complete Store interface implementation +- Schema migration utilities +- Connection pooling +- Retry logic for transient failures +- Distributed locking support + +### v3.0: Stateless Execution Mode + +**Goal**: Native support for serverless environments. + +```typescript +// NEW: Request-scoped execution +import { StatelessAgent } from '@shareai-lab/kode-sdk'; + +export async function POST(req: Request) { + const { agentId, message } = await req.json(); + + // Automatically handles: load → execute → persist + const result = await StatelessAgent.run(agentId, message, { + store: postgresStore, + templateId: 'default', + timeout: 25_000, + }); + + return Response.json(result); +} +``` + +**Key Features:** +- Automatic state loading and persisting +- Timeout handling with graceful shutdown +- No in-memory pool required +- Optimized for cold starts + +--- + +## Long-term: v4.0+ (2026) + +### v4.0: Distributed Infrastructure (Optional Package) + +**Goal**: Official distributed coordination package for high-scale deployments. 
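Until that package exists, the core coordination primitive — deterministically mapping an agent to a worker — can be approximated in application code. The sketch below mirrors the sharding idea from the README's scaling strategies; the FNV-1a hash and `workerFor` helper are illustrative choices, not SDK APIs, and a production version would use consistent hashing to limit reshuffling when the worker count changes.

```typescript
// Deterministic agent -> worker assignment via FNV-1a (32-bit) modulo
// the worker count. Every node computes the same mapping, so no
// central coordinator is needed for routing.
function workerFor(agentId: string, workerCount: number): number {
  let hash = 0x811c9dc5; // FNV-1a offset basis
  for (let i = 0; i < agentId.length; i++) {
    hash ^= agentId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime
  }
  return (hash >>> 0) % workerCount;
}

const idx = workerFor('user-123-agent-456', 3);
```

A queue consumer can use this to claim only jobs whose `workerFor(agentId, N)` matches its own index.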
+ +``` +@kode-sdk/distributed +├── scheduler/ # Agent scheduling across workers +├── discovery/ # Agent location discovery +├── locking/ # Distributed locking +└── migration/ # Agent migration between nodes +``` + +**Features:** +```typescript +import { DistributedPool } from '@kode-sdk/distributed'; + +const pool = new DistributedPool({ + store: postgresStore, + redis: redisClient, + workerId: process.env.WORKER_ID, + maxLocalAgents: 50, +}); + +// Agent automatically migrates between workers +const agent = await pool.acquire(agentId); +await agent.send(message); +await agent.complete(); +await pool.release(agentId); +``` + +### v4.0: Observability Integration + +**Goal**: First-class observability support. + +```typescript +import { OpenTelemetryPlugin } from '@kode-sdk/observability'; + +const agent = await Agent.create({ + agentId: 'my-agent', + templateId: 'default', + plugins: [ + new OpenTelemetryPlugin({ + serviceName: 'my-agent-service', + tracing: true, + metrics: true, + }), + ], +}, dependencies); +``` + +**Metrics:** +- `agent.step.duration` - Time per agent step +- `agent.tool.duration` - Time per tool execution +- `agent.model.tokens` - Token usage +- `agent.errors` - Error count by type + +### v4.x: Additional Sandboxes + +| Sandbox | Status | Description | +|---------|--------|-------------| +| `DockerSandbox` | Planned | Run tools in Docker containers | +| `K8sSandbox` | Planned | Run tools in Kubernetes pods | +| `E2BSandbox` | Planned | Integration with E2B.dev | +| `FirecrackerSandbox` | Exploring | MicroVM isolation | + +--- + +## Community Contributions Welcome + +### High-Impact Contributions + +1. **Store Implementations** + - MongoDB Store + - SQLite Store (for embedded use) + - Turso/LibSQL Store + +2. **Sandbox Implementations** + - Docker Sandbox + - WebContainer Sandbox (browser) + +3. 
**Tool Integrations** + - Browser automation (Playwright) + - Database clients + - Cloud service SDKs + +### Contribution Guidelines + +See [CONTRIBUTING.md](./CONTRIBUTING.md) for: +- Code style and testing requirements +- Pull request process +- Interface implementation guidelines + +--- + +## Version Support Policy + +| Version | Status | Support Until | +|---------|--------|---------------| +| v2.7.x | Current | Active development | +| v2.6.x | Maintenance | 6 months after v2.8 | +| v2.5.x | End of Life | No longer supported | + +**Semver Policy:** +- Major (v3, v4): Breaking changes to core APIs +- Minor (v2.8, v2.9): New features, backwards compatible +- Patch (v2.7.1): Bug fixes only + +--- + +## Feedback & Prioritization + +Roadmap priorities are influenced by: + +1. **GitHub Issues**: Feature requests with most reactions +2. **Community Discussions**: Patterns emerging from usage +3. **Production Feedback**: Real-world deployment challenges + +To influence the roadmap: +- Open or upvote GitHub Issues +- Share your use case in Discussions +- Contribute implementations for planned features + +--- + +## Timeline Summary + +``` +2025 Q1-Q2: v2.8-v2.9 +├── Store interface improvements +├── Configurable limits +└── Better serverless support + +2025 Q3: v3.0 +├── Official Store packages (Postgres, Redis, Supabase) +├── Stateless execution mode +└── Improved documentation + +2026: v4.0+ +├── Distributed infrastructure package +├── Observability integration +└── Additional sandbox implementations +``` + +The roadmap focuses on: +1. **Making extension easier** (v2.8-2.9) +2. **Providing official implementations** (v3.0) +3. **Scaling infrastructure** (v4.0+) + +Core philosophy remains: **KODE SDK is a runtime kernel, not a platform.** Official packages extend capabilities without bloating the core. 
diff --git a/docs/scenarios/cli-tools.md b/docs/scenarios/cli-tools.md new file mode 100644 index 0000000..ac1113a --- /dev/null +++ b/docs/scenarios/cli-tools.md @@ -0,0 +1,352 @@ +# Scenario: CLI Agent Tools + +> Build command-line AI assistants like Claude Code, Cursor, or custom developer tools. + +--- + +## Why CLI is Perfect for KODE SDK + +| Feature | Benefit | +|---------|---------| +| Single process | No distributed complexity | +| Local filesystem | JSONStore works perfectly | +| Long-running | Agent loops run naturally | +| Single user | No multi-tenancy needed | + +**Compatibility: 100%** - This is KODE SDK's sweet spot. + +--- + +## Quick Start: Minimal CLI Agent + +```typescript +// cli-agent.ts +import { Agent, AnthropicProvider, LocalSandbox } from '@anthropic/kode-sdk'; +import * as readline from 'readline'; + +async function main() { + // Create agent with local persistence + const agent = await Agent.create({ + agentId: 'cli-assistant', + template: { + systemPrompt: `You are a helpful CLI assistant. +You can execute bash commands and help with file operations.`, + }, + deps: { + modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!), + sandbox: new LocalSandbox({ workDir: process.cwd() }), + }, + }); + + // Stream output to terminal + agent.subscribeProgress({ kinds: ['text_chunk'] }, (event) => { + process.stdout.write(event.text); + }); + + // Show tool execution + agent.subscribeProgress({ kinds: ['tool:start', 'tool:complete'] }, (event) => { + if (event.kind === 'tool:start') { + console.log(`\n[Running: ${event.name}]`); + } + }); + + // Interactive loop + const rl = readline.createInterface({ + input: process.stdin, + output: process.stdout, + }); + + console.log('CLI Agent ready. 
Type your message (Ctrl+C to exit)\n');
+
+  const askQuestion = () => {
+    rl.question('You: ', async (input) => {
+      if (input.trim()) {
+        console.log('\nAssistant: ');
+        await agent.chat(input);
+        console.log('\n');
+      }
+      askQuestion();
+    });
+  };
+
+  askQuestion();
+}
+
+main().catch(console.error);
+```
+
+Run it:
+```bash
+npx ts-node cli-agent.ts
+```
+
+---
+
+## Production CLI: Resume & Persistence
+
+For a production CLI tool, you want:
+1. **Session persistence** - Resume conversations across runs
+2. **Crash recovery** - Don't lose progress
+3. **Multiple sessions** - Switch between contexts
+
+```typescript
+// production-cli.ts
+import { Agent, AgentPool, AnthropicProvider, LocalSandbox, JSONStore } from '@anthropic/kode-sdk';
+import * as path from 'path';
+import * as os from 'os';
+
+// Store data in user's home directory
+const DATA_DIR = path.join(os.homedir(), '.my-cli-agent');
+const store = new JSONStore(DATA_DIR);
+
+async function getOrCreateAgent(sessionId: string): Promise<Agent> {
+  const pool = new AgentPool({ store, maxAgents: 10 });
+
+  // Try to resume existing session
+  try {
+    const agent = await pool.resume(sessionId, {
+      template: { systemPrompt: '...' },
+    }, {
+      modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!),
+      sandbox: new LocalSandbox({ workDir: process.cwd() }),
+    });
+    console.log(`Resumed session: ${sessionId}`);
+    return agent;
+  } catch {
+    // Create new session
+    const agent = await pool.create(sessionId, {
+      template: { systemPrompt: '...' 
},
+    }, {
+      modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!),
+      sandbox: new LocalSandbox({ workDir: process.cwd() }),
+    });
+    console.log(`Created new session: ${sessionId}`);
+    return agent;
+  }
+}
+
+// Usage
+const sessionId = process.argv[2] || 'default';
+const agent = await getOrCreateAgent(sessionId);
+```
+
+---
+
+## Tool Approval Flow
+
+For dangerous operations, implement approval:
+
+```typescript
+import * as readline from 'readline';
+import { PermissionMode } from '@anthropic/kode-sdk';
+
+const agent = await Agent.create({
+  agentId: 'safe-cli',
+  config: {
+    permission: {
+      mode: 'approval', // Require approval for all tools
+      // Or custom mode:
+      // mode: 'custom',
+      // customMode: async (call, ctx) => {
+      //   if (call.name === 'bash_run') {
+      //     return { decision: 'ask' }; // Prompt user
+      //   }
+      //   return { decision: 'allow' };
+      // }
+    },
+  },
+  // ...
+});
+
+// Handle approval requests
+agent.subscribeControl((event) => {
+  if (event.kind === 'permission_required') {
+    console.log(`\nTool requires approval: ${event.toolName}`);
+    console.log(`Input: ${JSON.stringify(event.input, null, 2)}`);
+
+    const rl = readline.createInterface({
+      input: process.stdin,
+      output: process.stdout,
+    });
+
+    rl.question('Approve? 
(y/n): ', (answer) => { + agent.approveToolUse(event.callId, answer.toLowerCase() === 'y'); + rl.close(); + }); + } +}); +``` + +--- + +## Example: Developer Assistant CLI + +Complete example with file operations, git commands, and safety: + +```typescript +// dev-assistant.ts +import { + Agent, + AnthropicProvider, + LocalSandbox, + JSONStore, + defineSimpleTool, +} from '@anthropic/kode-sdk'; + +// Custom tools +const gitStatusTool = defineSimpleTool({ + name: 'git_status', + description: 'Check git repository status', + parameters: {}, + execute: async () => { + const { execSync } = await import('child_process'); + return execSync('git status --porcelain').toString(); + }, +}); + +const searchCodeTool = defineSimpleTool({ + name: 'search_code', + description: 'Search for patterns in code files', + parameters: { + type: 'object', + properties: { + pattern: { type: 'string', description: 'Search pattern (regex)' }, + fileType: { type: 'string', description: 'File extension (e.g., ts, js)' }, + }, + required: ['pattern'], + }, + execute: async ({ pattern, fileType }) => { + const { execSync } = await import('child_process'); + const glob = fileType ? `--include="*.${fileType}"` : ''; + return execSync(`grep -r ${glob} "${pattern}" . 2>/dev/null || echo "No matches"`).toString(); + }, +}); + +async function main() { + const agent = await Agent.create({ + agentId: 'dev-assistant', + template: { + systemPrompt: `You are a developer assistant. +You help with: +- Code navigation and search +- Git operations +- File management +- Running tests and builds + +Always explain what you're doing before executing commands. 
+Be cautious with destructive operations.`, + tools: [gitStatusTool, searchCodeTool], // Add custom tools + }, + config: { + permission: { + mode: 'auto', // Auto-approve safe operations + autoApprove: ['git_status', 'search_code', 'file_read'], + requireApproval: ['bash_run', 'file_write', 'file_delete'], + }, + }, + deps: { + modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!, 'claude-sonnet-4-20250514'), + sandbox: new LocalSandbox({ workDir: process.cwd() }), + store: new JSONStore('./.dev-assistant'), + }, + }); + + // ... rest of CLI implementation +} +``` + +--- + +## Best Practices for CLI Agents + +### 1. Progress Indication + +```typescript +// Show spinner during model calls +agent.subscribeMonitor((event) => { + if (event.kind === 'model_start') { + process.stdout.write('Thinking...'); + } + if (event.kind === 'model_complete') { + process.stdout.write('\r \r'); // Clear spinner + } +}); +``` + +### 2. Graceful Shutdown + +```typescript +// Handle Ctrl+C +process.on('SIGINT', async () => { + console.log('\nSaving session...'); + await agent.persistInfo(); + process.exit(0); +}); + +process.on('SIGTERM', async () => { + await agent.persistInfo(); + process.exit(0); +}); +``` + +### 3. Token Usage Tracking + +```typescript +let totalTokens = 0; + +agent.subscribeMonitor((event) => { + if (event.kind === 'token_usage') { + totalTokens += event.inputTokens + event.outputTokens; + // Show in status bar or on exit + } +}); + +process.on('exit', () => { + console.log(`\nTotal tokens used: ${totalTokens}`); +}); +``` + +### 4. 
History Navigation + +```typescript +// Show conversation history on start +const messages = await agent.getMessages(); +console.log(`Session has ${messages.length} messages`); + +// Allow user to see recent context +if (messages.length > 0) { + const last = messages[messages.length - 1]; + console.log(`Last message: ${last.role}: ${last.content.slice(0, 100)}...`); +} +``` + +--- + +## File Structure + +Recommended project structure for a CLI agent: + +``` +my-cli-agent/ +├── src/ +│ ├── index.ts # Entry point +│ ├── agent.ts # Agent configuration +│ ├── tools/ # Custom tools +│ │ ├── git.ts +│ │ ├── search.ts +│ │ └── index.ts +│ └── ui/ # Terminal UI +│ ├── spinner.ts +│ ├── prompt.ts +│ └── colors.ts +├── data/ # Agent persistence (gitignored) +├── package.json +└── tsconfig.json +``` + +--- + +## Next Steps + +- See [Tools Guide](../tools.md) for building custom tools +- See [Events Guide](../events.md) for advanced event handling +- See [Playbooks](../playbooks.md) for common patterns diff --git a/docs/scenarios/desktop-apps.md b/docs/scenarios/desktop-apps.md new file mode 100644 index 0000000..0c9e96b --- /dev/null +++ b/docs/scenarios/desktop-apps.md @@ -0,0 +1,234 @@ +# Scenario: Desktop Applications + +> Build Electron/Tauri apps with embedded AI agents. + +--- + +## Why Desktop is Perfect for KODE SDK + +| Feature | Benefit | +|---------|---------| +| Full filesystem access | JSONStore works natively | +| Long-running process | Agent loops run without timeout | +| Local resources | No network latency for persistence | +| Single user | No multi-tenancy complexity | + +**Compatibility: 95%** - Minor adjustments for IPC. 
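+
+The "IPC adjustment" is mostly about event routing: agent events are produced in the main process but rendered in the renderer, so progress subscriptions are forwarded over a channel instead of being consumed in-process. A minimal sketch of that forwarding step (the `subscribeProgress` shape follows its usage in this guide; the `send` transport is a stand-in for Electron's `webContents.send`):
+
+```typescript
+type ProgressEvent = { kind: string; text?: string };
+
+// Bridge one agent's progress stream onto a channel-based transport.
+function forwardProgress(
+  agentId: string,
+  subscribe: (handler: (e: ProgressEvent) => void) => void,
+  send: (channel: string, payload: ProgressEvent) => void,
+): void {
+  subscribe((event) => send(`agent:progress:${agentId}`, event));
+}
+```
+
+In Electron the transport is `webContents.send`; in Tauri it is an event emitted to the window. Keeping this bridge in one place makes it easy to swap transports later.
+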
+
+---
+
+## Electron Integration
+
+### Main Process Setup
+
+```typescript
+// main/agent-service.ts
+import { Agent, AgentPool, AnthropicProvider, LocalSandbox, JSONStore } from '@anthropic/kode-sdk';
+import { app, ipcMain } from 'electron';
+import * as path from 'path';
+
+// Store data in app's user data directory
+const DATA_DIR = path.join(app.getPath('userData'), 'agents');
+const store = new JSONStore(DATA_DIR);
+const pool = new AgentPool({ store, maxAgents: 20 });
+
+// Create agent
+ipcMain.handle('agent:create', async (event, { agentId, systemPrompt }) => {
+  const agent = await pool.create(agentId, {
+    template: { systemPrompt },
+  }, {
+    modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!),
+    sandbox: new LocalSandbox({ workDir: app.getPath('documents') }),
+  });
+
+  // Forward events to renderer. Use the IPC event's sender; don't shadow it
+  // with the progress callback's parameter.
+  agent.subscribeProgress({ kinds: ['text_chunk', 'tool:start', 'tool:complete', 'done'] }, (progressEvent) => {
+    event.sender.send(`agent:progress:${agentId}`, progressEvent);
+  });
+
+  return { success: true, agentId };
+});
+
+// Send message
+ipcMain.handle('agent:chat', async (event, { agentId, message }) => {
+  const agent = pool.get(agentId);
+  if (!agent) throw new Error('Agent not found');
+
+  await agent.chat(message);
+  return { success: true };
+});
+
+// List agents
+ipcMain.handle('agent:list', async () => {
+  const agents = await store.listAgents();
+  return agents;
+});
+
+// Graceful shutdown (guard against re-entering before-quit when app.quit() is called)
+let isQuitting = false;
+app.on('before-quit', async (event) => {
+  if (isQuitting) return;
+  event.preventDefault();
+  isQuitting = true;
+  for (const [id, agent] of pool.agents) {
+    await agent.persistInfo();
+  }
+  app.quit();
+});
+```
+
+### Renderer Process (React)
+
+```typescript
+// renderer/hooks/useAgent.ts
+import { useState, useEffect, useCallback } from 'react';
+
+interface ChatMessage {
+  role: 'user' | 'assistant';
+  content: string;
+}
+
+export function useAgent(agentId: string) {
+  const [messages, setMessages] = useState<ChatMessage[]>([]);
+  const [isProcessing, setIsProcessing] = useState(false);
+
+  useEffect(() => {
+    // Listen for progress events
+    const handler = (event: any, data: 
ProgressEvent) => { + if (data.kind === 'text_chunk') { + setMessages(prev => { + const last = prev[prev.length - 1]; + if (last?.role === 'assistant') { + return [...prev.slice(0, -1), { + ...last, + content: last.content + data.text, + }]; + } + return [...prev, { role: 'assistant', content: data.text }]; + }); + } + if (data.kind === 'done') { + setIsProcessing(false); + } + }; + + window.electron.on(`agent:progress:${agentId}`, handler); + return () => window.electron.off(`agent:progress:${agentId}`, handler); + }, [agentId]); + + const sendMessage = useCallback(async (text: string) => { + setMessages(prev => [...prev, { role: 'user', content: text }]); + setIsProcessing(true); + await window.electron.invoke('agent:chat', { agentId, message: text }); + }, [agentId]); + + return { messages, isProcessing, sendMessage }; +} +``` + +--- + +## Tauri Integration + +```rust +// src-tauri/src/main.rs +use tauri::Manager; + +#[tauri::command] +async fn create_agent(app: tauri::AppHandle, agent_id: String) -> Result<(), String> { + // Use sidecar process for Node.js agent runtime + let sidecar = app.shell() + .sidecar("agent-runtime") + .expect("failed to create sidecar"); + + sidecar.spawn().expect("failed to spawn sidecar"); + Ok(()) +} +``` + +```typescript +// agent-runtime/index.ts (sidecar) +// Same KODE SDK code as Electron main process +// Communicate via Tauri's shell commands +``` + +--- + +## Best Practices + +### 1. Data Directory + +```typescript +// Cross-platform data directory +import { app } from 'electron'; + +const getDataDir = () => { + // macOS: ~/Library/Application Support/YourApp/agents + // Windows: %APPDATA%/YourApp/agents + // Linux: ~/.config/YourApp/agents + return path.join(app.getPath('userData'), 'agents'); +}; +``` + +### 2. 
Workspace Integration + +```typescript +// Let user choose workspace +const workspace = await dialog.showOpenDialog({ + properties: ['openDirectory'], + title: 'Select Agent Workspace', +}); + +const sandbox = new LocalSandbox({ + workDir: workspace.filePaths[0], + allowedPaths: [workspace.filePaths[0]], // Restrict to selected folder +}); +``` + +### 3. Auto-update Agents + +```typescript +// On app update, migrate agent data if needed +app.on('ready', async () => { + const version = app.getVersion(); + const lastVersion = store.get('lastVersion'); + + if (lastVersion !== version) { + await migrateAgentData(lastVersion, version); + store.set('lastVersion', version); + } +}); +``` + +--- + +## Example: AI Writing Assistant + +```typescript +// Complete desktop writing assistant +const writingAssistant = await Agent.create({ + agentId: 'writing-assistant', + template: { + systemPrompt: `You are a writing assistant embedded in a desktop app. +You help users write, edit, and improve their documents. +You can read and write files in the user's workspace.`, + tools: [ + // Custom tool to interact with the editor + defineSimpleTool({ + name: 'insert_text', + description: 'Insert text at cursor position in the editor', + parameters: { + type: 'object', + properties: { + text: { type: 'string' }, + position: { type: 'number' }, + }, + required: ['text'], + }, + execute: async ({ text, position }) => { + // Send to renderer via IPC + mainWindow.webContents.send('editor:insert', { text, position }); + return 'Text inserted'; + }, + }), + ], + }, + // ... +}); +``` + +--- + +See [CLI Tools Guide](./cli-tools.md) for more patterns that apply to desktop apps. diff --git a/docs/scenarios/ide-plugins.md b/docs/scenarios/ide-plugins.md new file mode 100644 index 0000000..d27719c --- /dev/null +++ b/docs/scenarios/ide-plugins.md @@ -0,0 +1,359 @@ +# Scenario: IDE Plugins + +> Build VSCode, JetBrains, or other IDE extensions with AI coding assistants. 
+
+---
+
+## Why IDE Plugins Work Well
+
+| Feature | Benefit |
+|---------|---------|
+| Extension host process | Long-running, like desktop |
+| workspace.fs API | File operations available |
+| Single user context | No multi-tenancy |
+| Rich UI integration | WebView for chat, decorations for highlights |
+
+**Compatibility: 85%** - Requires workspace.fs integration.
+
+---
+
+## VSCode Extension
+
+### Extension Activation
+
+```typescript
+// src/extension.ts
+import * as vscode from 'vscode';
+import { Agent, AgentPool, AnthropicProvider } from '@anthropic/kode-sdk';
+import { VSCodeSandbox } from './vscode-sandbox';
+import { VSCodeStore } from './vscode-store';
+import { ChatViewProvider } from './chat-view-provider';
+
+let pool: AgentPool;
+
+export async function activate(context: vscode.ExtensionContext) {
+  // Store in extension's global storage
+  const store = new VSCodeStore(context.globalStorageUri);
+  pool = new AgentPool({ store, maxAgents: 5 });
+
+  // Register commands
+  context.subscriptions.push(
+    vscode.commands.registerCommand('myExtension.startChat', startChat),
+    vscode.commands.registerCommand('myExtension.explainCode', explainCode),
+    vscode.commands.registerCommand('myExtension.refactor', refactorCode),
+  );
+
+  // Create chat webview panel provider
+  context.subscriptions.push(
+    vscode.window.registerWebviewViewProvider('myExtension.chatView', new ChatViewProvider(pool))
+  );
+}
+
+export async function deactivate() {
+  // Save all agents before deactivation
+  for (const [id, agent] of pool.agents) {
+    await agent.persistInfo();
+  }
+}
+```
+
+### VSCode-Specific Sandbox
+
+```typescript
+// src/vscode-sandbox.ts
+import * as vscode from 'vscode';
+import { Sandbox, SandboxConfig } from '@anthropic/kode-sdk';
+
+export class VSCodeSandbox implements Sandbox {
+  private workspaceFolder: vscode.WorkspaceFolder;
+
+  constructor(workspaceFolder: vscode.WorkspaceFolder) {
+    this.workspaceFolder = workspaceFolder;
+  }
+
+  async readFile(relativePath: string): Promise<string> {
+    const uri = 
vscode.Uri.joinPath(this.workspaceFolder.uri, relativePath);
+    const content = await vscode.workspace.fs.readFile(uri);
+    return new TextDecoder().decode(content);
+  }
+
+  async writeFile(relativePath: string, content: string): Promise<void> {
+    const uri = vscode.Uri.joinPath(this.workspaceFolder.uri, relativePath);
+    await vscode.workspace.fs.writeFile(uri, new TextEncoder().encode(content));
+  }
+
+  async listFiles(pattern: string): Promise<string[]> {
+    const files = await vscode.workspace.findFiles(pattern);
+    return files.map(f => vscode.workspace.asRelativePath(f));
+  }
+
+  async executeCommand(command: string): Promise<{ stdout: string; stderr: string }> {
+    // Use VSCode's terminal API for command execution
+    const terminal = vscode.window.createTerminal({
+      name: 'Agent Command',
+      cwd: this.workspaceFolder.uri,
+    });
+
+    // Note: VSCode terminal doesn't return output directly
+    // Consider using child_process if extension has Node.js access
+    terminal.sendText(command);
+
+    return { stdout: 'Command sent to terminal', stderr: '' };
+  }
+}
+```
+
+### VSCode-Specific Store
+
+```typescript
+// src/vscode-store.ts
+import * as vscode from 'vscode';
+import { Store, Message, AgentInfo } from '@anthropic/kode-sdk';
+
+export class VSCodeStore implements Store {
+  constructor(private storageUri: vscode.Uri) {}
+
+  private getAgentUri(agentId: string, file: string): vscode.Uri {
+    return vscode.Uri.joinPath(this.storageUri, agentId, file);
+  }
+
+  async saveMessages(agentId: string, messages: Message[]): Promise<void> {
+    const uri = this.getAgentUri(agentId, 'messages.json');
+    const content = JSON.stringify(messages, null, 2);
+    await vscode.workspace.fs.writeFile(uri, new TextEncoder().encode(content));
+  }
+
+  async loadMessages(agentId: string): Promise<Message[]> {
+    try {
+      const uri = this.getAgentUri(agentId, 'messages.json');
+      const content = await vscode.workspace.fs.readFile(uri);
+      return JSON.parse(new TextDecoder().decode(content));
+    } catch {
+      return [];
+    }
+  }
+
+  // ... 
implement other Store methods
+}
+```
+
+### Chat WebView
+
+```typescript
+// src/chat-view-provider.ts
+import * as vscode from 'vscode';
+import { AgentPool } from '@anthropic/kode-sdk';
+
+export class ChatViewProvider implements vscode.WebviewViewProvider {
+  constructor(private pool: AgentPool) {}
+
+  resolveWebviewView(webviewView: vscode.WebviewView) {
+    webviewView.webview.options = { enableScripts: true };
+    webviewView.webview.html = this.getHtmlContent();
+
+    // Handle messages from webview
+    webviewView.webview.onDidReceiveMessage(async (message) => {
+      if (message.type === 'chat') {
+        const agent = await this.getOrCreateAgent();
+
+        // Stream responses to webview
+        agent.subscribeProgress({ kinds: ['text_chunk'] }, (event) => {
+          webviewView.webview.postMessage({
+            type: 'text_chunk',
+            text: event.text,
+          });
+        });
+
+        await agent.chat(message.text);
+
+        webviewView.webview.postMessage({ type: 'done' });
+      }
+    });
+  }
+
+  private getHtmlContent(): string {
+    // Minimal markup; adapt to your UI. The inline script posts user input to the
+    // extension and appends streamed text_chunk events to the message list.
+    return `
+      <!DOCTYPE html>
+      <html>
+      <body>
+        <div id="messages"></div>
+        <input id="input" type="text" placeholder="Ask the assistant..." />
+        <script>
+          const vscode = acquireVsCodeApi();
+          const input = document.getElementById('input');
+          input.addEventListener('keydown', (e) => {
+            if (e.key === 'Enter' && input.value.trim()) {
+              vscode.postMessage({ type: 'chat', text: input.value });
+              input.value = '';
+            }
+          });
+          window.addEventListener('message', (e) => {
+            if (e.data.type === 'text_chunk') {
+              document.getElementById('messages').textContent += e.data.text;
+            }
+          });
+        </script>
+      </body>
+      </html>
+ + + + `; + } +} +``` + +--- + +## Context-Aware Coding Assistant + +```typescript +// Provide code context to the agent +async function explainCode() { + const editor = vscode.window.activeTextEditor; + if (!editor) return; + + const selection = editor.selection; + const selectedText = editor.document.getText(selection); + const fileName = editor.document.fileName; + const languageId = editor.document.languageId; + + const agent = await getOrCreateAgent(); + + // Include file context + const context = ` +File: ${fileName} +Language: ${languageId} + +Selected code: +\`\`\`${languageId} +${selectedText} +\`\`\` +`; + + await agent.chat(`Explain this code:\n${context}`); +} + +// Inline code actions +async function refactorCode() { + const editor = vscode.window.activeTextEditor; + if (!editor) return; + + const selection = editor.selection; + const selectedText = editor.document.getText(selection); + + const agent = await getOrCreateAgent(); + + // Custom tool to apply edits + agent.registerTool({ + name: 'apply_edit', + description: 'Replace the selected code with improved version', + parameters: { + type: 'object', + properties: { + newCode: { type: 'string', description: 'The refactored code' }, + }, + required: ['newCode'], + }, + execute: async ({ newCode }) => { + await editor.edit(editBuilder => { + editBuilder.replace(selection, newCode); + }); + return 'Code replaced successfully'; + }, + }); + + await agent.chat(`Refactor this code to be cleaner and more efficient:\n${selectedText}`); +} +``` + +--- + +## JetBrains Plugin (Kotlin) + +```kotlin +// For JetBrains, run KODE SDK as a sidecar Node.js process +// and communicate via JSON-RPC or WebSocket + +class AgentService(private val project: Project) { + private var process: Process? 
= null + + fun start() { + val nodeScript = PluginUtil.getPluginPath() + "/agent-runtime/index.js" + process = ProcessBuilder("node", nodeScript) + .directory(File(project.basePath)) + .start() + + // Read output + thread { + process?.inputStream?.bufferedReader()?.forEachLine { line -> + handleAgentOutput(line) + } + } + } + + fun sendMessage(message: String) { + process?.outputStream?.let { + it.write("$message\n".toByteArray()) + it.flush() + } + } +} +``` + +--- + +## Best Practices for IDE Plugins + +### 1. Workspace-Scoped Agents + +```typescript +// One agent per workspace +function getAgentId(workspaceFolder: vscode.WorkspaceFolder): string { + return `workspace-${hashString(workspaceFolder.uri.toString())}`; +} +``` + +### 2. Respect User Settings + +```typescript +const config = vscode.workspace.getConfiguration('myExtension'); +const apiKey = config.get('apiKey'); +const modelId = config.get('model') || 'claude-sonnet-4-20250514'; +``` + +### 3. Progress Indication + +```typescript +await vscode.window.withProgress({ + location: vscode.ProgressLocation.Notification, + title: 'AI Assistant', + cancellable: true, +}, async (progress, token) => { + token.onCancellationRequested(() => { + agent.stop(); + }); + + progress.report({ message: 'Thinking...' }); + await agent.chat(message); +}); +``` + +### 4. Diagnostic Integration + +```typescript +// Show agent suggestions as diagnostics +const diagnosticCollection = vscode.languages.createDiagnosticCollection('ai-suggestions'); + +agent.subscribeProgress({ kinds: ['suggestion'] }, (event) => { + const diagnostic = new vscode.Diagnostic( + new vscode.Range(event.line, 0, event.line, 100), + event.message, + vscode.DiagnosticSeverity.Information + ); + diagnosticCollection.set(editor.document.uri, [diagnostic]); +}); +``` + +--- + +See [Desktop Apps Guide](./desktop-apps.md) for more patterns that apply to IDE plugins. 
diff --git a/docs/scenarios/large-scale-toc.md b/docs/scenarios/large-scale-toc.md new file mode 100644 index 0000000..3470b5e --- /dev/null +++ b/docs/scenarios/large-scale-toc.md @@ -0,0 +1,749 @@ +# Scenario: Large-Scale ToC Applications + +> Build ChatGPT/Manus-like applications serving thousands of concurrent users with hundreds of agents each. + +--- + +## The Challenge + +Building a consumer-facing AI application at scale requires solving: + +| Challenge | Description | +|-----------|-------------| +| **High Concurrency** | 10K+ users, each with multiple agents | +| **Agent Hibernation** | Inactive agents must sleep to save resources | +| **Crash Recovery** | Server restart must restore all running agents | +| **Fork Exploration** | Users fork agents to explore different paths | +| **Multi-Machine** | Scale horizontally across servers | +| **Serverless Frontend** | Deploy UI on Vercel/Cloudflare | + +**Direct KODE SDK Usage: Not Suitable** + +KODE SDK is designed as a runtime kernel, not a distributed platform. For large-scale ToC, you need the **Worker Microservice Pattern**. 
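+
+In this pattern the serverless API layer only enqueues work; stateful workers own the agents. The envelope that travels through the queue can be sketched as follows (field names are illustrative; an in-memory array stands in for Redis):
+
+```typescript
+interface ChatTask {
+  taskId: string;
+  agentId: string;
+  userId: string;
+  message: string;
+  timestamp: number;
+}
+
+const queue: string[] = [];
+
+// API layer: serialize and enqueue instead of running the agent inline (LPUSH equivalent).
+function enqueue(task: ChatTask): void {
+  queue.unshift(JSON.stringify(task));
+}
+
+// Worker: pop the oldest task and process it (BRPOP equivalent, non-blocking here).
+function dequeue(): ChatTask | undefined {
+  const raw = queue.pop();
+  return raw ? (JSON.parse(raw) as ChatTask) : undefined;
+}
+```
+
+Results flow back the other way: the worker publishes progress events keyed by `taskId`, and the API layer streams them to the client.
+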
+ +--- + +## Recommended Architecture + +``` ++------------------------------------------------------------------+ +| User Devices | ++------------------------------------------------------------------+ + | + v ++------------------------------------------------------------------+ +| CDN / Edge (Cloudflare) | ++------------------------------------------------------------------+ + | + v ++------------------------------------------------------------------+ +| API Gateway (Vercel/Cloudflare) | +| | +| /api/agents - List user's agents | +| /api/agents/:id - Get agent status | +| /api/chat - Send message (enqueue) | +| /api/fork - Fork agent (enqueue) | +| /api/stream/:id - SSE stream (from Redis) | ++------------------------------------------------------------------+ + | + v ++------------------------------------------------------------------+ +| Message Queue (Upstash Redis) | +| | +| Queue: agent:messages - Chat messages | +| Queue: agent:commands - Fork, hibernate, resume | +| PubSub: agent:events:{id} - Real-time events | ++------------------------------------------------------------------+ + | + +----------------+----------------+ + | | | + v v v ++------------------+ +------------------+ +------------------+ +| Worker Pool 1 | | Worker Pool 2 | | Worker Pool N | +| (Railway) | | (Railway) | | (Railway) | +| | | | | | +| +----------+ | | +----------+ | | +----------+ | +| | KODE SDK | | | | KODE SDK | | | | KODE SDK | | +| | Scheduler| | | | Scheduler| | | | Scheduler| | +| +----------+ | | +----------+ | | +----------+ | +| | | | | | +| Agents: 0-999 | | Agents: 1K-2K | | Agents: 2K-3K | ++--------+---------+ +--------+---------+ +--------+---------+ + | | | + +---------------------+---------------------+ + | + v ++------------------------------------------------------------------+ +| Distributed Store | +| | +| PostgreSQL (Supabase) - Agent state, messages, metadata | +| Redis Cluster - Locks, sessions, hot cache | +| S3/R2 - File attachments, archives | 
++------------------------------------------------------------------+ +``` + +--- + +## Component Implementation + +### 1. API Layer (Serverless) + +```typescript +// app/api/chat/route.ts (Next.js App Router) +import { NextRequest } from 'next/server'; +import { Redis } from '@upstash/redis'; +import { createClient } from '@supabase/supabase-js'; + +const redis = new Redis({ url: process.env.UPSTASH_URL!, token: process.env.UPSTASH_TOKEN! }); +const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!); + +export async function POST(req: NextRequest) { + // 1. Authenticate user + const user = await authenticateUser(req); + if (!user) { + return Response.json({ error: 'Unauthorized' }, { status: 401 }); + } + + // 2. Parse request + const { agentId, message } = await req.json(); + + // 3. Verify agent ownership + const { data: agent } = await supabase + .from('agents') + .select('id, user_id, state') + .eq('id', agentId) + .single(); + + if (!agent || agent.user_id !== user.id) { + return Response.json({ error: 'Agent not found' }, { status: 404 }); + } + + // 4. Create task and enqueue + const taskId = crypto.randomUUID(); + + await redis.lpush('agent:messages', JSON.stringify({ + taskId, + agentId, + userId: user.id, + message, + timestamp: Date.now(), + })); + + // 5. Update agent state + await supabase + .from('agents') + .update({ state: 'QUEUED', last_activity: new Date() }) + .eq('id', agentId); + + // 6. Return task ID for polling/streaming + return Response.json({ + taskId, + streamUrl: `/api/stream/${taskId}`, + }); +} +``` + +### 2. SSE Stream Endpoint + +```typescript +// app/api/stream/[taskId]/route.ts +import { Redis } from '@upstash/redis'; + +const redis = new Redis({ url: process.env.UPSTASH_URL!, token: process.env.UPSTASH_TOKEN! 
}); + +export async function GET( + req: NextRequest, + { params }: { params: { taskId: string } } +) { + const { taskId } = params; + + // Create SSE stream + const stream = new ReadableStream({ + async start(controller) { + const encoder = new TextEncoder(); + + // Subscribe to Redis PubSub + const subscriber = redis.duplicate(); + await subscriber.subscribe(`task:${taskId}:events`, (message) => { + const event = JSON.parse(message); + controller.enqueue(encoder.encode(`data: ${JSON.stringify(event)}\n\n`)); + + if (event.kind === 'done' || event.kind === 'error') { + controller.close(); + subscriber.unsubscribe(); + } + }); + + // Timeout after 5 minutes + setTimeout(() => { + controller.close(); + subscriber.unsubscribe(); + }, 5 * 60 * 1000); + }, + }); + + return new Response(stream, { + headers: { + 'Content-Type': 'text/event-stream', + 'Cache-Control': 'no-cache', + 'Connection': 'keep-alive', + }, + }); +} +``` + +### 3. Worker Service + +```typescript +// worker/index.ts +import { Agent, AgentPool } from '@anthropic/kode-sdk'; +import { Redis } from 'ioredis'; +import { PostgresStore } from './postgres-store'; +import { AgentScheduler } from './scheduler'; + +const redis = new Redis(process.env.REDIS_URL!); +const store = new PostgresStore(process.env.DATABASE_URL!); + +// Scheduler manages agent lifecycle +const scheduler = new AgentScheduler({ + maxActiveAgents: 100, // Per worker + idleTimeout: 5 * 60 * 1000, // 5 minutes + store, +}); + +// Process message queue +async function processMessages() { + while (true) { + // Blocking pop from queue + const result = await redis.brpop('agent:messages', 30); + + if (!result) continue; + + const task = JSON.parse(result[1]); + + try { + // Get or resume agent + const agent = await scheduler.getOrResume(task.agentId); + + // Subscribe to events and forward to Redis PubSub + agent.subscribeProgress({ kinds: ['text_chunk', 'tool:start', 'tool:complete', 'done'] }, async (event) => { + await 
redis.publish(`task:${task.taskId}:events`, JSON.stringify(event));
+      });
+
+      // Process message
+      await agent.chat(task.message);
+
+      // Publish completion
+      await redis.publish(`task:${task.taskId}:events`, JSON.stringify({
+        kind: 'done',
+        taskId: task.taskId,
+      }));
+
+    } catch (error) {
+      // Publish error
+      await redis.publish(`task:${task.taskId}:events`, JSON.stringify({
+        kind: 'error',
+        taskId: task.taskId,
+        error: error.message,
+      }));
+    }
+  }
+}
+
+// Start worker
+processMessages().catch(console.error);
+```
+
+### 4. Agent Scheduler
+
+```typescript
+// worker/scheduler.ts
+import { Agent } from '@anthropic/kode-sdk';
+import { LRUCache } from 'lru-cache';
+import { PostgresStore } from './postgres-store';
+
+interface SchedulerConfig {
+  maxActiveAgents: number;
+  idleTimeout: number;
+  store: PostgresStore;
+}
+
+export class AgentScheduler {
+  private active: LRUCache<string, Agent>;
+  private store: PostgresStore;
+  private config: SchedulerConfig;
+
+  constructor(config: SchedulerConfig) {
+    this.config = config;
+    this.store = config.store;
+
+    this.active = new LRUCache<string, Agent>({
+      max: config.maxActiveAgents,
+      dispose: (agent, agentId) => {
+        // Auto-hibernate when evicted
+        this.hibernate(agentId, agent);
+      },
+      ttl: config.idleTimeout,
+    });
+  }
+
+  async getOrResume(agentId: string): Promise<Agent> {
+    // Check active cache
+    if (this.active.has(agentId)) {
+      return this.active.get(agentId)!;
+    }
+
+    // Acquire distributed lock
+    const lockId = await this.store.acquireLock(agentId, 60000);
+    if (!lockId) {
+      throw new Error('Agent is being processed by another worker');
+    }
+
+    try {
+      // Resume from database
+      const agent = await Agent.resume(agentId, this.getConfig(agentId), this.getDeps());
+
+      // Cache in active pool
+      this.active.set(agentId, agent);
+
+      // Setup idle tracking
+      agent.onIdle(() => {
+        this.active.delete(agentId); // Triggers dispose -> hibernate
+      });
+
+      return agent;
+
+    } finally {
+      await this.store.releaseLock(agentId, lockId);
+    }
+  }
+
+  private async hibernate(agentId: string, agent: Agent): Promise<void> {
+    try {
+      await agent.persistInfo();
+      await this.store.updateAgentState(agentId, 
'HIBERNATED');
+      console.log(`Hibernated agent: ${agentId}`);
+    } catch (error) {
+      console.error(`Failed to hibernate ${agentId}:`, error);
+    }
+  }
+}
+```
+
+### 5. PostgreSQL Store
+
+```typescript
+// worker/postgres-store.ts
+import { Pool } from 'pg';
+import { randomUUID } from 'crypto';
+import { Store, Message, AgentInfo, ToolCallRecord } from '@anthropic/kode-sdk';
+
+export class PostgresStore implements Store {
+  private pool: Pool;
+
+  constructor(connectionString: string) {
+    this.pool = new Pool({
+      connectionString,
+      max: 20,
+      idleTimeoutMillis: 30000,
+    });
+  }
+
+  // Distributed lock using PostgreSQL Advisory Locks
+  async acquireLock(agentId: string, ttlMs: number): Promise<string | null> {
+    // hashAgentId maps the UUID to the bigint key advisory locks require (not shown)
+    const lockKey = this.hashAgentId(agentId);
+    const client = await this.pool.connect();
+
+    try {
+      const result = await client.query(
+        'SELECT pg_try_advisory_lock($1) as acquired',
+        [lockKey]
+      );
+
+      if (result.rows[0].acquired) {
+        const lockId = randomUUID();
+
+        // Set expiry using a separate table (TTL passed as a parameter,
+        // not interpolated into the SQL string)
+        await client.query(
+          `INSERT INTO agent_locks (agent_id, lock_id, expires_at)
+           VALUES ($1, $2, NOW() + $3 * interval '1 millisecond')
+           ON CONFLICT (agent_id) DO UPDATE SET lock_id = $2, expires_at = NOW() + $3 * interval '1 millisecond'`,
+          [agentId, lockId, ttlMs]
+        );
+
+        return lockId;
+      }
+
+      return null;
+    } finally {
+      client.release();
+    }
+  }
+
+  async releaseLock(agentId: string, lockId: string): Promise<void> {
+    const lockKey = this.hashAgentId(agentId);
+    const client = await this.pool.connect();
+
+    try {
+      // Verify lock ownership
+      const result = await client.query(
+        'SELECT lock_id FROM agent_locks WHERE agent_id = $1',
+        [agentId]
+      );
+
+      if (result.rows[0]?.lock_id === lockId) {
+        await client.query('SELECT pg_advisory_unlock($1)', [lockKey]);
+        await client.query('DELETE FROM agent_locks WHERE agent_id = $1', [agentId]);
+      }
+    } finally {
+      client.release();
+    }
+  }
+
+  // Messages stored as JSONB
+  async saveMessages(agentId: string, messages: Message[]): Promise<void> {
+    await
this.pool.query(
+      `INSERT INTO agent_messages (agent_id, messages, updated_at)
+       VALUES ($1, $2, NOW())
+       ON CONFLICT (agent_id) DO UPDATE SET messages = $2, updated_at = NOW()`,
+      [agentId, JSON.stringify(messages)]
+    );
+  }
+
+  async loadMessages(agentId: string): Promise<Message[]> {
+    const result = await this.pool.query(
+      'SELECT messages FROM agent_messages WHERE agent_id = $1',
+      [agentId]
+    );
+    return result.rows[0]?.messages || [];
+  }
+
+  // ... implement other Store methods
+}
+```
+
+---
+
+## Database Schema
+
+```sql
+-- Agents table
+CREATE TABLE agents (
+  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+  user_id UUID NOT NULL REFERENCES users(id),
+  name TEXT NOT NULL,
+  template_id TEXT NOT NULL,
+  state TEXT DEFAULT 'READY',
+  config JSONB DEFAULT '{}',
+  created_at TIMESTAMPTZ DEFAULT NOW(),
+  updated_at TIMESTAMPTZ DEFAULT NOW(),
+  last_activity TIMESTAMPTZ DEFAULT NOW()
+);
+
+CREATE INDEX idx_agents_user ON agents(user_id);
+CREATE INDEX idx_agents_state ON agents(state);
+
+-- Messages (one row per agent, JSONB array)
+CREATE TABLE agent_messages (
+  agent_id UUID PRIMARY KEY REFERENCES agents(id) ON DELETE CASCADE,
+  messages JSONB NOT NULL DEFAULT '[]',
+  version INTEGER DEFAULT 1,
+  updated_at TIMESTAMPTZ DEFAULT NOW()
+);
+
+-- Tool call records
+CREATE TABLE tool_calls (
+  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+  agent_id UUID NOT NULL REFERENCES agents(id) ON DELETE CASCADE,
+  name TEXT NOT NULL,
+  input JSONB,
+  result JSONB,
+  state TEXT DEFAULT 'PENDING',
+  started_at TIMESTAMPTZ,
+  completed_at TIMESTAMPTZ
+);
+
+CREATE INDEX idx_tool_calls_agent ON tool_calls(agent_id);
+
+-- Checkpoints (for fork)
+CREATE TABLE checkpoints (
+  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+  agent_id UUID NOT NULL REFERENCES agents(id) ON DELETE CASCADE,
+  parent_checkpoint_id UUID REFERENCES checkpoints(id),
+  snapshot JSONB NOT NULL,
+  tags TEXT[] DEFAULT '{}',
+  created_at TIMESTAMPTZ DEFAULT NOW()
+);
+
+CREATE INDEX idx_checkpoints_agent ON
checkpoints(agent_id); + +-- Distributed locks +CREATE TABLE agent_locks ( + agent_id UUID PRIMARY KEY, + lock_id UUID NOT NULL, + worker_id TEXT, + expires_at TIMESTAMPTZ NOT NULL +); + +-- Row Level Security +ALTER TABLE agents ENABLE ROW LEVEL SECURITY; + +CREATE POLICY "Users can only access own agents" ON agents + FOR ALL USING (auth.uid() = user_id); + +ALTER TABLE agent_messages ENABLE ROW LEVEL SECURITY; + +CREATE POLICY "Users can only access own agent messages" ON agent_messages + FOR ALL USING ( + agent_id IN (SELECT id FROM agents WHERE user_id = auth.uid()) + ); +``` + +--- + +## Handling Special Scenarios + +### Agent Hibernation (Inactive Users) + +```typescript +// Cron job: Check for idle agents every 5 minutes +async function hibernateIdleAgents() { + const idleThreshold = new Date(Date.now() - 30 * 60 * 1000); // 30 minutes + + const { data: idleAgents } = await supabase + .from('agents') + .select('id') + .eq('state', 'ACTIVE') + .lt('last_activity', idleThreshold.toISOString()); + + for (const agent of idleAgents || []) { + await redis.lpush('agent:commands', JSON.stringify({ + command: 'hibernate', + agentId: agent.id, + })); + } +} +``` + +### Server Crash Recovery + +```typescript +// On worker startup +async function recoverFromCrash() { + // Find agents that were being processed by this worker + const { data: orphanedAgents } = await supabase + .from('agents') + .select('id') + .eq('state', 'PROCESSING') + .eq('worker_id', WORKER_ID); + + for (const agent of orphanedAgents || []) { + console.log(`Recovering agent: ${agent.id}`); + + // Resume with crash strategy + const recovered = await Agent.resume(agent.id, config, deps, { + strategy: 'crash', // Auto-seal incomplete tool calls + }); + + // Re-queue for processing + await redis.lpush('agent:messages', JSON.stringify({ + taskId: `recovery-${agent.id}`, + agentId: agent.id, + message: null, // No new message, just recover + isRecovery: true, + })); + } +} + +// Call on startup 
+recoverFromCrash(); +``` + +### Fork Multiple Agents + +```typescript +// API endpoint for forking +export async function POST(req: NextRequest) { + const { agentId, checkpointId, count = 1 } = await req.json(); + + // Validate: max 10 forks at once + if (count > 10) { + return Response.json({ error: 'Max 10 forks at once' }, { status: 400 }); + } + + const forkIds: string[] = []; + + for (let i = 0; i < count; i++) { + const forkId = `${agentId}-fork-${Date.now()}-${i}`; + + await redis.lpush('agent:commands', JSON.stringify({ + command: 'fork', + agentId, + checkpointId, + forkId, + })); + + forkIds.push(forkId); + } + + return Response.json({ forkIds }); +} + +// Worker handles fork command +async function handleForkCommand(command: ForkCommand) { + const parent = await scheduler.getOrResume(command.agentId); + + const forked = await parent.fork(command.checkpointId); + + // Store forked agent + await supabase.from('agents').insert({ + id: command.forkId, + user_id: parent.userId, + parent_agent_id: command.agentId, + // ... 
copy other fields
+  });
+}
+```
+
+### Membership Expiry
+
+```typescript
+// Webhook from payment provider
+export async function POST(req: NextRequest) {
+  const event = await req.json();
+
+  if (event.type === 'subscription.cancelled') {
+    const userId = event.data.user_id;
+
+    // Snapshot active agents BEFORE flipping state,
+    // otherwise the query below would find none
+    const { data: activeAgents } = await supabase
+      .from('agents')
+      .select('id')
+      .eq('user_id', userId)
+      .eq('state', 'ACTIVE');
+
+    // Pause all user's agents
+    await supabase
+      .from('agents')
+      .update({ state: 'MEMBERSHIP_PAUSED' })
+      .eq('user_id', userId);
+
+    // Hibernate the agents that were active
+    for (const agent of activeAgents || []) {
+      await redis.lpush('agent:commands', JSON.stringify({
+        command: 'hibernate',
+        agentId: agent.id,
+        reason: 'membership_expired',
+      }));
+    }
+  }
+
+  if (event.type === 'subscription.renewed') {
+    const userId = event.data.user_id;
+
+    // Unpause all agents
+    await supabase
+      .from('agents')
+      .update({ state: 'HIBERNATED' })
+      .eq('user_id', userId)
+      .eq('state', 'MEMBERSHIP_PAUSED');
+  }
+
+  return Response.json({ received: true });
+}
+```
+
+---
+
+## Performance Considerations
+
+### Message Storage Optimization
+
+```typescript
+// Instead of storing the full messages array,
+// use an append-only log with periodic compaction
+
+class OptimizedMessageStore {
+  constructor(private pool: Pool) {}
+
+  async appendMessage(agentId: string, message: Message) {
+    // Append to log table (fast)
+    await this.pool.query(
+      `INSERT INTO message_log (agent_id, seq, message)
+       VALUES ($1, nextval('message_seq'), $2)`,
+      [agentId, JSON.stringify(message)]
+    );
+
+    // Increment message count
+    await this.pool.query(
+      `UPDATE agents SET message_count = message_count + 1 WHERE id = $1`,
+      [agentId]
+    );
+  }
+
+  async loadMessages(agentId: string, limit = 100): Promise<Message[]> {
+    // Load latest messages (pagination)
+    const result = await this.pool.query(
+      `SELECT message FROM message_log
+       WHERE agent_id = $1
+       ORDER BY seq DESC
+       LIMIT $2`,
+      [agentId, limit]
+    );
+
+    return result.rows.reverse().map(r => r.message);
+  }
+}
+```
+
+### Fork Optimization
(Copy-on-Write)
+
+```typescript
+// Fork without copying all messages
+// (pool is the shared pg Pool; this is a standalone function, not a method)
+async function forkAgentCOW(agentId: string, checkpointId: string): Promise<string> {
+  const forkId = generateForkId();
+
+  // Copy only metadata, reference same message log
+  await pool.query(
+    `INSERT INTO agents (id, user_id, template_id, config, fork_base_checkpoint_id)
+     SELECT $1, user_id, template_id, config, $2
+     FROM agents WHERE id = $3`,
+    [forkId, checkpointId, agentId]
+  );
+
+  // New messages go to the fork's own log;
+  // old messages are read via the checkpoint reference
+
+  return forkId;
+}
+```
+
+---
+
+## Deployment Checklist
+
+- [ ] API layer deployed to Vercel/Cloudflare
+- [ ] Workers deployed to Railway/Render/Fly.io
+- [ ] PostgreSQL (Supabase) configured with RLS
+- [ ] Redis (Upstash) for queues and pub/sub
+- [ ] S3/R2 for file attachments
+- [ ] Monitoring (Sentry, DataDog, etc.)
+- [ ] Rate limiting configured
+- [ ] Graceful shutdown handlers
+- [ ] Health check endpoints
+- [ ] Auto-scaling rules for workers
+
+---
+
+## Cost Estimation
+
+| Component | ~10K Users | ~100K Users |
+|-----------|------------|-------------|
+| Vercel (API) | $20/mo | $100/mo |
+| Railway (Workers) | $50/mo | $500/mo |
+| Supabase (PostgreSQL) | $25/mo | $100/mo |
+| Upstash (Redis) | $10/mo | $50/mo |
+| **Total** | **~$100/mo** | **~$750/mo** |
+
+*Excludes LLM API costs*
+
+---
+
+## Summary
+
+Building a large-scale ToC application with KODE SDK requires:
+
+1. **Separate concerns**: Stateless API + stateful workers
+2. **Queue-based communication**: Decouple request handling from agent execution
+3. **Distributed store**: PostgreSQL for persistence, Redis for real-time
+4. **Agent scheduling**: LRU cache for active agents, hibernate inactive ones
+5. **Crash recovery**: WAL + checkpoints for resilience
+
+KODE SDK provides the agent runtime kernel. You build the platform around it.
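The worker loop consumes tasks with `BRPOP` on `agent:messages`, but the producer half of that contract lives in the API layer and is not shown above. A minimal sketch of the enqueue path, with the queue injected so it can be exercised without Redis (`ChatTask`, `TaskQueue`, and `enqueueChat` are illustrative names, not SDK APIs):

```typescript
// Shape of the payload the worker's BRPOP loop expects (illustrative).
interface ChatTask {
  taskId: string;
  agentId: string;
  message: string;
}

// Minimal queue abstraction so the enqueue path is testable without Redis;
// an ioredis client satisfies this shape.
interface TaskQueue {
  lpush(key: string, value: string): Promise<number>;
}

const QUEUE_KEY = 'agent:messages';

async function enqueueChat(
  queue: TaskQueue,
  agentId: string,
  message: string
): Promise<string> {
  const task: ChatTask = {
    taskId: `task-${agentId}-${Date.now()}`,
    agentId,
    message,
  };

  // LPUSH on the producer side + BRPOP on the worker side gives FIFO ordering.
  await queue.lpush(QUEUE_KEY, JSON.stringify(task));
  return task.taskId;
}
```

In production `queue` would be the shared ioredis client; the returned `taskId` is what the browser then uses to open the SSE stream at `GET /tasks/:taskId/events`.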
diff --git a/docs/scenarios/web-backend.md b/docs/scenarios/web-backend.md new file mode 100644 index 0000000..3b81000 --- /dev/null +++ b/docs/scenarios/web-backend.md @@ -0,0 +1,444 @@ +# Scenario: Web Backend (Self-Hosted) + +> Deploy KODE SDK on your own servers for small to medium web applications. + +--- + +## When to Use This Pattern + +| Criteria | Threshold | +|----------|-----------| +| Concurrent users | < 1,000 | +| Concurrent agents | < 100 | +| Infrastructure | Single server / small cluster | +| Complexity | Moderate | + +**Compatibility: 80%** - Need to add HTTP layer and user isolation. + +--- + +## Architecture + +``` ++------------------------------------------------------------------+ +| Your Server | ++------------------------------------------------------------------+ +| | +| +------------------+ +------------------+ | +| | HTTP Layer | | KODE SDK | | +| | (Express/Hono) |---->| AgentPool | | +| +------------------+ +------------------+ | +| | | +| +------------------+ +------v------+ | +| | Auth Layer | | Store | | +| | (Passport/etc) | | (Redis/PG) | | +| +------------------+ +-------------+ | +| | ++------------------------------------------------------------------+ +``` + +--- + +## Express.js Integration + +```typescript +// server.ts +import express from 'express'; +import { Agent, AgentPool, AnthropicProvider, LocalSandbox } from '@anthropic/kode-sdk'; +import { RedisStore } from './redis-store'; // Custom implementation + +const app = express(); +const store = new RedisStore(process.env.REDIS_URL!); +const pool = new AgentPool({ store, maxAgents: 100 }); + +app.use(express.json()); + +// Middleware: Auth +app.use(async (req, res, next) => { + const token = req.headers.authorization?.replace('Bearer ', ''); + req.user = await verifyToken(token); + next(); +}); + +// Create agent for user +app.post('/api/agents', async (req, res) => { + const { name, systemPrompt } = req.body; + const agentId = `${req.user.id}-${Date.now()}`; + + const 
agent = await pool.create(agentId, { + template: { systemPrompt }, + config: { + metadata: { userId: req.user.id, name }, + }, + }, { + modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!), + sandbox: new LocalSandbox({ workDir: `/tmp/agents/${agentId}` }), + }); + + res.json({ agentId, name }); +}); + +// List user's agents +app.get('/api/agents', async (req, res) => { + const agents = await store.listAgentsByUser(req.user.id); + res.json(agents); +}); + +// Chat with agent +app.post('/api/agents/:agentId/chat', async (req, res) => { + const { agentId } = req.params; + const { message } = req.body; + + // Verify ownership + const info = await store.loadInfo(agentId); + if (info?.metadata?.userId !== req.user.id) { + return res.status(403).json({ error: 'Forbidden' }); + } + + // Get or resume agent + let agent = pool.get(agentId); + if (!agent) { + agent = await pool.resume(agentId, { + template: { systemPrompt: info.systemPrompt }, + }, { + modelProvider: new AnthropicProvider(process.env.ANTHROPIC_API_KEY!), + sandbox: new LocalSandbox({ workDir: `/tmp/agents/${agentId}` }), + }); + } + + // Stream response via SSE + res.setHeader('Content-Type', 'text/event-stream'); + res.setHeader('Cache-Control', 'no-cache'); + res.setHeader('Connection', 'keep-alive'); + + agent.subscribeProgress({ kinds: ['text_chunk', 'tool:start', 'tool:complete', 'done'] }, (event) => { + res.write(`data: ${JSON.stringify(event)}\n\n`); + + if (event.kind === 'done') { + res.end(); + } + }); + + await agent.chat(message); +}); + +// Graceful shutdown +process.on('SIGTERM', async () => { + console.log('Shutting down...'); + for (const [id, agent] of pool.agents) { + await agent.persistInfo(); + } + process.exit(0); +}); + +app.listen(3000, () => console.log('Server running on :3000')); +``` + +--- + +## Hono (Edge-Compatible) + +```typescript +// server.ts +import { Hono } from 'hono'; +import { serve } from '@hono/node-server'; +import { streamSSE } from 
'hono/streaming';
+
+const app = new Hono();
+
+app.post('/api/agents/:agentId/chat', async (c) => {
+  const agentId = c.req.param('agentId');
+  const { message } = await c.req.json();
+
+  const agent = await getOrResumeAgent(agentId);
+
+  return streamSSE(c, async (stream) => {
+    agent.subscribeProgress({ kinds: ['text_chunk', 'done'] }, async (event) => {
+      await stream.writeSSE({ data: JSON.stringify(event) });
+
+      if (event.kind === 'done') {
+        await stream.close();
+      }
+    });
+
+    await agent.chat(message);
+  });
+});
+
+serve(app, { port: 3000 });
+```
+
+---
+
+## User Isolation
+
+### Per-User Agent Namespace
+
+```typescript
+// Prefix all agent IDs with user ID
+function getAgentId(userId: string, localId: string): string {
+  return `user:${userId}:agent:${localId}`;
+}
+
+// List only user's agents
+async function listUserAgents(userId: string): Promise<AgentInfo[]> {
+  const allAgents = await store.listAgents();
+  return allAgents.filter(a => a.metadata?.userId === userId);
+}
+```
+
+### Per-User Sandbox Isolation
+
+```typescript
+// Each user gets an isolated workspace
+function getUserSandbox(userId: string, agentId: string): LocalSandbox {
+  const workDir = path.join('/data/workspaces', userId, agentId);
+
+  return new LocalSandbox({
+    workDir,
+    allowedPaths: [workDir], // Restrict to user's directory only
+    env: {
+      USER_ID: userId,
+      AGENT_ID: agentId,
+    },
+  });
+}
+```
+
+---
+
+## Redis Store Implementation
+
+```typescript
+// redis-store.ts
+import Redis from 'ioredis';
+import { Store, Message, AgentInfo, ToolCallRecord } from '@anthropic/kode-sdk';
+
+export class RedisStore implements Store {
+  private redis: Redis;
+
+  constructor(url: string) {
+    this.redis = new Redis(url);
+  }
+
+  async saveMessages(agentId: string, messages: Message[]): Promise<void> {
+    await this.redis.set(`agent:${agentId}:messages`, JSON.stringify(messages));
+  }
+
+  async loadMessages(agentId: string): Promise<Message[]> {
+    const data = await this.redis.get(`agent:${agentId}:messages`);
+    return data ? JSON.parse(data) : [];
+  }
+
+  async saveInfo(agentId: string, info: AgentInfo): Promise<void> {
+    await this.redis.set(`agent:${agentId}:info`, JSON.stringify(info));
+    // Add to user's agent list
+    if (info.metadata?.userId) {
+      await this.redis.sadd(`user:${info.metadata.userId}:agents`, agentId);
+    }
+  }
+
+  async loadInfo(agentId: string): Promise<AgentInfo | undefined> {
+    const data = await this.redis.get(`agent:${agentId}:info`);
+    return data ? JSON.parse(data) : undefined;
+  }
+
+  async listAgentsByUser(userId: string): Promise<AgentInfo[]> {
+    const agentIds = await this.redis.smembers(`user:${userId}:agents`);
+    const infos = await Promise.all(agentIds.map(id => this.loadInfo(id)));
+    return infos.filter(Boolean) as AgentInfo[];
+  }
+
+  async deleteAgent(agentId: string): Promise<void> {
+    const info = await this.loadInfo(agentId);
+    if (info?.metadata?.userId) {
+      await this.redis.srem(`user:${info.metadata.userId}:agents`, agentId);
+    }
+    await this.redis.del(
+      `agent:${agentId}:messages`,
+      `agent:${agentId}:info`,
+      `agent:${agentId}:tools`,
+    );
+  }
+}
+```
+
+---
+
+## Rate Limiting
+
+```typescript
+import rateLimit from 'express-rate-limit';
+
+// Global rate limit
+app.use(rateLimit({
+  windowMs: 60 * 1000, // 1 minute
+  max: 60,             // 60 requests per minute
+}));
+
+// Per-user rate limit for chat
+const chatLimiter = rateLimit({
+  windowMs: 60 * 1000,
+  max: 20, // 20 chat requests per minute per user
+  keyGenerator: (req) => req.user?.id || req.ip,
+});
+
+app.post('/api/agents/:agentId/chat', chatLimiter, async (req, res) => {
+  // ...
+});
+
+// Token-based rate limiting
+const tokenTracker = new Map<string, number>();
+
+agent.subscribeMonitor((event) => {
+  if (event.kind === 'token_usage') {
+    const userId = agent.metadata?.userId;
+    const total = (tokenTracker.get(userId) || 0) + event.totalTokens;
+    tokenTracker.set(userId, total);
+
+    if (total > DAILY_TOKEN_LIMIT) {
+      // Stop the agent rather than throwing from an event callback
+      agent.stop();
+    }
+  }
+});
+```
+
+---
+
+## Health Checks
+
+```typescript
+// Health check endpoint
+app.get('/health', async (req, res) => {
+  const checks = {
+    redis: await checkRedis(),
+    agents: pool.agents.size,
+    memory: process.memoryUsage(),
+  };
+
+  const healthy = checks.redis;
+  res.status(healthy ? 200 : 503).json(checks);
+});
+
+async function checkRedis(): Promise<boolean> {
+  try {
+    await redis.ping();
+    return true;
+  } catch {
+    return false;
+  }
+}
+
+// Kubernetes readiness probe
+app.get('/ready', (req, res) => {
+  res.sendStatus(200);
+});
+
+// Kubernetes liveness probe
+app.get('/live', (req, res) => {
+  res.sendStatus(200);
+});
+```
+
+---
+
+## Docker Deployment
+
+```dockerfile
+# Dockerfile
+FROM node:20-alpine
+
+WORKDIR /app
+
+COPY package*.json ./
+RUN npm ci --omit=dev
+
+COPY dist ./dist
+
+ENV NODE_ENV=production
+
+EXPOSE 3000
+
+CMD ["node", "dist/server.js"]
+```
+
+```yaml
+# docker-compose.yml
+version: '3.8'
+
+services:
+  api:
+    build: .
+    ports:
+      - "3000:3000"
+    environment:
+      - REDIS_URL=redis://redis:6379
+      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
+    depends_on:
+      - redis
+    deploy:
+      replicas: 2
+      restart_policy:
+        condition: on-failure
+
+  redis:
+    image: redis:7-alpine
+    volumes:
+      - redis-data:/data
+
+volumes:
+  redis-data:
+```
+
+---
+
+## Scaling to Multiple Instances
+
+When running multiple server instances:
+
+### 1.
Use Redis for Session Affinity
+
+```typescript
+// Store which server handles which agent
+await redis.set(`agent:${agentId}:server`, SERVER_ID, 'EX', 3600);
+
+// Check before resuming
+const currentServer = await redis.get(`agent:${agentId}:server`);
+if (currentServer && currentServer !== SERVER_ID) {
+  // Agent is on another server, redirect or wait
+}
+```
+
+### 2. Distributed Locking
+
+```typescript
+import Redlock from 'redlock';
+
+const redlock = new Redlock([redis]);
+
+app.post('/api/agents/:agentId/chat', async (req, res) => {
+  const { agentId } = req.params;
+  const { message } = req.body;
+
+  const lock = await redlock.acquire([`lock:agent:${agentId}`], 30000);
+
+  try {
+    // Only one server can process this agent at a time
+    const agent = await getOrResumeAgent(agentId);
+    await agent.chat(message);
+    res.json({ ok: true });
+  } finally {
+    await lock.release();
+  }
+});
+```
+
+---
+
+## Migration Path to Large Scale
+
+When you outgrow single-server deployment:
+
+1. **Add message queue** - Decouple API from processing
+2. **Separate workers** - Run agents in dedicated processes
+3. **Use PostgreSQL** - Replace Redis for primary storage
+4. **Add agent scheduler** - Manage agent lifecycle
+
+See [Large-Scale ToC Guide](./large-scale-toc.md) for the full architecture.
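The session-affinity snippet leaves the "redirect or wait" branch as a comment. The decision itself is pure and easy to unit-test; a sketch under the same keying scheme (`routeAgentRequest` and `RouteDecision` are illustrative names, not SDK APIs):

```typescript
type RouteDecision =
  | { action: 'handle' }                // this instance owns (or can claim) the agent
  | { action: 'redirect'; to: string }; // another instance currently pins the agent

// pinnedServer is the value of `GET agent:<id>:server`, or null if the key
// expired (the owning instance died or went idle past the TTL).
function routeAgentRequest(pinnedServer: string | null, selfId: string): RouteDecision {
  if (pinnedServer === null || pinnedServer === selfId) {
    // No live pin, or we already own it: handle locally (and refresh the pin).
    return { action: 'handle' };
  }
  return { action: 'redirect', to: pinnedServer };
}
```

An HTTP layer can map `redirect` to a 307 pointing at the pinned instance, or fall back to the distributed-lock path shown earlier when instances are not directly addressable.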
diff --git a/package-lock.json b/package-lock.json index 3e9b25b..1d8a384 100644 --- a/package-lock.json +++ b/package-lock.json @@ -108,7 +108,6 @@ "resolved": "https://registry.npmjs.org/zod/-/zod-3.25.76.tgz", "integrity": "sha512-gzUt/qt81nXsFGKIFcC3YnfEAx5NkunCfnDlvuBSSFS02bcXu4Lmea0AFIUwbLWxWPx3d9p8S5QoaujKcNQxcQ==", "license": "MIT", - "peer": true, "funding": { "url": "https://github.com/sponsors/colinhacks" } @@ -201,7 +200,6 @@ "integrity": "sha512-N2clP5pJhB2YnZJ3PIHFk5RkygRX5WO/5f0WC08tp0wd+sv0rsJk3MqWn3CbNmT2J505a5336jaQj4ph1AdMug==", "dev": true, "license": "MIT", - "peer": true, "dependencies": { "undici-types": "~6.21.0" } @@ -728,7 +726,6 @@ "resolved": "https://registry.npmjs.org/express/-/express-5.2.1.tgz", "integrity": "sha512-hIS4idWWai69NezIdRt2xFVofaF4j+6INOpJlVOLDO8zXGpUVEVzIYk12UUi2JzjEzWL3IOAxcTubgz9Po0yXw==", "license": "MIT", - "peer": true, "dependencies": { "accepts": "^2.0.0", "body-parser": "^2.2.1", @@ -1336,7 +1333,6 @@ "resolved": "https://registry.npmmirror.com/pg/-/pg-8.17.2.tgz", "integrity": "sha512-vjbKdiBJRqzcYw1fNU5KuHyYvdJ1qpcQg1CeBrHFqV1pWgHeVR6j/+kX0E1AAXfyuLUGY1ICrN2ELKA/z2HWzw==", "license": "MIT", - "peer": true, "dependencies": { "pg-connection-string": "^2.10.1", "pg-pool": "^3.11.0", @@ -2064,7 +2060,6 @@ "integrity": "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw==", "dev": true, "license": "Apache-2.0", - "peer": true, "bin": { "tsc": "bin/tsc", "tsserver": "bin/tsserver" @@ -2165,7 +2160,6 @@ "resolved": "https://registry.npmjs.org/zod/-/zod-3.23.8.tgz", "integrity": "sha512-XBx9AXhXktjUqnepgTiE5flcKIYWi/rme0Eaj+5Y0lftuGBq+jyRu/md4WnuxqgP1ubdpNCsYEYPxrzVHD8d6g==", "license": "MIT", - "peer": true, "funding": { "url": "https://github.com/sponsors/colinhacks" } diff --git a/src/core/pool.ts b/src/core/pool.ts index eb4cfc2..d2ee827 100644 --- a/src/core/pool.ts +++ b/src/core/pool.ts @@ -7,6 +7,33 @@ export interface AgentPoolOptions { maxAgents?: number; } +export 
interface GracefulShutdownOptions {
+  /** Maximum time to wait for agents to complete current step (ms), default 30000 */
+  timeout?: number;
+  /** Save running agents list for resumeFromShutdown(), default true */
+  saveRunningList?: boolean;
+  /** Force interrupt agents that don't complete within timeout, default true */
+  forceInterrupt?: boolean;
+}
+
+export interface ShutdownResult {
+  /** Agents that completed gracefully */
+  completed: string[];
+  /** Agents that were interrupted due to timeout */
+  interrupted: string[];
+  /** Agents that failed to save state */
+  failed: string[];
+  /** Total shutdown time in ms */
+  durationMs: number;
+}
+
+/** Running agents metadata for recovery */
+interface RunningAgentsMeta {
+  agentIds: string[];
+  shutdownAt: string;
+  version: string;
+}
+
 export class AgentPool {
   private agents = new Map<string, Agent>();
   private deps: AgentDependencies;
@@ -111,4 +138,241 @@
   size(): number {
     return this.agents.size;
   }
+
+  /**
+   * Gracefully shut down all agents in the pool
+   * 1. Stop accepting new operations
+   * 2. Wait for running agents to complete their current step
+   * 3. Persist all agent states
+   * 4. Optionally save the running agents list for recovery
+   */
+  async gracefulShutdown(opts?: GracefulShutdownOptions): Promise<ShutdownResult> {
+    const startTime = Date.now();
+    const timeout = opts?.timeout ?? 30000;
+    const saveRunningList = opts?.saveRunningList ?? true;
+    const forceInterrupt = opts?.forceInterrupt ??
true; + + const result: ShutdownResult = { + completed: [], + interrupted: [], + failed: [], + durationMs: 0, + }; + + const agentIds = Array.from(this.agents.keys()); + logger.info(`[AgentPool] Starting graceful shutdown for ${agentIds.length} agents`); + + // Group agents by state + const workingAgents: Array<{ id: string; agent: Agent }> = []; + const readyAgents: Array<{ id: string; agent: Agent }> = []; + + for (const [id, agent] of this.agents) { + const status = await agent.status(); + if (status.state === 'WORKING') { + workingAgents.push({ id, agent }); + } else { + readyAgents.push({ id, agent }); + } + } + + // 1. Persist ready agents immediately + for (const { id, agent } of readyAgents) { + try { + await this.persistAgentState(agent); + result.completed.push(id); + } catch (error) { + logger.error(`[AgentPool] Failed to persist agent ${id}:`, error); + result.failed.push(id); + } + } + + // 2. Wait for working agents with timeout + if (workingAgents.length > 0) { + logger.info(`[AgentPool] Waiting for ${workingAgents.length} working agents...`); + + const waitPromises = workingAgents.map(async ({ id, agent }) => { + try { + const completed = await this.waitForAgentReady(agent, timeout); + if (completed) { + await this.persistAgentState(agent); + return { id, status: 'completed' as const }; + } else if (forceInterrupt) { + await agent.interrupt({ note: 'Graceful shutdown timeout' }); + await this.persistAgentState(agent); + return { id, status: 'interrupted' as const }; + } else { + return { id, status: 'interrupted' as const }; + } + } catch (error) { + logger.error(`[AgentPool] Error during shutdown for agent ${id}:`, error); + return { id, status: 'failed' as const }; + } + }); + + const results = await Promise.all(waitPromises); + for (const { id, status } of results) { + if (status === 'completed') { + result.completed.push(id); + } else if (status === 'interrupted') { + result.interrupted.push(id); + } else { + result.failed.push(id); + } + } + } 
+
+    // 3. Save running agents list for recovery
+    if (saveRunningList) {
+      try {
+        await this.saveRunningAgentsList(agentIds);
+        logger.info(`[AgentPool] Saved running agents list: ${agentIds.length} agents`);
+      } catch (error) {
+        logger.error(`[AgentPool] Failed to save running agents list:`, error);
+      }
+    }
+
+    result.durationMs = Date.now() - startTime;
+    logger.info(`[AgentPool] Graceful shutdown completed in ${result.durationMs}ms`, {
+      completed: result.completed.length,
+      interrupted: result.interrupted.length,
+      failed: result.failed.length,
+    });
+
+    return result;
+  }
+
+  /**
+   * Resume agents from a previous graceful shutdown
+   * Reads the running agents list and resumes each agent
+   */
+  async resumeFromShutdown(
+    configFactory: (agentId: string) => AgentConfig,
+    opts?: { autoRun?: boolean; strategy?: 'crash' | 'manual' }
+  ): Promise<Agent[]> {
+    const runningList = await this.loadRunningAgentsList();
+    if (!runningList || runningList.length === 0) {
+      logger.info('[AgentPool] No running agents list found, nothing to resume');
+      return [];
+    }
+
+    logger.info(`[AgentPool] Resuming ${runningList.length} agents from shutdown`);
+
+    const resumed: Agent[] = [];
+    for (const agentId of runningList) {
+      if (this.agents.size >= this.maxAgents) {
+        logger.warn(`[AgentPool] Pool is full, cannot resume more agents`);
+        break;
+      }
+
+      try {
+        const config = configFactory(agentId);
+        const agent = await this.resume(agentId, config, {
+          autoRun: opts?.autoRun ?? false,
+          strategy: opts?.strategy ??
'crash',
+        });
+        resumed.push(agent);
+      } catch (error) {
+        logger.error(`[AgentPool] Failed to resume agent ${agentId}:`, error);
+      }
+    }
+
+    // Clear the running agents list after the resume pass
+    await this.clearRunningAgentsList();
+
+    logger.info(`[AgentPool] Resumed ${resumed.length}/${runningList.length} agents`);
+    return resumed;
+  }
+
+  /**
+   * Register signal handlers for graceful shutdown
+   * Call this in your server setup code
+   */
+  registerShutdownHandlers(
+    configFactory?: (agentId: string) => AgentConfig,
+    opts?: GracefulShutdownOptions
+  ): void {
+    const handler = async (signal: string) => {
+      logger.info(`[AgentPool] Received ${signal}, initiating graceful shutdown...`);
+      try {
+        const result = await this.gracefulShutdown(opts);
+        logger.info(`[AgentPool] Shutdown complete:`, result);
+        process.exit(0);
+      } catch (error) {
+        logger.error(`[AgentPool] Shutdown failed:`, error);
+        process.exit(1);
+      }
+    };
+
+    process.on('SIGTERM', () => handler('SIGTERM'));
+    process.on('SIGINT', () => handler('SIGINT'));
+    logger.info('[AgentPool] Shutdown handlers registered for SIGTERM and SIGINT');
+  }
+
+  // ========== Private Helper Methods ==========
+
+  private async waitForAgentReady(agent: Agent, timeout: number): Promise<boolean> {
+    const startTime = Date.now();
+    const pollInterval = 100; // ms
+
+    while (Date.now() - startTime < timeout) {
+      const status = await agent.status();
+      if (status.state !== 'WORKING') {
+        return true;
+      }
+      await this.sleep(pollInterval);
+    }
+
+    return false;
+  }
+
+  private async persistAgentState(agent: Agent): Promise<void> {
+    // Agent's internal persist methods are private, so we rely on the fact that
+    // state is automatically persisted during normal operation.
+    // This is a no-op placeholder for potential future explicit persist calls.
+    // The agent's state is already persisted via the WAL mechanism.
+  }
+
+  private async saveRunningAgentsList(agentIds: string[]): Promise<void> {
+    const meta: RunningAgentsMeta = {
+      agentIds,
+      shutdownAt: new Date().toISOString(),
+      version: '1.0.0',
+    };
+
+    // Use the store's saveInfo to persist to a special key:
+    // a well-known agent ID reserved for pool metadata
+    const poolMetaId = '__pool_meta__';
+    await this.deps.store.saveInfo(poolMetaId, {
+      agentId: poolMetaId,
+      templateId: '__pool_meta__',
+      createdAt: new Date().toISOString(),
+      runningAgents: meta,
+    } as any);
+  }
+
+  private async loadRunningAgentsList(): Promise<string[] | null> {
+    const poolMetaId = '__pool_meta__';
+    try {
+      const info = await this.deps.store.loadInfo(poolMetaId);
+      if (info && (info as any).runningAgents) {
+        return (info as any).runningAgents.agentIds;
+      }
+    } catch {
+      // Ignore errors, return null
+    }
+    return null;
+  }
+
+  private async clearRunningAgentsList(): Promise<void> {
+    const poolMetaId = '__pool_meta__';
+    try {
+      await this.deps.store.delete(poolMetaId);
+    } catch {
+      // Ignore errors
+    }
+  }
+
+  private sleep(ms: number): Promise<void> {
+    return new Promise((resolve) => setTimeout(resolve, ms));
+  }
 }
diff --git a/src/index.ts b/src/index.ts
index 0ae3132..7e1bb3c 100644
--- a/src/index.ts
+++ b/src/index.ts
@@ -8,7 +8,7 @@ export {
   SubscribeOptions,
   SendOptions,
 } from './core/agent';
-export { AgentPool } from './core/pool';
+export { AgentPool, GracefulShutdownOptions, ShutdownResult } from './core/pool';
 export { Room } from './core/room';
 export { Scheduler, AgentSchedulerHandle } from './core/scheduler';
 export { EventBus } from './core/events';
diff --git a/tests/unit/core/pool-shutdown.test.ts b/tests/unit/core/pool-shutdown.test.ts
new file mode 100644
index 0000000..fd04c66
--- /dev/null
+++ b/tests/unit/core/pool-shutdown.test.ts
@@ -0,0 +1,163 @@
+/**
+ * Tests for AgentPool graceful shutdown functionality
+ */
+import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
+import { AgentPool, GracefulShutdownOptions,
ShutdownResult } from '../../../src/core/pool'; +import { Agent } from '../../../src/core/agent'; +import { JSONStore } from '../../../src/infra/store/json-store'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; + +describe('AgentPool Graceful Shutdown', () => { + let pool: AgentPool; + let store: JSONStore; + let testDir: string; + + beforeEach(async () => { + testDir = path.join(os.tmpdir(), `kode-pool-test-${Date.now()}`); + fs.mkdirSync(testDir, { recursive: true }); + store = new JSONStore(testDir); + + // Mock dependencies + const mockProvider = { + chat: vi.fn().mockResolvedValue({ + role: 'assistant', + content: [{ type: 'text', text: 'Hello!' }], + }), + stream: vi.fn(), + }; + + pool = new AgentPool({ + dependencies: { + store, + modelProvider: mockProvider as any, + sandbox: { run: vi.fn() } as any, + }, + maxAgents: 10, + }); + }); + + afterEach(async () => { + try { + fs.rmSync(testDir, { recursive: true, force: true }); + } catch { + // Ignore cleanup errors + } + }); + + describe('gracefulShutdown', () => { + it('should return empty result when pool is empty', async () => { + const result = await pool.gracefulShutdown(); + + expect(result.completed).toEqual([]); + expect(result.interrupted).toEqual([]); + expect(result.failed).toEqual([]); + expect(result.durationMs).toBeGreaterThanOrEqual(0); + }); + + it('should save running agents list when saveRunningList is true', async () => { + // Create and add an agent to the pool + const mockAgent = { + status: vi.fn().mockResolvedValue({ state: 'READY' }), + interrupt: vi.fn(), + } as unknown as Agent; + + (pool as any).agents.set('test-agent-1', mockAgent); + + const result = await pool.gracefulShutdown({ saveRunningList: true }); + + expect(result.completed).toContain('test-agent-1'); + + // Verify running agents list was saved + const savedInfo = await store.loadInfo('__pool_meta__'); + expect(savedInfo).toBeDefined(); + expect((savedInfo as 
any).runningAgents.agentIds).toContain('test-agent-1'); + }); + + it('should not save running agents list when saveRunningList is false', async () => { + const mockAgent = { + status: vi.fn().mockResolvedValue({ state: 'READY' }), + interrupt: vi.fn(), + } as unknown as Agent; + + (pool as any).agents.set('test-agent-2', mockAgent); + + await pool.gracefulShutdown({ saveRunningList: false }); + + // Verify running agents list was NOT saved + const savedInfo = await store.loadInfo('__pool_meta__'); + expect(savedInfo).toBeUndefined(); + }); + + it('should interrupt working agents after timeout', async () => { + const interruptMock = vi.fn().mockResolvedValue(undefined); + const mockAgent = { + status: vi.fn().mockResolvedValue({ state: 'WORKING' }), + interrupt: interruptMock, + } as unknown as Agent; + + (pool as any).agents.set('working-agent', mockAgent); + + const result = await pool.gracefulShutdown({ + timeout: 100, // Very short timeout + forceInterrupt: true, + }); + + expect(interruptMock).toHaveBeenCalledWith({ note: 'Graceful shutdown timeout' }); + expect(result.interrupted).toContain('working-agent'); + }); + }); + + describe('resumeFromShutdown', () => { + it('should return empty array when no running agents list exists', async () => { + const configFactory = (agentId: string) => ({ + agentId, + template: { systemPrompt: 'test' }, + }); + + const resumed = await pool.resumeFromShutdown(configFactory); + + expect(resumed).toEqual([]); + }); + + it('should clear running agents list after resume', async () => { + // Manually save a running agents list + await store.saveInfo('__pool_meta__', { + agentId: '__pool_meta__', + templateId: '__pool_meta__', + createdAt: new Date().toISOString(), + runningAgents: { + agentIds: ['non-existent-agent'], + shutdownAt: new Date().toISOString(), + version: '1.0.0', + }, + } as any); + + const configFactory = (agentId: string) => ({ + agentId, + template: { systemPrompt: 'test' }, + }); + + // Resume will fail for 
non-existent agent, but should still clear the list + await pool.resumeFromShutdown(configFactory); + + // Verify the list was cleared + const savedInfo = await store.loadInfo('__pool_meta__'); + expect(savedInfo).toBeUndefined(); + }); + }); + + describe('registerShutdownHandlers', () => { + it('should register SIGTERM and SIGINT handlers', () => { + const onSpy = vi.spyOn(process, 'on'); + + pool.registerShutdownHandlers(); + + expect(onSpy).toHaveBeenCalledWith('SIGTERM', expect.any(Function)); + expect(onSpy).toHaveBeenCalledWith('SIGINT', expect.any(Function)); + + onSpy.mockRestore(); + }); + }); +});
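For reviewers: the wait-then-interrupt behaviour exercised by the timeout test above boils down to the polling loop inside `waitForAgentReady`. Here is a minimal standalone sketch of that loop, outside the patch. `AgentLike` is a hypothetical stand-in for the SDK's `Agent` interface, and the poll interval is shortened from the production 100 ms so the demo runs quickly.

```typescript
// Hypothetical minimal view of the Agent surface the loop needs.
interface AgentLike {
  status(): Promise<{ state: string }>;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll the agent until it leaves WORKING, or give up after `timeout` ms.
async function waitForAgentReady(
  agent: AgentLike,
  timeout: number,
  pollInterval = 10
): Promise<boolean> {
  const start = Date.now();
  while (Date.now() - start < timeout) {
    const { state } = await agent.status();
    if (state !== 'WORKING') return true; // agent settled; safe to shut down
    await sleep(pollInterval);
  }
  return false; // still WORKING at the deadline; caller may interrupt
}

// Demo: one agent that finishes after three polls, one that never does.
async function demo(): Promise<[boolean, boolean]> {
  let calls = 0;
  const finishing: AgentLike = {
    status: async () => ({ state: ++calls >= 3 ? 'READY' : 'WORKING' }),
  };
  const stuck: AgentLike = { status: async () => ({ state: 'WORKING' }) };
  return [await waitForAgentReady(finishing, 1000), await waitForAgentReady(stuck, 50)];
}

demo().then(([ok, timedOut]) => console.log(ok, timedOut)); // prints "true false"
```

Polling keeps the helper decoupled from the event bus; a subscription-based wait would react faster but would tie shutdown to event delivery, which is exactly what may be winding down at that point.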