A production-ready LLM orchestration service built with NestJS that provides intelligent multi-agent reasoning, tool execution, RAG capabilities, and conversation management. The system features adaptive planning that automatically decides whether to use complex multi-step orchestration or simple direct responses based on query complexity.
You will need Docker running a Redis server for caching.
This was a personal project built with Claude Code on a rainy Sunday.
nest-agent-runtime is a sophisticated middleware/backend system for managing AI agent workflows. It acts as an orchestration layer between your application and various LLM providers (OpenAI, Anthropic), providing:
- Intelligent Request Routing: Automatically determines if a request needs complex multi-step planning or can be answered directly
- Multi-Agent Coordination: The Planner Agent breaks complex tasks into steps, executes them sequentially, and synthesizes results
- Tool Execution Framework: Extensible system for integrating custom tools (calculator, web search, database queries, API calls, etc.)
- Contextual Awareness: Automatically assembles context from conversation history, semantic memory, and RAG documents
- Dual RAG Backend: Production-ready document retrieval using Ragie AI with Supabase fallback
- Model Provider Abstraction: Seamlessly switch between OpenAI and Anthropic with automatic fallback on errors
- Complete Observability: Every agent step, tool call, and model interaction is logged and tracked
The system is designed for building AI applications that need more than simple prompt/response patterns—complex workflows, tool integrations, multi-turn reasoning, and contextual document retrieval.
User Request → Orchestration Service
↓
├─→ [1] Context Assembly (history + memory + RAG)
├─→ [2] Adaptive Planning Decision
│ ├─→ Simple Query → Default Agent → Model Router → Direct Response
│ └─→ Complex Query → Planner Agent → Execution Engine
│ ├─→ Generate multi-step plan
│ ├─→ Execute each step (tools + models)
│ └─→ Synthesize final response
└─→ [3] Persistence (save messages, steps, tool calls)
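The flow above can be sketched end-to-end as a single async pipeline. This is illustrative only — every function name here (`assembleContext`, `persistTurn`, etc.) is a hypothetical stand-in for the real services, not their actual APIs:

```typescript
// Illustrative pipeline mirroring the diagram; all names are hypothetical.
type Turn = { content: string; steps: string[] };

async function assembleContext(input: string): Promise<string> {
  // [1] history + semantic memory + RAG would be gathered here
  return `context for: ${input}`;
}

function needsPlanning(input: string): boolean {
  // crude stand-in for the real adaptive-planning decision
  return /then|first|step/i.test(input);
}

async function runDefaultAgent(input: string, _ctx: string): Promise<Turn> {
  return { content: `direct answer to "${input}"`, steps: [] };
}

async function runPlannerAgent(input: string, _ctx: string): Promise<Turn> {
  return {
    content: `synthesized answer to "${input}"`,
    steps: ["plan", "execute", "synthesize"],
  };
}

async function persistTurn(_turn: Turn): Promise<void> {
  // [3] messages, agent_steps and tool_calls would be saved here
}

async function orchestrate(input: string): Promise<Turn> {
  const ctx = await assembleContext(input);   // [1] context assembly
  const turn = needsPlanning(input)           // [2] adaptive planning decision
    ? await runPlannerAgent(input, ctx)
    : await runDefaultAgent(input, ctx);
  await persistTurn(turn);                    // [3] persistence
  return turn;
}
```

The point of the sketch is the branch in the middle: most requests take the cheap `runDefaultAgent` path, and only multi-step requests pay the planning cost.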
- Adaptive Planning: Most queries go directly to DefaultAgent for fast responses; complex orchestration requests trigger PlannerAgent
- Module-Based Separation: Each domain (agents, tools, models, context) is isolated in its own NestJS module
- Strategy Pattern: ModelRouter and AgentExecutor dynamically select implementations
- Database-First: All interactions are persisted in Supabase for analytics, debugging, and RLS security
- Dual RAG Backend: Ragie AI for production with automatic fallback to Supabase + pgvector
nest-agent-runtime/
├── src/
│ ├── main.ts # Application bootstrap
│ ├── app.module.ts # Root module
│ │
│ ├── config/ # Environment configuration
│ │ ├── configuration.ts # Config mapping
│ │ └── env.validation.ts # Validation schemas
│ │
│ ├── database/ # Database layer
│ │ ├── supabase.service.ts # Supabase client
│ │ ├── cache.service.ts # Redis caching
│ │ ├── schemas/ # TypeScript entity definitions
│ │ └── migrations/ # SQL schemas with pgvector
│ │
│ ├── security/ # Authentication & security
│ │ ├── auth.guard.ts # JWT authentication
│ │ ├── optional-auth.guard.ts # Optional auth
│ │ ├── jwt.strategy.ts # Passport JWT strategy
│ │ └── pii-filter.service.ts # PII detection
│ │
│ ├── conversations/ # Conversation management
│ │ ├── conversations.service.ts # CRUD operations
│ │ ├── messages.service.ts # Message handling
│ │ └── conversations.controller.ts # Analytics endpoint
│ │
│ ├── context/ # Context building
│ │ ├── context.service.ts # Assembles context from all sources
│ │ ├── memory.service.ts # Semantic memory with vectors
│ │ └── rag.service.ts # RAG with dual backend
│ │
│ ├── models/ # LLM providers
│ │ ├── model-router.service.ts # Provider selection & fallback
│ │ ├── openai.provider.ts # OpenAI integration
│ │ ├── anthropic.provider.ts # Anthropic integration
│ │ └── base.provider.ts # Abstract provider interface
│ │
│ ├── tools/ # Tool execution system
│ │ ├── tool-registry.service.ts # Tool registration
│ │ ├── tool-executor.service.ts # Execution & tracking
│ │ ├── base-tool.ts # Abstract tool class
│ │ └── tools/ # Individual tool implementations
│ │ ├── calculator.tool.ts
│ │ └── web-search.tool.ts
│ │
│ ├── agents/ # Agent system
│ │ ├── agent-executor.service.ts # Agent registry & execution
│ │ ├── base-agent.ts # Abstract agent class
│ │ └── agents/ # Individual agent implementations
│ │ ├── planner-agent.service.ts # Multi-step planning
│ │ └── default-agent.service.ts # Direct responses
│ │
│ ├── mcp/ # Model Context Protocol
│ │ └── mcp-client.service.ts # MCP integration (placeholder)
│ │
│ ├── streaming/ # SSE streaming
│ │ └── stream-manager.service.ts # Stream management
│ │
│ └── orchestration/ # Core orchestration
│ ├── orchestration.controller.ts # Main API endpoint
│ ├── orchestration.service.ts # Orchestration logic
│ ├── execution-engine.service.ts # Plan execution
│ └── dto/ # Request/response types
│
├── web-ui/ # Next.js frontend
│ ├── app/ # Next.js 16 app router
│ ├── components/ # React components
│ │ ├── chat-interface.tsx # Main chat UI
│ │ ├── chat-message.tsx # Message display
│ │ ├── chat-input.tsx # User input
│ │ └── options-panel.tsx # Configuration sidebar
│ └── lib/
│ └── api-client.ts # Type-safe backend client
│
├── .env.example # Environment template
├── package.json # Backend dependencies
└── tsconfig.json # TypeScript config
Adaptive Planning System:
- The system intelligently decides whether to use simple responses or complex orchestration
- Default Agent: Handles straightforward queries directly (e.g., "What is TypeScript?")
- Planner Agent: Breaks complex tasks into steps (e.g., "Research X, calculate Y, then summarize")
Execution Flow:
// Simple query → DefaultAgent
"What is 5 + 5?" → DefaultAgent.execute()
→ ModelRouter.chat("What is 5 + 5?")
→ Direct response: "10"
// Complex query → PlannerAgent
"Calculate 5 * 10, then add 25" → PlannerAgent.execute()
→ ModelRouter.chat() → Generates plan:
Step 1: Use calculator tool for 5 * 10
Step 2: Use calculator tool for result + 25
→ ExecutionEngine.executePlan()
→ Executes each step sequentially
→ Tracks all tool calls in database
→ Final response: "75"
Agent Registry:
- Agents are dynamically registered in AgentExecutorService
- Easy to add custom agents by extending BaseAgent
- Each agent execution is tracked in the agent_steps table
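A minimal sketch of what such a registry might look like (illustrative only — the real AgentExecutorService also persists each execution to agent_steps):

```typescript
// Hypothetical registry sketch; names are illustrative, not the real API.
interface Agent {
  name: string;
  execute(input: string): Promise<string>;
}

class AgentRegistry {
  private agents = new Map<string, Agent>();

  register(agent: Agent): void {
    this.agents.set(agent.name, agent);
  }

  get(name: string): Agent {
    const agent = this.agents.get(name);
    if (!agent) throw new Error(`Unknown agent: ${name}`);
    return agent;
  }

  list(): string[] {
    return [...this.agents.keys()];
  }
}
```

Because lookup is by name, restricting a request to specific agents (the `allowedAgents` option) reduces to filtering this map's keys.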
Built-in Tools:
- Calculator (calculator): Arithmetic operations (add, subtract, multiply, divide)
- Web Search (web_search): Web search capability (placeholder for API integration)
Tool Architecture:
// Every tool extends BaseTool
abstract class BaseTool {
abstract name: string;
abstract description: string;
abstract schema: object; // JSON schema for input validation
abstract execute(input: ToolInput): Promise<ToolOutput>;
}
// Tool execution is tracked with:
- Input parameters
- Output results
- Execution time
- Confidence score
- Status (pending/running/completed/failed)
Tool Call Tracking:
- Every tool execution is logged to the tool_calls table
- Includes input, output, timestamps, and metadata
- Enables debugging and analytics
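A hedged sketch of what tracked execution could look like — the record shape mirrors the fields listed above, but the actual tool_calls insert lives in ToolExecutorService and may differ:

```typescript
// Hypothetical wrapper showing the kind of record written to tool_calls.
interface ToolCallRecord {
  tool: string;
  input: unknown;
  output?: unknown;
  status: "pending" | "running" | "completed" | "failed";
  durationMs?: number;
  error?: string;
}

async function executeTracked(
  tool: { name: string; execute(input: unknown): Promise<unknown> },
  input: unknown,
): Promise<ToolCallRecord> {
  const record: ToolCallRecord = { tool: tool.name, input, status: "running" };
  const start = Date.now();
  try {
    record.output = await tool.execute(input);
    record.status = "completed";
  } catch (err) {
    record.status = "failed";
    record.error = err instanceof Error ? err.message : String(err);
  }
  record.durationMs = Date.now() - start;
  // In the real service the record would be inserted into tool_calls here.
  return record;
}
```

Note that a failing tool still produces a complete record — that is what makes the audit trail useful for debugging.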
Context Assembly Process:
ContextService.buildContext(conversationId, options) {
1. Load conversation history (configurable window: 1-50 messages)
2. Search semantic memory (if enabled)
- Embeds query → searches conversation_memory table
3. Search RAG documents (if enabled)
- Uses Ragie AI or Supabase + pgvector
4. Combine all sources into formatted context string
}
Context Components:
- Conversation History: Recent messages with role labels (user/assistant/system)
- Semantic Memory: Vector-based search within conversation for relevant past exchanges
- RAG Documents: External knowledge from uploaded documents
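As an illustration, combining the three sources into one context string might look like this (the labels and separator are assumptions, not the service's exact output format):

```typescript
// Illustrative assembly of the three context sources.
interface ContextSources {
  history?: { role: string; content: string }[];
  memory?: string[];
  ragDocs?: string[];
}

function buildContextString(sources: ContextSources): string {
  const parts: string[] = [];
  if (sources.history?.length) {
    parts.push(
      "Conversation History:\n" +
        sources.history.map((m) => `${m.role}: ${m.content}`).join("\n"),
    );
  }
  if (sources.memory?.length) {
    parts.push("Relevant Memory:\n" + sources.memory.join("\n"));
  }
  if (sources.ragDocs?.length) {
    parts.push("Reference Documents:\n" + sources.ragDocs.join("\n"));
  }
  return parts.join("\n\n---\n\n");
}
```

Sources that are disabled or empty simply contribute nothing, so the prompt stays small for simple requests.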
Configuration Options:
context: {
historyWindow: 10, // Number of recent messages (1-50)
includeHistory: true, // Include conversation history
useMemory: false, // Enable semantic memory search
useRag: false, // Enable RAG document search
ragQuery?: string // Custom RAG query (defaults to user input)
}
Dual Backend Architecture:
| Aspect | Ragie AI (Primary) | Supabase + pgvector (Fallback) |
|---|---|---|
| Setup | API key only | Requires pgvector extension + OpenAI embeddings |
| Chunking | Automatic intelligent chunking | Manual text splitting |
| Embeddings | Automatic with optimized models | OpenAI text-embedding-3-small |
| Reranking | Built-in semantic reranking | Not available |
| Multi-format | PDF, DOCX, TXT, etc. | Text only |
| Performance | Optimized for speed | Good performance |
| Cost | Pay per API call | Database storage + OpenAI embedding costs |
Intelligent Fallback Flow:
1. Check if RAGIE_API_KEY is configured
├─→ YES: Use Ragie AI
│ ├─→ Success: Return results
│ └─→ Failure/Timeout: Fall back to Supabase
└─→ NO: Use Supabase + pgvector
├─→ Generate embeddings with OpenAI
└─→ Search rag_documents table
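The fallback decision above can be sketched as a single function (`searchRagie` and `searchSupabase` are hypothetical stand-ins, injected here purely for illustration):

```typescript
// Sketch of the fallback flow; backend functions are injected stand-ins.
interface RagResult {
  text: string;
  source: "ragie" | "supabase";
}

async function searchWithFallback(
  query: string,
  ragieApiKey: string | undefined,
  searchRagie: (q: string) => Promise<RagResult[]>,
  searchSupabase: (q: string) => Promise<RagResult[]>,
): Promise<RagResult[]> {
  if (!ragieApiKey) {
    // No key configured: go straight to pgvector search
    return searchSupabase(query);
  }
  try {
    return await searchRagie(query);
  } catch {
    // Ragie failure or timeout: degrade gracefully to Supabase
    return searchSupabase(query);
  }
}
```

The design choice worth noting: the fallback is per-request, so a transient Ragie outage degrades quality rather than availability.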
Usage:
// Upload documents
await ragService.uploadFile(userId, fileBuffer, {
category: 'documentation',
version: '1.0'
});
// Search documents
const results = await ragService.searchByText(
userId,
'authentication flow',
10, // limit
);
// In orchestration requests
{
"input": { "type": "text", "content": "How do I reset my password?" },
"options": {
"context": {
"useRag": true,
"ragQuery": "password reset procedure"
}
}
}
Supported Providers:
- OpenAI: gpt-4-turbo-preview, gpt-4, gpt-3.5-turbo
- Anthropic: claude-3-opus-20240229, claude-3-sonnet-20240229, claude-3-haiku-20240307
Model Selection Strategy:
ModelRouter.chat() {
1. Determine primary model:
- If preferredModel specified → Use it
- If preferredProvider specified → Use provider's default
- Else → Use default (gpt-4-turbo-preview)
2. Attempt primary model call
├─→ Success: Return response
└─→ Failure: Try fallback models in order
3. Track usage:
- Token counts (input/output)
- Estimated cost
- Model used
- Execution time
}
Cost Estimation:
- Rough cost calculation per request based on token usage
- Logged in metadata for budget tracking
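A minimal sketch of such an estimate. Note the per-1K-token prices below are placeholders for illustration, not actual provider pricing:

```typescript
// Prices per 1K tokens are illustrative placeholders, NOT real pricing.
const PRICES_PER_1K: Record<string, { input: number; output: number }> = {
  "gpt-4-turbo-preview": { input: 0.01, output: 0.03 },
  "gpt-3.5-turbo": { input: 0.0005, output: 0.0015 },
};

function estimateCost(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const price = PRICES_PER_1K[model];
  if (!price) return 0; // unknown model: skip estimation
  return (
    (inputTokens / 1000) * price.input + (outputTokens / 1000) * price.output
  );
}
```

Since token counts come back with every provider response, the estimate can be computed and logged per request with no extra API calls.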
Provider Abstraction:
abstract class BaseProvider {
abstract chat(messages: Message[], options: ChatOptions): Promise<ChatResponse>;
abstract embedText(text: string): Promise<number[]>;
}
// Easy to add new providers by implementing BaseProvider
Conversation Lifecycle:
1. Create/Resume Conversation
- ConversationsService.create() or .findById()
- Stores metadata: userId, title, createdAt, updatedAt
2. Store Messages
- MessagesService.create()
- Tracks: role, content, contentType, metadata, parentMessageId
3. Build Context
- ContextService loads history from messages table
- Applies history window configuration
4. Generate Response
- Orchestration executes agent/tool workflows
5. Persist Response
- Save assistant message with full metadata
- Link to parent user message
Message Types:
- User Messages: Human input
- Assistant Messages: AI-generated responses
- System Messages: Context, instructions, errors
Analytics:
GET /api/v1/conversations/:id/analytics
// Returns:
- Total messages
- Average response time
- Tool usage statistics
- Token consumption
- Agent activity breakdown
Authentication Flow:
1. User authenticates with Supabase
→ Receives JWT token
2. Request includes: Authorization: Bearer <token>
3. AuthGuard validates JWT
- Extracts userId from token
- Attaches to request.user
4. Row Level Security (RLS) enforces access
- Users can only access their own data
- Policies defined in migrations
Security Features:
- JWT Authentication: Supabase-issued tokens
- Row Level Security: Database-level isolation
- PII Filtering: Optional PII detection and removal (configurable)
- Rate Limiting: Redis-based request throttling (infrastructure ready)
- Audit Logging: Complete execution trail in agent_steps and tool_calls
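The PII filter's implementation isn't shown here; a minimal regex-based redaction sketch might look like the following (the patterns are illustrative and far from exhaustive — real PII detection needs much more):

```typescript
// Hypothetical regex-based redaction sketch; patterns are illustrative only.
const PII_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "[EMAIL]", pattern: /[\w.+-]+@[\w-]+\.[\w.]+/g },
  { label: "[PHONE]", pattern: /\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b/g },
  { label: "[SSN]", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
];

function redactPii(text: string): string {
  // Replace each match with its label so responses stay readable.
  return PII_PATTERNS.reduce(
    (acc, { label, pattern }) => acc.replace(pattern, label),
    text,
  );
}
```

Applied to the model's response before it is returned (when `enablePiiFiltering` is on), this keeps sensitive values out of both the API response and the persisted message.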
RLS Policies (from migration):
-- Users can only select their own conversations
CREATE POLICY "Users can view own conversations"
ON conversations FOR SELECT
USING (auth.uid() = user_id);
-- Similar policies for messages, agent_steps, tool_calls, etc.
Technology Stack:
- Next.js 16 (App Router)
- React 19
- TailwindCSS 4
- React Markdown with GitHub Flavored Markdown
Features:
- Real-time chat interface with markdown rendering
- Configuration sidebar for all orchestration options
- Dark mode support (automatic)
- Type-safe API client
- Conversation history display
- Loading states and error handling
Running the Web UI:
cd web-ui
pnpm install
pnpm dev
# Open http://localhost:3001
API Integration:
- Web UI calls backend at http://localhost:3000/api
- Configurable via environment variables
- Type-safe client in web-ui/lib/api-client.ts
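A hedged sketch of what a typed client call might look like (the actual client in web-ui/lib/api-client.ts may differ; `fetchFn` is injected here so the sketch does not depend on a global fetch):

```typescript
// Hypothetical typed client; the real web-ui/lib/api-client.ts may differ.
interface OrchestrateRequest {
  conversationId?: string;
  input: { type: "text"; content: string };
  options?: { tools?: { allowedTools?: string[] } };
}

type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const API_BASE = "http://localhost:3000/api"; // env-configurable in the real client

function buildOrchestrateRequest(
  content: string,
  allowedTools?: string[],
): OrchestrateRequest {
  const req: OrchestrateRequest = { input: { type: "text", content } };
  if (allowedTools?.length) req.options = { tools: { allowedTools } };
  return req;
}

async function callOrchestrate(
  content: string,
  fetchFn: FetchLike,
  allowedTools?: string[],
) {
  const res = await fetchFn(`${API_BASE}/v1/orchestrate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildOrchestrateRequest(content, allowedTools)),
  });
  if (!res.ok) throw new Error(`Orchestrate failed: ${res.status}`);
  return res.json();
}
```

Keeping the request-shape logic in a pure builder function makes the client easy to test without a running backend.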
- Node.js 18+ (20+ recommended)
- pnpm (install via npm install -g pnpm)
- Supabase account (supabase.com)
- OpenAI API key (platform.openai.com)
- Anthropic API key (optional) (console.anthropic.com)
- Ragie API key (optional, for RAG) (ragie.ai)
- Redis (optional, for caching)
cd nest-agent-runtime
pnpm install
# Copy example environment file
cp .env.example .env
# Edit .env with your credentials
Minimum required variables:
# Server
PORT=3000
NODE_ENV=development
# Supabase (required)
SUPABASE_URL=https://xxxxx.supabase.co
SUPABASE_ANON_KEY=your-anon-key
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key
# OpenAI (required for embeddings and model)
OPENAI_API_KEY=sk-xxx
# Anthropic (optional)
ANTHROPIC_API_KEY=sk-ant-xxx
# Security
JWT_SECRET=your-random-secret
Optional variables:
# Ragie AI (for production RAG)
RAGIE_API_KEY=your-ragie-key
# Redis (for caching)
REDIS_HOST=localhost
REDIS_PORT=6379
# Features
ENABLE_PII_FILTERING=false
ENABLE_AUDIT_LOGGING=true
MAX_RECURSION_DEPTH=10
MAX_TOOL_CALLS_PER_REQUEST=50
A. Create a Supabase project:
- Go to supabase.com
- Create new project
- Copy URL and keys to .env
B. Enable pgvector extension:
-- In Supabase SQL Editor
CREATE EXTENSION IF NOT EXISTS vector;
C. Run migrations:
- Open src/database/migrations/001_initial_schema.sql
- Copy entire contents
- Paste into Supabase SQL Editor
- Execute
This creates:
- conversations table
- messages table
- agent_steps table
- tool_calls table
- conversation_memory table (with vector embeddings)
- rag_documents table (with vector embeddings)
- All RLS policies
- Necessary indexes
# Development mode (with hot reload)
pnpm run start:dev
# Backend runs on: http://localhost:3000/api

# In a new terminal
cd web-ui
pnpm install
pnpm dev
# Web UI runs on: http://localhost:3001
Using Web UI:
- Navigate to http://localhost:3001
- Start chatting!
Using cURL:
curl -X POST http://localhost:3000/api/v1/orchestrate \
-H "Content-Type: application/json" \
-d '{
"input": {
"type": "text",
"content": "What is 25 times 16?"
},
"options": {
"tools": {
"allowedTools": ["calculator"]
}
}
}'
Main endpoint for orchestrating LLM requests with agents and tools.
Authentication: Optional (uses OptionalAuthGuard)
Authorization: Bearer <supabase-jwt-token>
Request Body:
{
conversationId?: string; // Resume existing conversation
input: {
type: 'text'; // Currently only 'text' supported
content: string; // User message
};
options?: {
execution?: {
maxSteps?: number; // Max planning steps (default: 10)
parallelToolExecution?: boolean; // Execute tools in parallel (default: false)
};
agents?: {
allowedAgents?: string[]; // Restrict to specific agents
};
tools?: {
allowedTools?: string[]; // Restrict to specific tools
maxToolCalls?: number; // Max tool calls per request (default: 50)
};
models?: {
preferredModel?: string; // e.g., "gpt-4-turbo-preview"
fallbackModels?: string[]; // Fallback model priority list
preferredProvider?: string; // "openai" or "anthropic"
};
context?: {
historyWindow?: number; // Number of messages to include (1-50)
includeHistory?: boolean; // Include conversation history
useMemory?: boolean; // Enable semantic memory search
useRag?: boolean; // Enable RAG document search
ragQuery?: string; // Custom RAG query
};
response?: {
includeIntermediateSteps?: boolean; // Include agent steps in response
};
security?: {
enablePiiFiltering?: boolean; // Filter PII from response
};
};
}
Response:
{
conversationId: string;
message: {
id: string;
role: 'assistant';
content: string;
contentType: 'text';
createdAt: string;
metadata?: {
agentUsed?: string;
toolsCalled?: string[];
intermediateSteps?: Array<{
step: number;
action: string;
observation: string;
}>;
};
};
metadata: {
modelUsed: string;
tokensUsed: {
input: number;
output: number;
};
estimatedCost?: number;
agentActivity: Array<{
agent: string;
steps: number;
toolCalls: number;
}>;
executionTime: number; // milliseconds
};
}
Example Requests:
// 1. Simple Calculator Request
{
"input": {
"type": "text",
"content": "What is 125 divided by 5?"
},
"options": {
"tools": {
"allowedTools": ["calculator"]
}
}
}
// 2. Multi-Step Planning
{
"input": {
"type": "text",
"content": "Calculate 50 * 2, then subtract 25 from the result"
},
"options": {
"execution": {
"maxSteps": 5
},
"tools": {
"allowedTools": ["calculator"]
},
"response": {
"includeIntermediateSteps": true
}
}
}
// 3. Context-Aware Conversation
{
"conversationId": "existing-uuid",
"input": {
"type": "text",
"content": "What did we discuss earlier?"
},
"options": {
"context": {
"historyWindow": 20,
"includeHistory": true,
"useMemory": true
}
}
}
// 4. RAG-Enhanced Query
{
"input": {
"type": "text",
"content": "What is our company's refund policy?"
},
"options": {
"context": {
"useRag": true,
"ragQuery": "refund policy procedures"
}
}
}
// 5. Model Selection with Fallback
{
"input": {
"type": "text",
"content": "Explain quantum computing"
},
"options": {
"models": {
"preferredModel": "gpt-4-turbo-preview",
"fallbackModels": [
"claude-3-opus-20240229",
"gpt-3.5-turbo"
]
}
}
}
Get detailed analytics for a specific conversation.
Authentication: Required
Response:
{
conversationId: string;
totalMessages: number;
averageResponseTime: number;
toolUsage: {
[toolName: string]: number;
};
totalTokensUsed: number;
estimatedTotalCost: number;
agentActivity: {
[agentName: string]: {
invocations: number;
averageSteps: number;
};
};
}
Backend:
pnpm run start:dev # Start with hot reload (recommended)
pnpm run start:debug # Start with debugging enabled
pnpm run start # Standard start
pnpm run start:prod # Production mode (requires build)
pnpm run build # Compile TypeScript to dist/
pnpm run format # Format code with Prettier
pnpm run lint # Lint and fix with ESLint
pnpm run test # Run unit tests
pnpm run test:watch # Run tests in watch mode
pnpm run test:cov # Generate coverage report
pnpm run test:e2e # Run end-to-end tests
pnpm run test:debug # Debug tests
Web UI:
cd web-ui
pnpm dev # Start dev server (port 3001)
pnpm build # Build for production
pnpm start # Start production server
pnpm lint # Lint code
Unit Tests:
pnpm run test
E2E Tests:
pnpm run test:e2e
Coverage:
pnpm run test:cov
# Opens coverage report in browser
This section provides detailed guidance on modifying and extending different features of the codebase.
Tools extend the agent's capabilities with external actions (API calls, calculations, database queries, etc.).
Step 1: Create Tool Class
Create a new file in src/tools/tools/:
// src/tools/tools/weather.tool.ts
import { Injectable, Logger } from '@nestjs/common';
import { BaseTool, ToolInput, ToolOutput } from '../base-tool';
@Injectable()
export class WeatherTool extends BaseTool {
private readonly logger = new Logger(WeatherTool.name);
// Unique identifier (used in API requests)
readonly name = 'weather';
// Description for LLM to understand when to use this tool
readonly description =
'Get current weather information for a specified location. ' +
'Input should be a city name or coordinates.';
// JSON schema defining expected input format
readonly schema = {
type: 'object',
properties: {
location: {
type: 'string',
description: 'City name or coordinates (lat,lon)',
},
units: {
type: 'string',
enum: ['celsius', 'fahrenheit'],
description: 'Temperature units',
default: 'celsius',
},
},
required: ['location'],
};
// Main execution method
async execute(input: ToolInput): Promise<ToolOutput> {
try {
const { location, units = 'celsius' } = input;
this.logger.log(`Fetching weather for ${location}`);
// Your implementation here
const weatherData = await this.fetchWeatherFromAPI(location, units);
return {
success: true,
result: weatherData,
metadata: {
location,
units,
timestamp: new Date().toISOString(),
},
};
} catch (error) {
this.logger.error(`Weather fetch failed: ${error.message}`);
return {
success: false,
error: error.message,
};
}
}
private async fetchWeatherFromAPI(location: string, units: string) {
// Implement API call
// Example: const response = await fetch(`https://api.weather.com/...`);
return {
temperature: 72,
conditions: 'Partly Cloudy',
location,
};
}
}
Step 2: Register Tool
In src/tools/tool-registry.service.ts, add your tool to the constructor:
@Injectable()
export class ToolRegistryService {
constructor(
private readonly calculatorTool: CalculatorTool,
private readonly webSearchTool: WebSearchTool,
private readonly weatherTool: WeatherTool, // Add this
) {
this.registerTool(this.calculatorTool);
this.registerTool(this.webSearchTool);
this.registerTool(this.weatherTool); // Add this
}
// ... rest of class
}
Step 3: Add to Module Providers
In src/tools/tools.module.ts:
@Module({
imports: [ModelsModule],
providers: [
ToolRegistryService,
ToolExecutorService,
CalculatorTool,
WebSearchTool,
WeatherTool, // Add this
],
exports: [ToolRegistryService, ToolExecutorService],
})
export class ToolsModule {}
Step 4: Use in Requests
{
"input": {
"type": "text",
"content": "What's the weather in San Francisco?"
},
"options": {
"tools": {
"allowedTools": ["weather"]
}
}
}
Agents implement different reasoning strategies (planning, research, code generation, etc.).
Step 1: Create Agent Class
Create a new file in src/agents/agents/:
// src/agents/agents/research-agent.service.ts
import { Injectable, Logger } from '@nestjs/common';
import { BaseAgent, AgentInput, AgentOutput } from '../base-agent';
import { ModelRouterService } from '../../models/model-router.service';
import { ToolExecutorService } from '../../tools/tool-executor.service';
@Injectable()
export class ResearchAgentService extends BaseAgent {
private readonly logger = new Logger(ResearchAgentService.name);
constructor(
private readonly modelRouter: ModelRouterService,
private readonly toolExecutor: ToolExecutorService,
) {
super();
}
// Unique identifier
readonly name = 'research';
// Description for when this agent should be used
readonly description =
'Specialized agent for research tasks that require gathering ' +
'information from multiple sources and synthesizing findings.';
async execute(input: AgentInput): Promise<AgentOutput> {
try {
this.logger.log('Research agent executing');
const { context, userInput, options } = input;
// Step 1: Generate research plan
const planPrompt = this.buildResearchPlanPrompt(userInput, context);
const plan = await this.modelRouter.chat(
[{ role: 'user', content: planPrompt }],
options.models || {},
);
// Step 2: Execute research steps
const findings: string[] = [];
const steps = this.parseResearchSteps(plan.content);
for (const step of steps) {
if (step.requiresTool) {
const result = await this.toolExecutor.executeTool(
step.toolName,
step.toolInput,
);
findings.push(result.result);
} else {
const response = await this.modelRouter.chat(
[{ role: 'user', content: step.query }],
options.models || {},
);
findings.push(response.content);
}
}
// Step 3: Synthesize findings
const synthesisPrompt = this.buildSynthesisPrompt(userInput, findings);
const synthesis = await this.modelRouter.chat(
[{ role: 'user', content: synthesisPrompt }],
options.models || {},
);
return {
content: synthesis.content,
done: true,
metadata: {
researchSteps: steps.length,
toolsUsed: steps.filter(s => s.requiresTool).length,
},
};
} catch (error) {
this.logger.error(`Research agent failed: ${error.message}`);
return {
content: `Research failed: ${error.message}`,
done: true,
error: error.message,
};
}
}
private buildResearchPlanPrompt(query: string, context: string): string {
return `Given this research query: "${query}"
Context: ${context}
Generate a step-by-step research plan. For each step, indicate if it requires a tool call.`;
}
private parseResearchSteps(planContent: string): any[] {
// Parse the plan into structured steps
// Implementation depends on your plan format
return [];
}
private buildSynthesisPrompt(query: string, findings: string[]): string {
return `Synthesize these research findings to answer: "${query}"
Findings:
${findings.map((f, i) => `${i + 1}. ${f}`).join('\n')}`;
}
}
Step 2: Register Agent
In src/agents/agent-executor.service.ts:
@Injectable()
export class AgentExecutorService {
constructor(
private readonly plannerAgent: PlannerAgentService,
private readonly defaultAgent: DefaultAgentService,
private readonly researchAgent: ResearchAgentService, // Add this
) {
this.registerAgent(this.plannerAgent);
this.registerAgent(this.defaultAgent);
this.registerAgent(this.researchAgent); // Add this
}
// ... rest of class
}
Step 3: Add to Module Providers
In src/agents/agents.module.ts:
@Module({
imports: [ModelsModule, ToolsModule, ContextModule],
providers: [
AgentExecutorService,
PlannerAgentService,
DefaultAgentService,
ResearchAgentService, // Add this
],
exports: [AgentExecutorService],
})
export class AgentsModule {}
Step 4: Modify Orchestration Logic
In src/orchestration/orchestration.service.ts, update the agent selection logic:
private selectAgent(input: string, options: OrchestrationOptions): string {
// Add logic to detect research queries
const researchKeywords = ['research', 'investigate', 'analyze', 'study'];
const isResearchQuery = researchKeywords.some(kw =>
input.toLowerCase().includes(kw)
);
if (isResearchQuery) {
return 'research';
}
// Existing logic
if (this.requiresPlanning(input, options)) {
return 'planner';
}
return 'default';
}
Step 5: Use in Requests
{
"input": {
"type": "text",
"content": "Research the latest developments in quantum computing"
},
"options": {
"agents": {
"allowedAgents": ["research"]
},
"tools": {
"allowedTools": ["web_search"]
}
}
}
Add support for additional LLM providers (e.g., Google PaLM, Cohere, etc.).
Step 1: Create Provider Class
Create a new file in src/models/providers/:
// src/models/providers/google.provider.ts
import { Injectable, Logger } from '@nestjs/common';
import { ConfigService } from '@nestjs/config';
import {
BaseProvider,
ChatMessage,
ChatOptions,
ChatResponse,
EmbeddingResponse,
} from '../base.provider';
@Injectable()
export class GoogleProvider extends BaseProvider {
private readonly logger = new Logger(GoogleProvider.name);
private readonly apiKey: string;
constructor(private readonly configService: ConfigService) {
super();
this.apiKey = this.configService.get<string>('GOOGLE_API_KEY');
}
readonly name = 'google';
readonly defaultModel = 'gemini-pro';
async chat(
messages: ChatMessage[],
options: ChatOptions = {},
): Promise<ChatResponse> {
try {
this.logger.log(`Google chat with model: ${options.model || this.defaultModel}`);
// Convert message format
const googleMessages = this.convertMessages(messages);
// Make API call (pseudo-code)
const response = await fetch('https://generativelanguage.googleapis.com/v1/...', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${this.apiKey}`,
},
body: JSON.stringify({
contents: googleMessages,
generationConfig: {
temperature: options.temperature || 0.7,
maxOutputTokens: options.maxTokens || 2048,
},
}),
});
const data = await response.json();
return {
content: data.candidates[0].content.parts[0].text,
model: options.model || this.defaultModel,
usage: {
inputTokens: data.usageMetadata.promptTokenCount,
outputTokens: data.usageMetadata.candidatesTokenCount,
},
finishReason: 'stop',
};
} catch (error) {
this.logger.error(`Google API error: ${error.message}`);
throw error;
}
}
async embedText(text: string): Promise<EmbeddingResponse> {
// Implement embedding using Google's embedding models
try {
const response = await fetch('https://generativelanguage.googleapis.com/v1/...', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${this.apiKey}`,
},
body: JSON.stringify({
model: 'embedding-001',
content: { parts: [{ text }] },
}),
});
const data = await response.json();
return {
embedding: data.embedding.values,
model: 'embedding-001',
};
} catch (error) {
this.logger.error(`Google embedding error: ${error.message}`);
throw error;
}
}
private convertMessages(messages: ChatMessage[]): any[] {
// Convert from internal format to Google's format
return messages.map(msg => ({
role: msg.role === 'assistant' ? 'model' : 'user',
parts: [{ text: msg.content }],
}));
}
}
Step 2: Register Provider
In src/models/model-router.service.ts:
@Injectable()
export class ModelRouterService {
constructor(
private readonly openaiProvider: OpenAIProvider,
private readonly anthropicProvider: AnthropicProvider,
private readonly googleProvider: GoogleProvider, // Add this
) {
this.registerProvider(this.openaiProvider);
this.registerProvider(this.anthropicProvider);
this.registerProvider(this.googleProvider); // Add this
}
// Update model-to-provider mapping
private getProviderForModel(modelName: string): BaseProvider {
if (modelName.startsWith('gpt')) {
return this.openaiProvider;
}
if (modelName.startsWith('claude')) {
return this.anthropicProvider;
}
if (modelName.startsWith('gemini')) {
return this.googleProvider; // Add this
}
return this.openaiProvider; // default
}
}
Step 3: Add to Module
In src/models/models.module.ts:
@Module({
imports: [ConfigModule],
providers: [
ModelRouterService,
OpenAIProvider,
AnthropicProvider,
GoogleProvider, // Add this
],
exports: [ModelRouterService],
})
export class ModelsModule {}
Step 4: Add Environment Variable
In .env:
GOOGLE_API_KEY=your-google-api-key
In src/config/configuration.ts:
export default () => ({
// ... existing config
google: {
apiKey: process.env.GOOGLE_API_KEY,
},
});
Step 5: Use in Requests
{
"input": {
"type": "text",
"content": "Explain machine learning"
},
"options": {
"models": {
"preferredModel": "gemini-pro",
"fallbackModels": ["gpt-4-turbo-preview"]
}
}
}
Customize how context is assembled from different sources.
File: src/context/context.service.ts
Key Method: buildContext(conversationId, options)
Example Modifications:
// Add custom context sources
async buildContext(
conversationId: string,
options: ContextOptions,
): Promise<string> {
const contextParts: string[] = [];
// 1. Existing: Conversation history
if (options.includeHistory) {
const history = await this.loadHistory(conversationId, options.historyWindow);
contextParts.push(this.formatHistory(history));
}
// 2. Existing: Semantic memory
if (options.useMemory) {
const memory = await this.memoryService.searchMemory(conversationId, query);
contextParts.push(this.formatMemory(memory));
}
// 3. Existing: RAG documents
if (options.useRag) {
const ragDocs = await this.ragService.searchByText(userId, ragQuery, 10);
contextParts.push(this.formatRagDocs(ragDocs));
}
// 4. NEW: Add user preferences
const userPrefs = await this.loadUserPreferences(userId);
if (userPrefs) {
contextParts.push(`User Preferences:\n${JSON.stringify(userPrefs, null, 2)}`);
}
// 5. NEW: Add current time/date
contextParts.push(`Current Date/Time: ${new Date().toISOString()}`);
// 6. NEW: Add system context (API availability, etc.)
const systemContext = await this.getSystemContext();
contextParts.push(`System Status:\n${systemContext}`);
return contextParts.join('\n\n---\n\n');
}
private async loadUserPreferences(userId: string): Promise<any> {
// Load from database
const { data } = await this.supabase
.from('user_preferences')
.select('*')
.eq('user_id', userId)
.single();
return data;
}
private async getSystemContext(): Promise<string> {
// Check API availability, current load, etc.
return 'All systems operational';
}
Modify how the system decides between agents and executes workflows.
File: src/orchestration/orchestration.service.ts
Key Methods:
- orchestrate(): Main entry point
- requiresPlanning(): Decides if the planner agent is needed
- selectAgent(): Chooses which agent to use
Example Modifications:
// Customize agent selection logic
private requiresPlanning(input: string, options: OrchestrationOptions): boolean {
  // Existing logic
  const planningKeywords = ['calculate then', 'first...then', 'step by step'];
  const hasMultipleSteps = planningKeywords.some(kw =>
    input.toLowerCase().includes(kw),
  );

  // NEW: Add complexity scoring
  const complexityScore = this.calculateComplexityScore(input);
  const isComplex = complexityScore > 0.7;

  // NEW: Check tool requirements
  const requestedTools = options.tools?.allowedTools || [];
  const requiresMultipleTools = requestedTools.length > 1;

  // NEW: Check if explicitly requesting planning
  const explicitPlanRequest = /plan|strategy|approach/i.test(input);

  return hasMultipleSteps || isComplex || requiresMultipleTools || explicitPlanRequest;
}

private calculateComplexityScore(input: string): number {
  let score = 0;

  // Length-based scoring
  if (input.length > 200) score += 0.2;
  if (input.length > 500) score += 0.3;

  // Question complexity
  const questions = (input.match(/\?/g) || []).length;
  score += questions * 0.1;

  // Conditional words (substring match, so e.g. "if" also matches inside longer words)
  const conditionals = ['if', 'when', 'unless', 'before', 'after'];
  conditionals.forEach(word => {
    if (input.toLowerCase().includes(word)) score += 0.1;
  });

  // Action verbs
  const actionVerbs = ['calculate', 'analyze', 'compare', 'research', 'investigate'];
  const actionCount = actionVerbs.filter(verb =>
    input.toLowerCase().includes(verb),
  ).length;
  score += actionCount * 0.15;

  return Math.min(score, 1.0);
}
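Because the scorer is pure string logic, it can be lifted out and exercised on sample inputs. Below is a standalone copy for illustration; the weights and the 0.7 planning threshold are the ones used above.

```typescript
// Standalone copy of calculateComplexityScore for quick experimentation.
// Mirrors the heuristic above; keyword checks are substring-based.
function calculateComplexityScore(input: string): number {
  let score = 0;
  if (input.length > 200) score += 0.2;
  if (input.length > 500) score += 0.3;
  score += (input.match(/\?/g) || []).length * 0.1;
  for (const word of ['if', 'when', 'unless', 'before', 'after']) {
    if (input.toLowerCase().includes(word)) score += 0.1;
  }
  const actionVerbs = ['calculate', 'analyze', 'compare', 'research', 'investigate'];
  score += actionVerbs.filter((v) => input.toLowerCase().includes(v)).length * 0.15;
  return Math.min(score, 1.0);
}

// A short request with three action verbs scores about 3 * 0.15 = 0.45,
// which stays below the 0.7 planning threshold used above.
calculateComplexityScore('Research the market, analyze the data, compare results.');
```

This makes it easy to tune weights against real traffic before wiring the scorer into `requiresPlanning()`.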
// Add custom workflow execution
async executeCustomWorkflow(input: AgentInput): Promise<AgentOutput> {
  // Implement a custom multi-agent workflow
  // Example: Research Agent → Analysis Agent → Summary Agent
  const researchResult = await this.agentExecutor.executeAgent('research', input);

  const analysisInput = {
    ...input,
    userInput: `Analyze this research: ${researchResult.content}`,
  };
  const analysisResult = await this.agentExecutor.executeAgent('analysis', analysisInput);

  const summaryInput = {
    ...input,
    userInput: `Summarize: ${analysisResult.content}`,
  };
  const summaryResult = await this.agentExecutor.executeAgent('summary', summaryInput);

  return summaryResult;
}

Add new tables or columns to track additional data.
Step 1: Create New Migration
Create src/database/migrations/002_add_user_preferences.sql:
-- Create user preferences table
CREATE TABLE IF NOT EXISTS public.user_preferences (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
user_id UUID NOT NULL REFERENCES auth.users(id) ON DELETE CASCADE,
preferences JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE(user_id)
);
-- Enable RLS
ALTER TABLE public.user_preferences ENABLE ROW LEVEL SECURITY;
-- RLS Policies
CREATE POLICY "Users can view own preferences"
ON public.user_preferences
FOR SELECT
USING (auth.uid() = user_id);
CREATE POLICY "Users can update own preferences"
ON public.user_preferences
FOR UPDATE
USING (auth.uid() = user_id);
CREATE POLICY "Users can insert own preferences"
ON public.user_preferences
FOR INSERT
WITH CHECK (auth.uid() = user_id);
-- Create index
CREATE INDEX idx_user_preferences_user_id ON public.user_preferences(user_id);
-- Auto-update timestamp
CREATE TRIGGER update_user_preferences_updated_at
BEFORE UPDATE ON public.user_preferences
FOR EACH ROW
EXECUTE FUNCTION update_updated_at_column();

Step 2: Create TypeScript Schema
Create src/database/schemas/user-preferences.schema.ts:
export interface UserPreferences {
  id: string;
  user_id: string;
  preferences: {
    theme?: 'light' | 'dark';
    language?: string;
    defaultModel?: string;
    notificationsEnabled?: boolean;
    [key: string]: any;
  };
  created_at: string;
  updated_at: string;
}

Step 3: Create Service
Create src/preferences/preferences.service.ts:
import { Injectable } from '@nestjs/common';
import { SupabaseService } from '../database/supabase.service';
import { UserPreferences } from '../database/schemas/user-preferences.schema';

@Injectable()
export class PreferencesService {
  constructor(private readonly supabase: SupabaseService) {}

  async getUserPreferences(userId: string): Promise<UserPreferences | null> {
    const { data, error } = await this.supabase.client
      .from('user_preferences')
      .select('*')
      .eq('user_id', userId)
      .single();
    if (error) {
      if (error.code === 'PGRST116') return null; // Not found
      throw error;
    }
    return data;
  }

  async updateUserPreferences(
    userId: string,
    preferences: Partial<UserPreferences['preferences']>,
  ): Promise<UserPreferences> {
    const { data, error } = await this.supabase.client
      .from('user_preferences')
      // Upsert on the UNIQUE(user_id) constraint so repeat saves update in place
      .upsert(
        {
          user_id: userId,
          preferences,
          updated_at: new Date().toISOString(),
        },
        { onConflict: 'user_id' },
      )
      .select()
      .single();
    if (error) throw error;
    return data;
  }
}

Step 4: Run Migration
- Copy the contents of 002_add_user_preferences.sql
- Paste into the Supabase SQL Editor
- Execute
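Stored preferences are partial by design (the JSONB column is open-ended), so callers typically merge them over application defaults before use. A hedged sketch: `withDefaults` and the default values are hypothetical helpers, not part of the codebase.

```typescript
// Hypothetical helper: merge a user's stored preferences over app defaults.
// Stored values win; extra keys are preserved (the JSONB column is open-ended).
interface Preferences {
  theme?: 'light' | 'dark';
  language?: string;
  defaultModel?: string;
  notificationsEnabled?: boolean;
  [key: string]: unknown;
}

const DEFAULT_PREFERENCES: Preferences = {
  theme: 'dark',
  language: 'en',
  notificationsEnabled: true,
};

function withDefaults(stored: Preferences | null): Preferences {
  return { ...DEFAULT_PREFERENCES, ...(stored ?? {}) };
}

// A user who only set a theme still gets the remaining defaults:
const prefs = withDefaults({ theme: 'light' });
// → { theme: 'light', language: 'en', notificationsEnabled: true }
```

Keeping this merge in one place avoids scattering fallback values across services.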
Extend the Next.js web interface with new components and functionality.
Example: Add a Settings Panel
Step 1: Create Settings Component
Create web-ui/components/settings-panel.tsx:
'use client';

import { useState } from 'react';

interface SettingsData {
  theme: 'light' | 'dark';
  defaultModel: string;
  historyWindow: number;
}

export function SettingsPanel() {
  const [settings, setSettings] = useState<SettingsData>({
    theme: 'dark',
    defaultModel: 'gpt-4-turbo-preview',
    historyWindow: 10,
  });

  const handleSave = async () => {
    // Save to backend
    const response = await fetch('/api/v1/preferences', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(settings),
    });
    if (response.ok) {
      alert('Settings saved!');
    }
  };

  return (
    <div className="p-4 bg-gray-800 rounded-lg">
      <h2 className="text-xl font-bold mb-4">Settings</h2>
      <div className="space-y-4">
        {/* Theme */}
        <div>
          <label className="block text-sm font-medium mb-2">Theme</label>
          <select
            value={settings.theme}
            onChange={(e) =>
              setSettings({ ...settings, theme: e.target.value as SettingsData['theme'] })
            }
            className="w-full p-2 bg-gray-700 rounded"
          >
            <option value="light">Light</option>
            <option value="dark">Dark</option>
          </select>
        </div>
        {/* Default Model */}
        <div>
          <label className="block text-sm font-medium mb-2">Default Model</label>
          <select
            value={settings.defaultModel}
            onChange={(e) => setSettings({ ...settings, defaultModel: e.target.value })}
            className="w-full p-2 bg-gray-700 rounded"
          >
            <option value="gpt-4-turbo-preview">GPT-4 Turbo</option>
            <option value="gpt-3.5-turbo">GPT-3.5 Turbo</option>
            <option value="claude-3-opus-20240229">Claude 3 Opus</option>
          </select>
        </div>
        {/* History Window */}
        <div>
          <label className="block text-sm font-medium mb-2">
            History Window: {settings.historyWindow}
          </label>
          <input
            type="range"
            min="1"
            max="50"
            value={settings.historyWindow}
            onChange={(e) =>
              setSettings({ ...settings, historyWindow: parseInt(e.target.value, 10) })
            }
            className="w-full"
          />
        </div>
        <button
          onClick={handleSave}
          className="w-full py-2 bg-blue-600 hover:bg-blue-700 rounded"
        >
          Save Settings
        </button>
      </div>
    </div>
  );
}

Step 2: Add to Main Layout
In web-ui/app/page.tsx:
import { ChatInterface } from '@/components/chat-interface';
import { SettingsPanel } from '@/components/settings-panel';

export default function Home() {
  return (
    <div className="flex h-screen">
      <div className="flex-1">
        <ChatInterface />
      </div>
      <div className="w-80 border-l border-gray-700">
        <SettingsPanel />
      </div>
    </div>
  );
}

Enable real-time streaming for agent responses.
Step 1: Update Orchestration Controller
In src/orchestration/orchestration.controller.ts:
import { Sse, MessageEvent } from '@nestjs/common';
import { Observable } from 'rxjs';

// Note: @Sse registers a GET route. If clients must POST a JSON body,
// expose a @Post handler that writes SSE frames to the response instead.
@Sse('orchestrate-stream')
orchestrateStream(
  @Body() dto: OrchestrationRequestDto,
  @Req() req: Request,
): Observable<MessageEvent> {
  return new Observable((subscriber) => {
    // orchestrateWithStreaming is an async generator (see Step 2),
    // so iterate it directly rather than awaiting it as a promise.
    (async () => {
      for await (const chunk of this.orchestrationService.orchestrateWithStreaming(
        dto,
        req['user'],
      )) {
        subscriber.next({ data: JSON.stringify(chunk) });
      }
      subscriber.complete();
    })().catch((error) => subscriber.error(error));
  });
}

Step 2: Update Orchestration Service
async *orchestrateWithStreaming(
  dto: OrchestrationRequestDto,
  user: any,
): AsyncGenerator<any> {
  // Setup
  const conversation = await this.conversationsService.getOrCreate(/*...*/);
  const context = await this.contextService.buildContext(/*...*/);
  yield { type: 'context', data: { contextLength: context.length } };

  // Execute agent with streaming
  const agentName = this.selectAgent(dto.input.content, dto.options);
  yield { type: 'agent', data: { agentName } };

  // Stream agent execution
  const agentStream = this.agentExecutor.executeAgentWithStreaming(/*...*/);
  for await (const chunk of agentStream) {
    yield { type: 'chunk', data: chunk };
  }

  yield { type: 'complete', data: { done: true } };
}

Step 3: Update Web UI
In web-ui/components/chat-interface.tsx:
// NOTE: the browser EventSource API only supports GET and cannot send a JSON
// body. To POST the request and still consume Server-Sent Events, read the
// response body as a stream and parse the "data:" lines manually.
const handleStreamingRequest = async (message: string) => {
  const response = await fetch('http://localhost:3000/api/v1/orchestrate-stream', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ input: { type: 'text', content: message } }),
  });
  if (!response.body) return;

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  let fullResponse = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // SSE events are separated by blank lines; payload lines start with "data:"
    const events = buffer.split('\n\n');
    buffer = events.pop() ?? '';
    for (const event of events) {
      const line = event.split('\n').find((l) => l.startsWith('data:'));
      if (!line) continue;
      const data = JSON.parse(line.slice(5).trim());
      if (data.type === 'chunk') {
        fullResponse += data.data;
        setMessages((prev) => [
          ...prev.slice(0, -1),
          { role: 'assistant', content: fullResponse },
        ]);
      } else if (data.type === 'complete') {
        return;
      }
    }
  }
};

- Set NODE_ENV=production
- Use a strong, random JWT_SECRET (minimum 32 characters)
- Enable ENABLE_AUDIT_LOGGING=true
- Configure Redis for caching and rate limiting
- Set up monitoring (e.g., Sentry, DataDog, New Relic)
- Configure CORS for your frontend domain
- Use HTTPS/TLS for all connections
- Set appropriate rate limits in environment
- Configure log aggregation (e.g., CloudWatch, Papertrail)
- Set up database backups (Supabase has automatic backups)
- Configure health check endpoint
- Use connection pooling for database
- Set up CI/CD pipeline
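Several of the security items above can be enforced at boot rather than left to convention. A minimal sketch (the function and its checks are illustrative, not an existing part of the codebase; the variable names match the environment reference below):

```typescript
// Minimal startup validation for the security-related checklist items.
// Throwing here fails fast instead of booting with a weak configuration.
function validateProductionEnv(env: Record<string, string | undefined>): void {
  const errors: string[] = [];
  if (env.NODE_ENV !== 'production') {
    errors.push('NODE_ENV must be "production"');
  }
  if (!env.JWT_SECRET || env.JWT_SECRET.length < 32) {
    errors.push('JWT_SECRET must be at least 32 characters');
  }
  if (!env.REDIS_HOST) {
    errors.push('REDIS_HOST is required in production');
  }
  if (errors.length > 0) {
    throw new Error(`Invalid production config:\n- ${errors.join('\n- ')}`);
  }
}

// Call early in bootstrap, e.g. validateProductionEnv(process.env);
```

NestJS's ConfigModule also supports schema validation if you prefer a declarative approach.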
# Server
NODE_ENV=production
PORT=3000
# Security
JWT_SECRET=<strong-random-32+-char-secret>
CORS_ORIGINS=https://yourdomain.com
# Database
SUPABASE_URL=https://xxx.supabase.co
SUPABASE_ANON_KEY=<anon-key>
SUPABASE_SERVICE_ROLE_KEY=<service-key>
# Redis (required for production)
REDIS_HOST=your-redis-host
REDIS_PORT=6379
REDIS_PASSWORD=<redis-password>
# Model Providers
OPENAI_API_KEY=sk-xxx
ANTHROPIC_API_KEY=sk-ant-xxx
# RAG
RAGIE_API_KEY=your-ragie-key
# Features
ENABLE_PII_FILTERING=true
ENABLE_AUDIT_LOGGING=true
MAX_RECURSION_DEPTH=10
MAX_TOOL_CALLS_PER_REQUEST=50
# Monitoring
SENTRY_DSN=<your-sentry-dsn>

Create Dockerfile:
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN npm install -g pnpm && pnpm install --frozen-lockfile
COPY . .
RUN pnpm run build
FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
EXPOSE 3000
CMD ["node", "dist/main.js"]

Create docker-compose.yml:
version: '3.8'
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - REDIS_HOST=redis
    env_file:
      - .env
    depends_on:
      - redis
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
volumes:
  redis_data:

1. Enable Caching:
- Configure Redis for conversation caching
- Cache frequently accessed RAG documents
- Cache embedding results
2. Database Optimization:
- Ensure indexes are created (included in migration)
- Use connection pooling
- Regular VACUUM and ANALYZE
3. Rate Limiting:
- Implement per-user rate limits
- Add IP-based rate limiting for unauthenticated requests
4. Model Optimization:
- Use faster models (GPT-3.5, Claude Haiku) for simple queries
- Reserve GPT-4/Opus for complex orchestration
- Implement response caching for common queries
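The response-caching idea in item 4 can be prototyped in-process before moving it to Redis. A sketch: the class and its normalization rule (lowercase, collapsed whitespace) are assumptions, not existing code.

```typescript
// In-process response cache keyed by a normalized query string.
// In production this would live in Redis so all instances share it.
class ResponseCache {
  private store = new Map<string, { value: string; expiresAt: number }>();

  constructor(private ttlMs: number) {}

  // Assumed normalization: lowercase, trimmed, whitespace collapsed.
  private key(query: string): string {
    return query.toLowerCase().trim().replace(/\s+/g, ' ');
  }

  get(query: string): string | undefined {
    const entry = this.store.get(this.key(query));
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(this.key(query));
      return undefined;
    }
    return entry.value;
  }

  set(query: string, value: string): void {
    this.store.set(this.key(query), { value, expiresAt: Date.now() + this.ttlMs });
  }
}

// "What is RAG?" and "what  is rag?" resolve to the same entry:
const cache = new ResponseCache(60_000);
cache.set('What is RAG?', 'Retrieval-augmented generation…');
```

Checking this cache before orchestration short-circuits repeated simple queries entirely, which is often a bigger win than a faster model.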
1. "Cannot find module" errors after installation
# Clear node_modules and reinstall
rm -rf node_modules pnpm-lock.yaml
pnpm install

2. Database connection errors
- Verify SUPABASE_URL and keys in .env
- Check that the Supabase project is active
- Ensure migrations have been run
3. "jwt must be provided" errors
- If using authenticated endpoints, include Authorization: Bearer <token>
- Or use optional-auth endpoints like /api/v1/orchestrate
4. RAG not working
- If using Ragie: check that RAGIE_API_KEY is valid
- If using Supabase: ensure the pgvector extension is enabled
- Check that documents are actually uploaded to the system
5. Tool execution failures
- Check that the tool is registered in ToolRegistryService
- Verify the tool name matches the request exactly
- Check the tool's execute() method for errors
6. Web UI not connecting to backend
- Ensure backend is running on port 3000
- Check the CORS configuration in main.ts
- Verify the API base URL in the web UI configuration
Enable detailed logging:
# Backend
pnpm run start:debug
# Set log level
export LOG_LEVEL=debug
pnpm run start:dev

Check logs in:
- Console output
- Supabase logs (for database queries)
- agent_steps table (for agent execution)
- tool_calls table (for tool execution)
conversations
- id: UUID (primary key)
- user_id: UUID (references auth.users)
- title: TEXT
- metadata: JSONB
- created_at, updated_at: TIMESTAMPTZ
messages
- id: UUID (primary key)
- conversation_id: UUID (references conversations)
- role: TEXT ('user', 'assistant', 'system')
- content: TEXT
- content_type: TEXT
- parent_message_id: UUID (nullable)
- metadata: JSONB
- created_at: TIMESTAMPTZ
agent_steps
- id: UUID (primary key)
- conversation_id: UUID
- agent_name: TEXT
- input: JSONB
- output: JSONB
- status: TEXT
- started_at, completed_at: TIMESTAMPTZ
- metadata: JSONB
tool_calls
- id: UUID (primary key)
- agent_step_id: UUID (references agent_steps)
- tool_name: TEXT
- input: JSONB
- output: JSONB
- status: TEXT
- executed_at: TIMESTAMPTZ
- metadata: JSONB
conversation_memory
- id: UUID (primary key)
- conversation_id: UUID
- content: TEXT
- embedding: VECTOR(1536)
- metadata: JSONB
- created_at: TIMESTAMPTZ
rag_documents
- id: UUID (primary key)
- user_id: UUID
- content: TEXT
- embedding: VECTOR(1536)
- metadata: JSONB
- created_at: TIMESTAMPTZ
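The VECTOR(1536) columns in conversation_memory and rag_documents are searched by similarity. For reference, pgvector's cosine-distance operator (<=>) is built on the same quantity as this sketch (distance = 1 - similarity):

```typescript
// Cosine similarity between two embedding vectors, the metric behind
// pgvector's cosine-distance operator (distance = 1 - similarity).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Vectors pointing the same way score 1; orthogonal vectors score 0:
cosineSimilarity([1, 0], [2, 0]); // → 1
cosineSimilarity([1, 0], [0, 3]); // → 0
```

In production the database does this work; the sketch is only to make the ranking semantics of the two embedding tables concrete.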
MIT
- NestJS - Progressive Node.js framework
- Supabase - PostgreSQL database + authentication
- OpenAI - GPT models and embeddings
- Anthropic - Claude models
- Ragie AI - Production RAG infrastructure
- Redis - Caching and rate limiting
- pgvector - Vector similarity search
- Next.js - React framework for web UI
- TailwindCSS - Utility-first CSS
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
For issues, questions, or feature requests, please open an issue on GitHub.
Built with ❤️ and Claude code.