Claudette v1.0.5 - Maximize Your AI Investment 🧠

🚀 Smart AI Middleware That Saves Money While Preserving Quality

v1.0.5: Get more from your AI budget by intelligently routing requests across multiple providers. Reduce costs while maintaining the quality your users expect.

🎯 What is Claudette?

Claudette is an AI middleware platform that helps you maximize your AI investment while maintaining quality. Instead of being locked into expensive single-provider solutions, Claudette intelligently routes your requests across multiple AI backends to deliver the best value.

💼 What Claudette Helps You With

🏢 For Businesses

Reduce AI costs by automatically choosing cost-effective backends for routine tasks
Extend subscription value - get significantly more AI interactions for the same budget
Avoid vendor lock-in with support for multiple AI providers
Scale confidently with built-in failover and health monitoring
Track spending with real-time cost monitoring and budget controls

👨‍💻 For Developers

Build AI features faster with a unified API across multiple providers
Prevent outages with automatic failover between AI services
Optimize performance with intelligent caching and routing
Debug easily with comprehensive logging and monitoring
Deploy reliably with production-tested infrastructure

🎓 For Teams & Projects

Make AI budgets last longer by optimizing every request
Ensure consistent quality while reducing costs
Simplify AI integration with one interface for multiple providers
Stay operational even when one AI service has issues
Scale usage without proportional cost increases

🌟 Real-World Use Cases

Content teams: Draft with cost-effective models, polish with premium ones - save budget for creative review
Development teams: Route code questions intelligently - simple syntax to fast models, architecture to specialized ones
Customer support: Handle routine inquiries efficiently while ensuring complex issues get premium treatment
Research projects: Optimize between speed and quality based on whether it's exploration or final analysis
Startups: Access multiple AI capabilities without multiple expensive subscriptions

🏆 Key Features

🔄 Smart Routing - Automatic selection between OpenAI, Claude, Qwen, and Ollama based on your needs
💰 Cost Intelligence - Real-time optimization to maximize your AI budget
💸 Low-Cost Providers - Access to 80-95% cheaper alternatives like Alibaba Cloud, DeepSeek, and free local models
📊 Transparency - Track performance, costs, and quality across all providers
🏗️ Developer Ready - Full TypeScript support with modern tooling
⚡ Performance - Intelligent caching and optimized request handling
🛡️ Reliability - Circuit breakers and graceful failure recovery

🚀 What's New in v1.0.5 - Advanced Memory Management & Ultra-Fast MCP

🧠 Advanced Memory Management System

Reduce memory pressure from 95% to 75-85% - Handle complex tasks without crashes

// Automatic memory optimization for complex tasks
const claudette = new Claudette({
  memory: {
    advancedManagement: true,    // NEW: Advanced memory pool management
    pressureOptimization: true,  // NEW: Pressure-based scaling
    emergencyCleanup: true,      // NEW: 75% cache reduction when needed
    complexTaskPrep: true       // NEW: Automatic memory prep for complex tasks
  }
});

// System automatically optimizes memory before complex operations
const response = await claudette.optimize({
  prompt: "Analyze this 50-page document and create detailed recommendations...",
  // Memory system automatically prepares resources
  // Reduces memory pressure from 94% → 75% before execution
});
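
The `emergencyCleanup` option above is described only as a "75% cache reduction when needed". As a rough illustration of what such a policy could look like, here is a minimal sketch. It is not Claudette's actual implementation: the function names, the `Map`-based cache, and the 0.85 pressure threshold are all assumptions made for this example.

```typescript
// Hypothetical sketch: shrink a cache by a fixed fraction when memory pressure is high.
// A Map iterates in insertion order, so the first keys are the oldest entries.
function evictOldest<K, V>(cache: Map<K, V>, fraction: number): number {
  const toRemove = Math.floor(cache.size * fraction);
  const oldestKeys = Array.from(cache.keys()).slice(0, toRemove);
  oldestKeys.forEach((key) => cache.delete(key));
  return oldestKeys.length;
}

// Pressure is used / total memory, e.g. 0.94 for 94%.
// The 0.85 default threshold is an assumed value, not a documented one.
function relieveMemoryPressure<K, V>(
  cache: Map<K, V>,
  pressure: number,
  threshold: number = 0.85
): number {
  return pressure > threshold ? evictOldest(cache, 0.75) : 0;
}

const demoCache = new Map<string, string>();
for (let i = 0; i < 8; i++) demoCache.set(`key-${i}`, `value-${i}`);

const evicted = relieveMemoryPressure(demoCache, 0.94);
console.log(evicted, demoCache.size); // 6 2
```

Evicting the oldest entries first is one simple choice; a real system might prefer least-recently-used or size-weighted eviction.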

⚡ Ultra-Fast MCP Server - 99.1% Startup Improvement

From 30 seconds to 264ms startup - Perfect for Claude Code integration

# Previous MCP startup: 30,000ms (30-second timeout)
# NEW Fast MCP startup: 264ms (sub-second!)

# Start the ultra-fast MCP server
node claudette-mcp-server-fast.js

# Benchmark all interfaces
node benchmark-all.js

# Performance Results:
# 🏆 MCP Server: 264ms startup (FASTEST)
# 🏆 MCP Requests: 896ms average (FASTEST) 
# 🏆 MCP Memory: 0.39MB growth (MOST EFFICIENT)

📊 Comprehensive Benchmarking Suite

Performance validation across all interfaces - Native, HTTP API, and MCP

# Test individual interfaces
node benchmark-native.js   # Test direct library usage
node benchmark-api.js      # Test HTTP REST API
node benchmark-mcp.js      # Test MCP server performance

# Compare all interfaces
node benchmark-all.js      # Comprehensive comparison

# Results Summary:
# - Native: Best for single-process applications
# - HTTP API: Best for web services and REST integration
# - MCP: Best for Claude Code integration (now fastest!)

🔄 Harmonized Timeout System

Eliminate timeout conflicts - Intelligent retry with circuit breakers

// NEW: Unified timeout configuration
const config = {
  timeouts: {
    startup: 5000,        // 5s optimized startup
    query: 60000,         // 60s Claude Code compatible
    health: 3000,         // 3s health checks
    emergency: 90000      // 90s for complex tasks
  },
  retry: {
    maxAttempts: 3,       // Intelligent retry logic
    backoffMultiplier: 1.5, // Adaptive backoff
    circuitBreaker: true   // Prevent cascade failures
  }
};
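
To make the retry settings concrete, the sketch below computes how long a client would wait before each retry with `maxAttempts: 3` and `backoffMultiplier: 1.5`. The config above does not state a starting delay, so the 1000 ms base used here is an assumption, and `retryDelays` is a hypothetical helper rather than part of Claudette's API.

```typescript
// Delay (ms) to wait before each retry. With maxAttempts = 3 there are two retries.
// baseDelayMs is an assumed starting delay; the config above does not specify one.
function retryDelays(maxAttempts: number, baseDelayMs: number, multiplier: number): number[] {
  const delays: number[] = [];
  for (let retry = 0; retry < maxAttempts - 1; retry++) {
    delays.push(Math.round(baseDelayMs * Math.pow(multiplier, retry)));
  }
  return delays;
}

console.log(retryDelays(3, 1000, 1.5)); // [ 1000, 1500 ]
```

A multiplier of 1.5 grows more gently than the common doubling scheme, which keeps the total wait short when only a few attempts are allowed.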

🎯 Performance Improvements Summary

| Component | v1.0.4 | v1.0.5 | Improvement |
| --- | --- | --- | --- |
| MCP Startup | 30,000ms | 264ms | 113.6x faster |
| Memory Pressure | 95% critical | 75-85% managed | Memory crashes eliminated |
| Environment Loading | 3,888ms | <100ms | 38x faster |
| Complex Task Handling | Manual management | Automatic optimization | Zero-config scaling |
| Timeout Reliability | 62.5% success | 95%+ success | 52% reliability improvement |
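
For readers who want to check the improvement figures, the arithmetic behind the MCP startup and timeout rows can be reproduced in a few lines. The inputs are the numbers quoted in this README; `speedup` is just an illustrative helper.

```typescript
// Speedup factor and percentage reduction between a "before" and "after" duration.
function speedup(beforeMs: number, afterMs: number): { factor: number; reductionPct: number } {
  return {
    factor: beforeMs / afterMs,
    reductionPct: (1 - afterMs / beforeMs) * 100,
  };
}

const mcp = speedup(30000, 264);
console.log(mcp.factor.toFixed(1));       // "113.6"
console.log(mcp.reductionPct.toFixed(1)); // "99.1"

// Timeout reliability: relative gain going from 62.5% to 95% success.
const relativeGainPct = ((95 - 62.5) / 62.5) * 100;
console.log(relativeGainPct.toFixed(0));  // "52"
```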

📚 Table of Contents

🚀 Quick Start
💰 Claude Subscription Optimization Guide
💸 Low-Cost Token Providers & Inference Services
🔧 API Usage
📖 Documentation
🤝 Contributing
🐛 Support & Issues

🚀 Quick Start

⚡ See the Value in 2 Minutes

# Install Claudette
npm install -g claudette
claudette init

# Make your first optimized request
claudette "Explain machine learning" --verbose

# See the cost savings and backend selection in action
# Claudette automatically chose the most cost-effective backend
# while maintaining quality standards

💡 What Just Happened?

Instead of paying premium rates for a simple explanation, Claudette:

Analyzed your request - determined it was educational content
Selected the optimal backend - chose a cost-effective model that excels at explanations
Delivered quality results - maintained high response quality while reducing costs
Showed transparency - displayed which backend was used and the actual cost
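
The selection step described above can be pictured as "pick the cheapest healthy backend that is good enough for this request". The sketch below shows that idea under stated assumptions: the `BackendOption` shape, the quality scores, and `selectBackend` are hypothetical and are not taken from Claudette's source; the per-request costs reuse figures quoted elsewhere in this README.

```typescript
interface BackendOption {
  name: string;
  costPerRequestUsd: number; // illustrative figures from this README, not measured prices
  quality: number;           // 0..1, an assumed quality score
  healthy: boolean;
}

// Pick the cheapest healthy backend whose quality meets the request's minimum.
function selectBackend(options: BackendOption[], minQuality: number): BackendOption | undefined {
  return options
    .filter((o) => o.healthy && o.quality >= minQuality)
    .sort((a, b) => a.costPerRequestUsd - b.costPerRequestUsd)[0];
}

const candidates: BackendOption[] = [
  { name: "claude", costPerRequestUsd: 0.015, quality: 0.95, healthy: true },
  { name: "qwen", costPerRequestUsd: 0.002, quality: 0.8, healthy: true },
  { name: "ollama", costPerRequestUsd: 0, quality: 0.6, healthy: true },
];

console.log(selectBackend(candidates, 0.7)?.name); // qwen
console.log(selectBackend(candidates, 0.9)?.name); // claude
```

Raising the minimum quality is what pushes a request toward a premium backend; everything else falls through to the cheapest option that qualifies.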

📦 Installation Options

# Option 1: NPM Installation (Recommended)
npm install -g claudette
claudette init

# Option 2: Source Installation
git clone https://github.com/RobLe3/claudette.git
cd claudette
npm install && npm run build

🔧 Configuration

Copy environment template:
```
cp .env.example .env
```

Configure your credentials:

# Required: OpenAI API Key
OPENAI_API_KEY=sk-your-openai-api-key-here

# Optional: Alternative Backend
ALTERNATIVE_API_URL=https://your-custom-backend.com
ALTERNATIVE_API_KEY=your_api_key_here

Verify installation:

claudette --version    # Should output: 1.0.5
claudette status       # Check system status

📋 Requirements

Node.js: v18.0.0 or higher
npm: Latest version recommended
API Keys: At least one AI provider API key (OpenAI, Anthropic, etc.)
Operating System: Linux, macOS, Windows

💡 Why Use Claudette?

Without Claudette

// Locked into one expensive provider
const response = await openai.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: "Simple question" }]
});
// Cost: $0.03 per request, no failover, single provider dependency

With Claudette

// Intelligent routing across multiple providers
const response = await claudette.optimize("Simple question");
// Cost: $0.002 per request, automatic failover, best provider for each task

Result: Up to 95% cost reduction while maintaining quality and reliability.

💰 Claude Subscription Optimization Guide

Maximize your Claude Pro investment - From $20/month to enterprise-scale efficiency

🎯 Claude Pro ($20/month) - Up to 5x More Value

With just a Claude Pro subscription, Claudette transforms your $20/month into powerful AI capabilities:

Without Claudette:

~500-1000 Claude Sonnet interactions/month
Single provider dependency
No cost optimization
Manual quality vs. cost decisions

With Claudette + Claude Pro:

// Smart routing maximizes your Claude Pro usage
const config = {
  claude: { 
    enabled: true, 
    priority: 1,        // Premium quality for important tasks
    model: "claude-3-sonnet-20240229" 
  },
  qwen: { 
    enabled: true, 
    priority: 2,        // Cost-effective for routine tasks
    cost_per_token: 0.0001 // 3x cheaper than Claude
  }
};

// Claudette automatically optimizes:
// - Complex analysis → Claude (premium quality)
// - Simple questions → Qwen (cost-effective)
// - Code explanations → Mixed routing based on complexity

Result: 2,500+ effective interactions/month for the same $20 budget!

🚀 Scaling with Additional APIs

Max100 Tier ($100/month equivalent)

Add OpenAI (GPT-4o family) and local Ollama for comprehensive coverage:

const enterpriseConfig = {
  claude: { 
    enabled: true, 
    priority: 1,        // Creative writing, complex analysis
    cost_per_token: 0.0003 
  },
  openai: { 
    enabled: true, 
    priority: 2,        // Code generation, technical docs
    model: "gpt-4o-mini",
    cost_per_token: 0.0001 
  },
  qwen: { 
    enabled: true, 
    priority: 3,        // Research, summarization
    cost_per_token: 0.0001 
  },
  ollama: { 
    enabled: true, 
    priority: 4,        // Development, testing (FREE!)
    cost_per_token: 0,
    base_url: "http://localhost:11434"
  }
};
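
One way to read the `priority` fields is as a failover order: try the lowest number first and fall through to the next backend when a call fails. The sketch below illustrates that reading. It is an assumption about the semantics, not Claudette's documented behaviour, and `callWithFailover` and `fakeCall` are hypothetical names.

```typescript
interface PrioritizedBackend {
  name: string;
  priority: number; // lower number = tried first (assumed semantics)
  enabled: boolean;
}

// Try enabled backends in priority order; move on to the next one when a call throws.
async function callWithFailover<T>(
  backends: PrioritizedBackend[],
  call: (backend: PrioritizedBackend) => Promise<T>
): Promise<{ backend: string; result: T }> {
  const ordered = backends.filter((b) => b.enabled).sort((a, b) => a.priority - b.priority);
  const errors: string[] = [];
  for (const backend of ordered) {
    try {
      return { backend: backend.name, result: await call(backend) };
    } catch (err) {
      errors.push(`${backend.name}: ${String(err)}`);
    }
  }
  throw new Error(`All backends failed: ${errors.join("; ")}`);
}

const tiers: PrioritizedBackend[] = [
  { name: "ollama", priority: 4, enabled: true },
  { name: "claude", priority: 1, enabled: true },
  { name: "openai", priority: 2, enabled: true },
];

// Simulated call: pretend the "claude" backend is temporarily down.
const fakeCall = async (b: PrioritizedBackend): Promise<string> => {
  if (b.name === "claude") throw new Error("service unavailable");
  return `answer from ${b.name}`;
};

callWithFailover(tiers, fakeCall).then((r) => console.log(r.backend)); // openai
```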

Smart Routing Strategy:

Creative Content → Claude Sonnet (premium quality)
Code Generation → GPT-4o (excellent for programming)
Research & Analysis → Qwen Plus (cost-effective, high quality)
Development & Testing → Ollama (free, local, private)

Economics:

$20 Claude Pro + $20 OpenAI + $0 Ollama = $40/month
10,000+ interactions/month with intelligent quality optimization
80% lower cost per interaction than using Claude Pro exclusively ($0.004 vs. $0.020)

Max200 Enterprise Tier ($200/month equivalent)

Add premium models and specialized backends:

const maxConfig = {
  claude: { 
    enabled: true,
    model: "claude-3-opus-20240229",  // Premium model for critical tasks
    priority: 1 
  },
  openai: { 
    enabled: true,
    model: "gpt-4",                   // Full GPT-4 for complex reasoning
    priority: 2 
  },
  "claude-sonnet": {
    enabled: true,
    model: "claude-3-sonnet-20240229", // Mid-tier for balanced tasks
    priority: 3
  },
  qwen: { 
    enabled: true,
    model: "qwen-max",                // Premium Qwen for specialized tasks
    priority: 4 
  },
  mistral: {
    enabled: true,
    model: "mistral-large",           // European data compliance
    priority: 5
  },
  ollama: { 
    enabled: true,
    model: "codellama:34b",           // High-capacity local model
    priority: 6,
    cost_per_token: 0
  }
};

Use Case Optimization:

// Automatic task classification and routing
const examples = [
  {
    task: "Write marketing copy for product launch",
    routed_to: "claude-opus",     // Premium creativity
    cost: "$0.015 per request"
  },
  {
    task: "Generate unit tests for React component", 
    routed_to: "gpt-4",          // Excellent code understanding
    cost: "$0.006 per request"
  },
  {
    task: "Summarize research papers",
    routed_to: "qwen-max",       // Cost-effective, high accuracy
    cost: "$0.002 per request"
  },
  {
    task: "Code refactoring during development",
    routed_to: "ollama",         // Free, private, fast iteration
    cost: "$0.000 per request"
  }
];

Enterprise Benefits:

25,000+ interactions/month across all quality tiers
Specialized routing for different content types
Geographic compliance (EU data with Mistral)
Private development (local Ollama)
Cost transparency and budget controls

📊 ROI Comparison Table

| Setup | Monthly Cost | Interactions | Cost/Interaction | Quality Mix |
| --- | --- | --- | --- | --- |
| Claude Pro Only | $20 | 1,000 | $0.020 | High (Claude only) |
| Claudette + Claude Pro | $20 | 2,500 | $0.008 | High/Med (Smart routing) |
| Max100 (Multi-API) | $40 | 10,000 | $0.004 | Premium/High/Med |
| Max200 (Enterprise) | $200 | 25,000 | $0.008 | All tiers optimized |

🎯 Smart Routing Examples

Content Creation Workflow

// Blog post creation - optimized routing
const workflow = [
  {
    step: "Research and outline",
    prompt: "Research trends in AI development",
    routed_to: "qwen",           // Cost-effective research
    cost: "$0.002"
  },
  {
    step: "Draft creation", 
    prompt: "Write engaging blog post from outline",
    routed_to: "claude-sonnet",  // Balanced quality/cost
    cost: "$0.008"
  },
  {
    step: "Final polish",
    prompt: "Enhance tone and add compelling examples",
    routed_to: "claude-opus",    // Premium quality finish
    cost: "$0.015"
  }
];

// Total cost: $0.025 vs $0.045 using Claude exclusively
// Savings: 44% while maintaining premium final quality
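
The totals quoted for this workflow can be verified directly from the per-step costs. The snippet below is plain arithmetic on the figures above; the "Claude exclusively" baseline is taken to mean every step at the $0.015 premium rate, which is what the $0.045 figure implies.

```typescript
const steps = [
  { step: "Research and outline", costUsd: 0.002 },
  { step: "Draft creation", costUsd: 0.008 },
  { step: "Final polish", costUsd: 0.015 },
];

const routedTotal = steps.reduce((sum, s) => sum + s.costUsd, 0);
// Baseline: every step on the premium model at $0.015 per request.
const premiumOnlyTotal = steps.length * 0.015;
const savingsPct = (1 - routedTotal / premiumOnlyTotal) * 100;

console.log(routedTotal.toFixed(3));      // "0.025"
console.log(premiumOnlyTotal.toFixed(3)); // "0.045"
console.log(Math.round(savingsPct));      // 44
```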

Development Workflow

// Software development - mixed routing
const devWorkflow = [
  {
    step: "Code iteration",
    routed_to: "ollama",         // Free local development
    cost: "$0.000",
    use: "Rapid prototyping, testing ideas"
  },
  {
    step: "Code review",
    routed_to: "gpt-4",          // Excellent code analysis
    cost: "$0.006",
    use: "Security review, best practices"
  },
  {
    step: "Documentation",
    routed_to: "claude-sonnet",  // Clear technical writing
    cost: "$0.008",
    use: "API docs, user guides"
  }
];

🔄 Migration Strategy

Phase 1: Start with Claude Pro

# Week 1-2: Basic optimization
claudette init --quick
# Configure Claude Pro + free Qwen API
# Immediate 2-3x interaction increase

Phase 2: Add Strategic APIs

# Week 3-4: Add OpenAI for code tasks
# Add local Ollama for development
# 5-8x effective capacity

Phase 3: Enterprise Optimization

# Month 2+: Full backend suite
# Specialized routing rules
# 10-20x capacity with quality optimization

💡 Pro Tips for Maximum Efficiency

Cache Strategy:

// 40% of requests hit cache = 40% cost reduction
const config = { 
  caching: true, 
  cache_ttl: 3600  // 1 hour cache
};
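
For readers unfamiliar with TTL caching, the idea behind `cache_ttl: 3600` is that a stored answer is reused for up to an hour and then discarded. The class below is a minimal sketch of that behaviour, not Claudette's cache: `TtlCache` is a hypothetical name, and the injectable clock exists only so that expiry can be shown without waiting.

```typescript
// Minimal TTL cache sketch with an injectable clock.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlSeconds: number;
  private now: () => number;

  constructor(ttlSeconds: number, now: () => number = () => Date.now()) {
    this.ttlSeconds = ttlSeconds;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlSeconds * 1000 });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it so the caller re-queries the backend
      return undefined;
    }
    return entry.value;
  }
}

let fakeTime = 0;
const answerCache = new TtlCache<string>(3600, () => fakeTime);
answerCache.set("Explain machine learning", "cached answer");

console.log(answerCache.get("Explain machine learning")); // cached answer
fakeTime = 3600 * 1000; // one hour later
console.log(answerCache.get("Explain machine learning")); // undefined
```

Every cache hit is a request that never reaches a paid backend, which is where the quoted cost reduction comes from.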

Quality Tiering:

// Route by complexity automatically
const rules = {
  simple_questions: "qwen",      // 70% of requests
  complex_analysis: "claude",    // 20% of requests  
  creative_content: "claude-opus" // 10% of requests
};
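
To see what a 70/20/10 split means for the average cost of a request, the shares can be combined with the per-request costs quoted in the workflow examples ($0.002 for Qwen, $0.008 for Claude Sonnet, $0.015 for Claude Opus). Mapping the `claude` rule to the Sonnet price is an assumption made for this calculation.

```typescript
// Traffic share per tier (from the rules above) and per-request costs quoted in this README.
const mix = [
  { backend: "qwen", share: 0.7, costUsd: 0.002 },
  { backend: "claude", share: 0.2, costUsd: 0.008 },      // assumed to be the Sonnet rate
  { backend: "claude-opus", share: 0.1, costUsd: 0.015 },
];

const blendedCost = mix.reduce((sum, m) => sum + m.share * m.costUsd, 0);
const reductionVsOpusPct = (1 - blendedCost / 0.015) * 100;

console.log(blendedCost.toFixed(4));         // "0.0045"
console.log(Math.round(reductionVsOpusPct)); // 70
```

Under these assumptions the blended cost is about $0.0045 per request, roughly 70% below sending everything to the premium tier.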

Development vs Production:

// Free development, optimized production
const environment = process.env.NODE_ENV;
const backend = environment === 'development' ? 'ollama' : 'claude';

🎯 Bottom Line Value Proposition

$20 Claude Pro → 2,500 interactions (vs 1,000 direct)
$40 Multi-API → 10,000 interactions with quality routing
$200 Enterprise → 25,000 interactions with premium options

Claudette pays for itself with the first month's optimization! 🚀

💸 Low-Cost Token Providers & Inference Services

Slash your AI costs by 80-95% - Access premium AI capabilities through budget-friendly providers

🏭 Enterprise-Grade Low-Cost Providers

Alibaba Cloud (Qwen) - 90% Cost Reduction

// Qwen through Alibaba Cloud DashScope
const config = {
  qwen: {
    enabled: true,
    base_url: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    api_key: process.env.QWEN_API_KEY,
    model: "qwen-plus",
    cost_per_token: 0.0001,  // ~90% cheaper than Claude
    priority: 2
  }
};

// Get API access:
// 1. Sign up at https://dashscope.console.aliyun.com/
// 2. Activate DashScope service
// 3. Get API key from console
// 4. $3 free credits + pay-per-use pricing

Qwen Pricing (Alibaba Cloud):

Qwen-Plus: ¥0.0008/1K tokens (~$0.0001) - 30x cheaper than Claude
Qwen-Max: ¥0.02/1K tokens (~$0.003) - 10x cheaper than GPT-4
Qwen-Turbo: ¥0.0003/1K tokens (~$0.00004) - 75x cheaper than Claude

DeepSeek - Extremely Low Cost

const config = {
  deepseek: {
    enabled: true,
    base_url: "https://api.deepseek.com/v1",
    api_key: process.env.DEEPSEEK_API_KEY,
    model: "deepseek-chat",
    cost_per_token: 0.00014,  // matches $0.14/1M input tokens; ~95% cheaper than premium models
    priority: 3
  }
};

// Get access at: https://platform.deepseek.com/
// $5 free credits, then $0.14/1M input tokens

Together AI - High Performance, Low Cost

const config = {
  together: {
    enabled: true,
    base_url: "https://api.together.xyz/v1",
    api_key: process.env.TOGETHER_API_KEY,
    model: "meta-llama/Llama-2-70b-chat-hf",
    cost_per_token: 0.0002,  // 85% cheaper than Claude
    priority: 4
  }
};

// Access: https://api.together.xyz/
// Multiple open-source models, competitive pricing

Groq - Ultra-Fast Inference

const config = {
  groq: {
    enabled: true,
    base_url: "https://api.groq.com/openai/v1",
    api_key: process.env.GROQ_API_KEY,
    model: "mixtral-8x7b-32768",
    cost_per_token: 0.00027,  // 80% cheaper + 10x faster
    priority: 5
  }
};

// Get free tier: https://console.groq.com/
// 100 requests/day free, then $0.27/1M tokens

🏠 Self-Hosted Solutions (FREE)

Ollama - Completely Free Local Inference

const config = {
  ollama: {
    enabled: true,
    base_url: "http://localhost:11434",
    model: "llama2:70b",
    cost_per_token: 0,  // 100% FREE!
    priority: 6
  }
};

// Set up Ollama locally:
// 1. Install: curl -fsSL https://ollama.ai/install.sh | sh
// 2. Run: ollama run llama2:70b
// 3. Free unlimited usage on your hardware

Recommended Ollama Models:

CodeLlama:34b - Excellent for code generation (FREE)
Mistral:7b - Fast general purpose (FREE)
Llama2:70b - High quality responses (FREE)
Neural-Chat:7b - Conversational AI (FREE)

LocalAI - Self-Hosted OpenAI Alternative

# Docker setup for free local inference
docker run -p 8080:8080 -v $PWD/models:/models -ti localai/localai:latest

Configure in Claudette:

const config = {
  localai: {
    enabled: true,
    base_url: "http://localhost:8080/v1",
    model: "gpt-3.5-turbo",  // LocalAI model name
    cost_per_token: 0,  // FREE!
    priority: 7
  }
};

📊 Cost Comparison Table

| Provider | Model | Cost/1M Tokens | vs Claude Pro | Quality | Speed |
| --- | --- | --- | --- | --- | --- |
| Claude Pro | claude-3-sonnet | $3.00 | Baseline | Excellent | Fast |
| Qwen Plus | qwen-plus | $0.10 | 30x cheaper | Excellent | Fast |
| DeepSeek | deepseek-chat | $0.14 | 21x cheaper | Very Good | Fast |
| Groq Mixtral | mixtral-8x7b | $0.27 | 11x cheaper | Very Good | Ultra Fast |
| Together AI | llama-2-70b | $0.20 | 15x cheaper | Very Good | Fast |
| Ollama | llama2:70b | $0.00 | Free (your hardware) | Good | Medium |
| LocalAI | Various | $0.00 | Free (your hardware) | Varies | Medium |
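
The "x cheaper" column is simply the baseline price divided by each provider's price, rounded to the nearest whole number. The snippet below reproduces it from the Cost/1M Tokens figures in the table.

```typescript
const baselinePerMillion = 3.0; // claude-3-sonnet row
const providers = [
  { name: "Qwen Plus", perMillion: 0.1 },
  { name: "DeepSeek", perMillion: 0.14 },
  { name: "Groq Mixtral", perMillion: 0.27 },
  { name: "Together AI", perMillion: 0.2 },
];

const ratios = providers.map((p) => ({
  name: p.name,
  timesCheaper: Math.round(baselinePerMillion / p.perMillion),
}));

console.log(ratios.map((r) => `${r.name}: ${r.timesCheaper}x`).join(", "));
// Qwen Plus: 30x, DeepSeek: 21x, Groq Mixtral: 11x, Together AI: 15x
```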

🎯 Smart Cost Optimization Strategy

Tier 1: Ultra-Budget Setup ($0-5/month)

const budgetConfig = {
  // Free tier: 80% of requests
  ollama: { 
    enabled: true, 
    priority: 1,
    cost_per_token: 0,
    use_cases: ["development", "testing", "simple_queries"]
  },
  
  // Low-cost tier: 15% of requests  
  qwen: { 
    enabled: true, 
    priority: 2,
    cost_per_token: 0.0001,
    use_cases: ["research", "analysis", "content_generation"]
  },
  
  // Premium tier: 5% of requests
  claude: { 
    enabled: true, 
    priority: 3,
    cost_per_token: 0.003,
    use_cases: ["critical_decisions", "final_review", "complex_reasoning"]
  }
};

// Result: 10,000+ interactions for $5/month
// vs 500 interactions with Claude Pro alone

Tier 2: Performance Setup ($10-20/month)

const performanceConfig = {
  // Speed layer: 40% of requests
  groq: { 
    enabled: true, 
    priority: 1,
    cost_per_token: 0.00027,
    use_cases: ["real_time_chat", "quick_responses"]
  },
  
  // Quality layer: 40% of requests
  qwen: { 
    enabled: true, 
    priority: 2,
    cost_per_token: 0.0001,
    use_cases: ["content_creation", "analysis"]
  },
  
  // Premium layer: 20% of requests
  claude: { 
    enabled: true, 
    priority: 3,
    cost_per_token: 0.003,
    use_cases: ["complex_tasks", "critical_content"]
  }
};

// Result: 25,000+ interactions for $20/month
// Premium quality with ultra-fast responses

🔧 Easy Setup Guide

1. Qwen (Alibaba Cloud) Setup

# Step 1: Get free Alibaba Cloud account
# Visit: https://www.alibabacloud.com/
# Sign up with email (no credit card required for trial)

# Step 2: Activate DashScope
# Go to: https://dashscope.console.aliyun.com/
# Click "Activate Service" (free tier included)

# Step 3: Get API Key
# Dashboard → API Keys → Create New Key
# Copy the API key

# Step 4: Configure Claudette
export QWEN_API_KEY="sk-your-qwen-key-here"
claudette setup-credentials

2. DeepSeek Setup

# Step 1: Register at https://platform.deepseek.com/
# $5 free credits, no credit card required

# Step 2: Generate API key
# API Keys → Create New Key

# Step 3: Add to Claudette
export DEEPSEEK_API_KEY="sk-your-deepseek-key"

3. Ollama Local Setup

# Install Ollama (one-time setup)
curl -fsSL https://ollama.ai/install.sh | sh

# Download a model (8GB+ RAM recommended for 7B; the 70B model needs substantially more)
ollama pull llama2:7b    # Smaller model for testing
ollama pull llama2:70b   # Larger model for production

# Verify it works (the model name must match a tag you pulled)
curl http://localhost:11434/api/generate -d '{
  "model": "llama2:7b",
  "prompt": "Hello world"
}'

💡 Advanced Cost Optimization Tips

Geographic Arbitrage

// Use region-specific pricing
const asiaConfig = {
  qwen: {
    base_url: "https://dashscope-ap-southeast-1.aliyuncs.com/compatible-mode/v1",
    // Often 20-30% cheaper in Asia Pacific regions
  }
};

Batch Processing for Volume Discounts

// Process multiple requests together
const batchResults = await claudette.optimizeBatch([
  { prompt: "Question 1" },
  { prompt: "Question 2" },
  { prompt: "Question 3" }
], {
  backend: "qwen",  // Use lowest cost backend for batches
  batch_size: 10    // Optimize for volume pricing
});
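
How `optimizeBatch` groups requests internally is not documented here. If you need to split a long list of prompts into groups of `batch_size` yourself before submitting them, a small helper like the following would do it; `chunk` is a hypothetical utility, not part of Claudette.

```typescript
// Split a list of items into groups of at most batchSize.
function chunk<T>(items: T[], batchSize: number): T[][] {
  if (batchSize < 1) throw new Error("batchSize must be at least 1");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

const prompts = Array.from({ length: 25 }, (_, i) => `Question ${i + 1}`);
const batches = chunk(prompts, 10);
console.log(batches.map((b) => b.length)); // [ 10, 10, 5 ]
```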

Smart Caching for 50% Cost Reduction

const cacheConfig = {
  features: {
    caching: true,
    cache_ttl: 86400,  // 24 hour cache
    intelligent_cache: true  // Cache similar queries
  }
};

// Typical 40-60% cache hit rate = 40-60% cost savings
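
What counts as a "similar" query for `intelligent_cache` is not specified in this README. One simple interpretation is to normalize the prompt before using it as a cache key, so that differences in case and spacing still produce a hit. The helper below is a sketch of that interpretation only; `cacheKey` is a hypothetical name, and a real implementation might use something richer such as embeddings.

```typescript
// Assumed notion of "similar": same text after trimming, lowercasing and collapsing whitespace.
function cacheKey(text: string): string {
  return text.trim().toLowerCase().replace(/\s+/g, " ");
}

const sameKey = cacheKey("  Explain   Machine Learning ") === cacheKey("explain machine learning");
console.log(sameKey); // true
```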

🎯 Real-World Savings Examples

Content Creator Workflow

// Before: Claude for every step
// Cost: $50/month for 1,000 articles
// After: Smart routing
const contentWorkflow = {
  research: "qwen",        // $2/month for research
  draft: "deepseek",       // $3/month for drafts  
  polish: "claude",        // $10/month for final polish
  // Total: $15/month for same 1,000 articles
  // Savings: 70% ($35/month)
};

Development Team

// Before: GPT-4 for everything
// Cost: $200/month for team
// After: Tiered approach
const devWorkflow = {
  code_review: "groq",       // Ultra-fast, $5/month
  documentation: "qwen",     // High quality, $8/month
  architecture: "claude",    // Complex reasoning, $15/month
  prototyping: "ollama",     // Free local development
  // Total: $28/month vs $200/month
  // Savings: 86% ($172/month)
};

🏆 Best Practices for Maximum Savings

Start Free: Begin with Ollama for development and testing
Graduate Smart: Move to Qwen for production workloads
Premium Sparingly: Use Claude/GPT-4 only for critical tasks
Cache Aggressively: Enable caching for 40-60% cost reduction
Monitor Usage: Track costs and optimize routing rules
Batch Processing: Group similar requests for volume discounts

🔧 API Usage

Basic Backend Routing

import { Claudette } from 'claudette';

const claudette = new Claudette({
  openai: { apiKey: process.env.OPENAI_API_KEY },
  claude: { apiKey: process.env.ANTHROPIC_API_KEY }
});

// Automatic backend selection
const response = await claudette.optimize({
  prompt: "Explain quantum computing",
  max_tokens: 500
});

console.log(response.content);
console.log(`Backend used: ${response.backend_used}`);
console.log(`Cost: €${response.cost_eur}`);

System Status

// Check system status
const status = await claudette.getStatus();
console.log(`System Health: ${status.healthy ? 'Healthy' : 'Unhealthy'}`);
console.log(`Version: ${status.version}`);
console.log(`Cache Hit Rate: ${status.cache.hit_rate}`);

📖 Documentation

Core Documentation

API Reference - Complete API documentation
Configuration Guide - Setup and configuration
Architecture Overview - System design and components

Cost Optimization Guides

Claude Subscription Optimization - Maximize Claude Pro value
Low-Cost Token Providers - 80-95% cost reduction strategies
Smart Routing Configuration - Tiered backend setup

Development Resources

Configuration Examples - Sample configurations
TypeScript Types - Type definitions
Testing - Test examples and utilities

🤝 Contributing

Fork the repository
Create a feature branch: git checkout -b feature/amazing-feature
Test your changes: npm test
Commit changes: git commit -m 'Add amazing feature'
Push to branch: git push origin feature/amazing-feature
Open a Pull Request

Development Setup

git clone https://github.com/RobLe3/claudette.git
cd claudette
npm install
npm run test:comprehensive  # Run full test suite

📊 Current Version

✅ v1.0.5 (Current)

Backend Support: OpenAI, Claude, Qwen, Ollama, and custom backends
Advanced Memory Management: Pressure-based scaling with emergency cleanup
Ultra-Fast MCP Server: Sub-second startup (264ms) for Claude Code integration
Comprehensive Benchmarking: Performance validation across all interfaces
Harmonized Timeouts: Intelligent retry logic with circuit breakers
Monitoring: Performance metrics and health monitoring
Cost Tracking: Real-time cost calculation and budget management
Caching: Intelligent response caching system
TypeScript: Full type safety and modern development experience
CLI Tools: Interactive setup and management commands

🐛 Support & Issues

Issues: GitHub Issues
Documentation: docs/
License: MIT License

Claudette v1.0.5 - Advanced AI Backend Router & Cost Optimizer with Ultra-Fast MCP Integration
