AI Voice Agent System

A production-ready AI voice agent system enabling intelligent phone conversations using advanced language models, speech processing, and integrated telephony services.

🚀 Quick Start

Prerequisites

Python 3.8+
PostgreSQL 17+
Redis 6+
Virtual environment

Installation

Clone and setup:

git clone <repository-url>
cd voice
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt

Configure environment:

cp .env.example .env
# Edit .env with your API keys and database credentials

Run the application:

python main.py

The application will start on http://0.0.0.0:8001 with all services initialized.

✨ Key Features

🗣️ Intelligent Conversations

Natural language understanding and generation
Context-aware responses with conversation memory
Multi-turn dialogue management
Personality customization per agent

☎️ Advanced Telephony

Multi-Provider Support: Ringover integration with extensible provider architecture
Inbound/Outbound Calls: Complete call lifecycle management
Real-time Audio Streaming: Low-latency WebSocket audio processing
Call State Management: Comprehensive call monitoring and control
Webhook Integration: Real-time event processing from telephony providers

🏗️ Integrated Architecture

✅ All-in-One FastAPI Application

Integrated Ringover Streamer: No external processes needed
Unified Service Management: All services managed by startup manager
Single Deployment: One application, all features included

🧠 Multi-Provider AI Support

LLM Providers: OpenAI, Anthropic, Google AI, Custom APIs
STT Providers: Whisper, Google Speech, Azure Speech
TTS Providers: ElevenLabs, OpenAI TTS, Google TTS, Azure Speech
Dynamic provider selection and failover

🔄 Real-time Processing

WebSocket-based audio streaming
Low-latency speech processing pipeline
Concurrent call handling (up to 100 simultaneous calls)
Async/await architecture for optimal performance

📊 Monitoring & Management

Real-time call status and metrics
Agent performance monitoring
Database-backed persistent storage
Comprehensive logging and error handling

🏗️ Current Architecture

System Status

✅ PRODUCTION READY - All services integrated and functional

Integrated Components

┌─────────────────────────────────────────────────────────────┐
│                    FastAPI Application                      │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐    │
│  │   Ringover  │ │  WebSocket  │ │    AI Services      │    │
│  │  Streaming  │ │ Orchestrator│ │  LLM/STT/TTS/Audio  │    │
│  │ (Integrated)│ │             │ │                     │    │
│  └─────────────┘ └─────────────┘ └─────────────────────┘    │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐    │
│  │ API Routes  │ │   Config    │ │    Data Layer       │    │
│  │   /api/v1   │ │  Registry   │ │ PostgreSQL + Redis  │    │
│  └─────────────┘ └─────────────┘ └─────────────────────┘    │
└─────────────────────────────────────────────────────────────┘

Service Initialization

All services are managed by the startup manager and initialize automatically:

Database: PostgreSQL connection and verification
Redis: Cache and session storage
Telephony: Ringover API integration with call management
LLM: Multi-provider language model orchestration
Audio: Speech processing pipeline
WebSocket: Real-time communication handlers
Monitoring: Application metrics and health checks
Ringover Streaming: Integrated audio streaming service

📁 Project Structure

Following strict file organization principles with maximum modularity:

voice/
├── api/v1/                      # API endpoints (modular)
│   ├── admin/                   # Administrative endpoints
│   ├── agents/                  # Agent management API
│   ├── calls/                   # Call management API
│   ├── streaming/ringover.py    # Integrated streaming endpoints
│   └── webhooks/ringover/       # Webhook event handlers
├── core/                        # Core system (deeply modular)
│   ├── config/
│   │   └── providers/ringover/  # Ringover config broken down by feature
│   │       ├── api.py           # API configuration
│   │       ├── webhook.py       # Webhook configuration  
│   │       ├── streaming.py     # Streaming configuration
│   │       └── config.py        # Combined configuration
│   ├── logging/
│   │   ├── config/              # Logging configuration factory
│   │   └── format/              # Color codes and formatters
│   └── startup/
│       ├── services/            # Individual service startups
│       │   └── telephony.py     # Telephony service initialization
│       ├── shutdown/            # Graceful shutdown handling
│       └── lifespan/            # FastAPI lifespan management
├── services/                    # Business logic (one feature per file)
│   ├── ringover/                # Telephony provider integration
│   │   ├── api.py               # Ringover API client
│   │   ├── client.py            # Core client implementation
│   │   ├── integration.py       # Integration orchestrator
│   │   └── streaming/           # Integrated streaming service
│   │       └── integrated.py    # Main streaming implementation
│   ├── call/                    # Call management services
│   │   ├── manager.py           # Call lifecycle management
│   │   ├── initiation/          # Call initiation logic
│   │   └── management/          # Call state management
│   ├── llm/providers/           # Individual LLM providers
│   ├── audio/streaming/         # Audio processing modules
│   └── stt/                     # Speech-to-text services
├── models/external/             # External API data models
│   ├── ringover/                # Ringover-specific models
│   └── llm/                     # LLM provider models
└── docs/                        # Comprehensive documentation
    ├── services/ringover/       # Service-specific docs
    ├── databases/               # Database documentation
    └── llm/                     # AI service documentation

🔧 Configuration

Environment Variables

The system uses a comprehensive .env configuration:

# Application
APP_ENV=development
API_PORT=8001

# Database
DATABASE_URL=postgresql+asyncpg://user:pass@localhost:5432/voice

# Redis
REDIS_URL=redis://localhost:6379

# Ringover Integration
RINGOVER_API_KEY=your_api_key
RINGOVER_API_BASE_URL=https://public-api.ringover.com/v2.1/
RINGOVER_WEBHOOK_SECRET=your_webhook_secret

# AI Services
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
ELEVENLABS_API_KEY=your_elevenlabs_key

# Logging
LOG_LEVEL=INFO
LOG_DIR=/tmp/voice_agent_logs

🚀 Getting Started

System Requirements

Python 3.8+
PostgreSQL 17+
Redis 6+
4GB RAM minimum
10GB disk space

Installation Steps

Environment Setup:

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Database Setup:

# Create PostgreSQL database
createdb voice

# Run migrations (if any)
# python migrate.py

Configuration:

cp .env.example .env
# Edit .env with your credentials

Start Application:

There are several ways to run the application:

Option 1: Using the run script (Easiest)

# Development mode (default)
python run.py

# Production mode with workers
python run.py --workers 4 --no-reload --log-level warning

# Custom host/port
python run.py --host 127.0.0.1 --port 8080

# See all options
python run.py --help

Option 2: Direct Python execution (Development)

python main.py

Option 3: Using uvicorn directly (Recommended)

uvicorn main:app --host 0.0.0.0 --port 8001 --reload

Option 4: Using uvicorn with custom configuration

# Development with auto-reload
uvicorn main:app --host 0.0.0.0 --port 8001 --reload --log-level info

# Production-like setup
uvicorn main:app --host 0.0.0.0 --port 8001 --workers 1 --log-level warning

Option 5: Using uvicorn with environment file

uvicorn main:app --host 0.0.0.0 --port 8001 --env-file .env --reload

Application runs on: http://0.0.0.0:8001

Running Options Explained

python run.py: Custom script with sensible defaults and easy configuration
python main.py: Uses the built-in uvicorn configuration with reload enabled
uvicorn main:app: Standard ASGI server approach, more control over configuration
--reload: Automatically restarts server when code changes (development only)
--workers N: Run multiple worker processes (production, don't use with --reload)
--env-file .env: Explicitly load environment variables from .env file

✅ Current Status

What's Working

✅ All Services Integrated: No external dependencies
✅ FastAPI Server: Running on port 8001
✅ Database: PostgreSQL connected and verified
✅ Redis: Cache layer operational
✅ Telephony: Ringover API integration with proper URL handling
✅ LLM: OpenAI integration (needs valid API key for full functionality)
✅ Streaming: Ringover streamer fully integrated into FastAPI
✅ WebSocket: Real-time communication ready
✅ Monitoring: Application health monitoring active
✅ Logging: Comprehensive logging to /tmp/voice_agent_logs
✅ Graceful Shutdown: Clean startup and shutdown lifecycle

Service Startup Time

Complete initialization: ~7-9 seconds

🚀 Starting application initialization...
⏳ Initializing database... ✅ (0.2s)
⏳ Initializing redis... ✅ (0.1s)  
⏳ Initializing telephony... ✅ (0.5s)
⏳ Initializing llm... ✅ (6.0s)
⏳ Initializing audio... ✅ (0.1s)
⏳ Initializing websocket... ✅ (0.1s)
⏳ Initializing monitoring... ✅ (0.1s)
⏳ Initializing ringover... ✅ (0.1s)
✅ Application startup completed!

📚 Documentation

Comprehensive documentation is organized by feature:

Service Documentation - Individual service guides
Ringover Integration - Streaming service details
Database Setup - Database configuration and schemas
LLM Configuration - AI service setup and usage
API Reference - Endpoint documentation
Testing Guide - Testing organization and runners

🧪 Testing

Run comprehensive tests:

# All tests
python tests.py

# Service-specific tests  
python services/ringover/tests/runner.py
python services/agent/tests/runner.py
python api/tests/runner.py

🔄 Development

File Organization Principles

This project follows strict modularity principles:

Maximum Folder Depth: Each concept gets its own subfolder
Lowercase Names: All files and folders use lowercase, no underscores
Single Responsibility: One feature per file, files stay short
Focused Modules: Each file handles one specific functionality

Adding New Features

When adding features, follow the existing structure:

# Good: Focused, modular structure
services/newfeature/component/logic.py
services/newfeature/component/config.py

# Avoid: Monolithic files
services/newfeature_service.py  # Wrong naming
services/bigfile.py             # Too broad

Production-Ready AI Voice Agent System - Fully integrated, modular, and scalable. │ ├── llm/ # Language model integration │ ├── notification/ # Alert and notification systems │ ├── ringover/ # Telephony integration │ ├── stt/ # Speech-to-text services

🎯 Usage Examples

Creating an AI Agent

import httpx

# Create agent configuration
agent_config = {
    "name": "Customer Support Agent",
    "description": "Handles customer inquiries and support",
    "llm_provider": "openai",
    "llm_model": "gpt-4",
    "tts_provider": "elevenlabs",
    "tts_voice_id": "21m00Tcm4TlvDq8ikWAM",
    "personality": {
        "tone": "friendly",
        "style": "professional",
        "expertise": "customer service"
    }
}

# Create the agent
async with httpx.AsyncClient() as client:
    response = await client.post(
        "http://localhost:8000/api/v1/agents",
        json=agent_config
    )
    agent = response.json()
    print(f"Created agent: {agent['agent_id']}")

Making an Outbound Call

# Initiate outbound call
call_config = {
    "phone_number": "+1234567890",
    "agent_id": "agent_123",
    "caller_id": "+0987654321",
    "context": {
        "customer_name": "John Doe",
        "account_id": "ACC123",
        "purpose": "follow_up"
    }
}

async with httpx.AsyncClient() as client:
    response = await client.post(
        "http://localhost:8000/api/v1/calls/outbound",
        json=call_config
    )
    call = response.json()
    print(f"Call initiated: {call['call_id']}")

🚀 Deployment

Docker Deployment

# Dockerfile included in project
docker build -t voice-agent-system .
docker run -p 8000:8000 -p 8080:8080 --env-file .env voice-agent-system

Production Considerations

Database: Use managed PostgreSQL (AWS RDS, Google Cloud SQL)
Redis: Use managed Redis (AWS ElastiCache, Redis Cloud)
Load Balancing: Use nginx or cloud load balancers
SSL/TLS: Enable HTTPS for all API endpoints
Monitoring: Set up Prometheus/Grafana for metrics
Logging: Configure centralized logging (ELK stack)

Scaling

Horizontal Scaling: Deploy multiple instances behind a load balancer
Database Scaling: Use read replicas for analytics workloads
Redis Scaling: Use Redis Cluster for high availability
AI Services: Consider local GPU instances for reduced latency

📊 Monitoring & Observability

Health Checks

# System health
curl http://localhost:8000/health

# Database health
curl http://localhost:8000/api/v1/admin/database/health

# Call system status
curl http://localhost:8000/api/v1/calls/system/info

Metrics & Logging

Application logs: Structured JSON logging with configurable levels
Call metrics: Duration, success rates, AI response times
System metrics: CPU, memory, database connections
Business metrics: Agent utilization, call volume, customer satisfaction

🔒 Security

Authentication & Authorization

JWT-based API authentication
Role-based access control (RBAC)
Webhook signature verification
API rate limiting and throttling

Data Protection

Encrypted database connections
Secure credential storage
Call recording encryption
GDPR compliance features

🤝 Contributing

We welcome contributions! Please follow these steps:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Follow file organization principles (lowercase, modular, single-responsibility)
Add comprehensive tests
Update documentation
Submit a pull request

Development Standards

Modular Architecture: Keep files short and focused on single features
Naming Conventions: Lowercase names, maximum folder depth, no underscores
Testing: Comprehensive test coverage for all services
Documentation: Update relevant documentation in docs/
Code Quality: Follow type hints and async patterns

🐛 Troubleshooting

Common Issues

Database Connection Failed

# Check PostgreSQL status
sudo systemctl status postgresql

API Authentication Errors

Verify API keys in .env file
Check API quota limits
Ensure proper environment variable loading

Service Startup Issues

Check logs in /tmp/voice_agent_logs/
Verify all required environment variables are set
Ensure PostgreSQL and Redis are running

📄 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

Ringover - Telephony infrastructure
OpenAI - GPT and Whisper APIs
FastAPI - Web framework
PostgreSQL - Database layer

Ready for Production 🚀

This AI Voice Agent System is production-ready with:

✅ Fully integrated services
✅ Comprehensive error handling
✅ Graceful shutdown procedures
✅ Modular, maintainable architecture
✅ Extensive documentation and testing

Name		Name	Last commit message	Last commit date
Latest commit History 39 Commits
.github/instructions		.github/instructions
api		api
core		core
data		data
docs		docs
models		models
rest		rest
services		services
tests		tests
tools		tools
.env.example		.env.example
.gitignore		.gitignore
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt
response.json		response.json
run.py		run.py
test_service_access.py		test_service_access.py

fescii/voice-agent

Folders and files

Latest commit

History

Repository files navigation