Production-ready AI agents for automated incident analysis, root cause detection, and intelligent remediation.
ADAPT-Agents is an open-source alternative to DataDog, Grafana, and PagerDuty's RCA capabilities, offering real-time streaming, RAG-powered learning, enterprise integrations, and interactive visualizations, all at zero cost.
Enterprise features that rival commercial observability platforms:
- Async/Await Architecture - 3-5x faster parallel execution
- RAG & Historical Learning - Learn from every incident with ChromaDB + sentence-transformers
- Real-Time Streaming - WebSocket-based live updates during analysis
- Enterprise Integrations - Native Slack, JIRA, PagerDuty connectors
- Interactive Visualizations - Root cause graphs, timelines, metrics dashboards
- Intelligent Agents - 6 specialized AI agents with LLM integration
- Webhook Management - Event-driven callbacks with delivery tracking
- Vector Database - Semantic search for similar historical incidents
- API Authentication - API key auth + rate limiting + request tracking
- Persistent Storage - SQLite backend for analyses, webhooks, integrations
- Multi-Format Export - Cytoscape.js, D3.js, Plotly, Chart.js, GraphML, DOT
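The feature list mentions API key authentication with rate limiting and request tracking. As a rough illustration of how per-key rate limiting can work, here is a minimal sliding-window sketch. The class name, its parameters, and the injectable clock are assumptions made for this example and are not taken from the ADAPT-Agents codebase.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most `limit` requests per `window_seconds` for each API key.

    Hypothetical helper for illustration; not the project's implementation.
    """

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # api_key -> timestamps of accepted requests

    def allow(self, api_key):
        """Record the request and return True, or return False if over the limit."""
        now = self._clock()
        hits = self._hits[api_key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True

    def request_count(self, api_key):
        """Accepted requests tracked for the key as of its last allow() call."""
        return len(self._hits[api_key])
```

A server would typically call `allow(api_key)` once per request and answer with HTTP 429 when it returns `False`.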
| Feature | DataDog | Grafana | PagerDuty | ADAPT-Agents |
|---|---|---|---|---|
| Real-Time Streaming | โ | โ | โ | โ |
| RAG/AI Learning | โ | โ | โ | โ |
| Slack/JIRA/PagerDuty | โ | โ | Native | โ |
| Interactive Viz | โ | โ | โ | โ |
| Open Source | โ | โ | โ | โ |
| Monthly Cost | $15-31/host | Free | $21-51/user | $0 |
Result: Enterprise-grade RCA platform at zero cost, with unique AI learning capabilities.
# Clone repository
git clone https://github.com/yourusername/ADAPT-Agents.git
cd ADAPT-Agents
# Start complete stack (API + Redis + Prometheus + Grafana)
docker-compose up -d
# Access:
# - API & Docs: http://localhost:8000/docs
# - Metrics: http://localhost:9090
# - Grafana: http://localhost:3000
# Install all features
pip install -r requirements.txt
# Or install selectively
pip install fastapi uvicorn httpx websockets # Core API
pip install chromadb sentence-transformers # RAG features
pip install networkx # Visualizations
pip install openai anthropic # LLM providers
# Create .env file
cp .env.example .env
# Configure LLM provider (required for AI features)
echo "ADAPT_LLM_PROVIDER=openai" >> .env
echo "ADAPT_LLM_API_KEY=sk-..." >> .env
# Analyze incident using CLI
python -m cli.main analyze examples/incident_data.json
# Output: Root cause + remediation plan in seconds
# Start FastAPI server
uvicorn api.server:app --reload
# Access interactive docs at http://localhost:8000/docs
# Create analysis (returns immediately with job ID)
curl -X POST http://localhost:8000/analyze \
-H "X-API-Key: demo-key-12345" \
-H "Content-Type: application/json" \
-d @examples/incident_data.json
# Get results
curl http://localhost:8000/analyze/{analysis_id} \
-H "X-API-Key: demo-key-12345"
// Connect to WebSocket for live updates
const ws = new WebSocket('ws://localhost:8000/ws/analysis/{analysis_id}');
ws.onmessage = (event) => {
  const update = JSON.parse(event.data);
  console.log(`[${update.agent_name}] ${update.status}: ${update.message}`);
};
# Configure Slack integration
curl -X POST http://localhost:8000/api/v1/integrations/slack \
  -H "X-API-Key: demo-key-12345" \
  -H "Content-Type: application/json" \
  -d '{"webhook_url": "https://hooks.slack.com/services/..."}'
# Send incident alert to Slack
curl -X POST http://localhost:8000/api/v1/integrations/notify/incident \
  -H "X-API-Key: demo-key-12345" \
  -H "Content-Type: application/json" \
  -d '{
    "incident_id": "inc-123",
    "incident_data": {...},
    "slack_channel": "#incidents",
    "create_jira": true,
    "trigger_pagerduty": true
  }'
# Generate root cause dependency graph
curl -X POST http://localhost:8000/api/v1/visualizations/root-cause-graph \
  -H "X-API-Key: demo-key-12345" \
  -H "Content-Type: application/json" \
  -d '{"rca_results": {...}, "format": "cytoscape"}'
# Generate complete dashboard (all visualizations)
curl -X POST http://localhost:8000/api/v1/visualizations/complete-dashboard \
  -H "X-API-Key: demo-key-12345" \
  -H "Content-Type: application/json" \
  -d '{"incident_data": {...}, "rca_results": {...}}'
# Store incident in knowledge base (automatic after each analysis)
# Search for similar past incidents
curl -X POST http://localhost:8000/api/v1/knowledge-base/search/similar-incidents \
  -H "X-API-Key: demo-key-12345" \
  -H "Content-Type: application/json" \
  -d '{
    "incident_data": {...},
    "n_results": 5,
    "similarity_threshold": 0.7
  }'
# Get insights about recurring patterns
curl http://localhost:8000/api/v1/knowledge-base/incidents/{id}/insights \
-H "X-API-Key: demo-key-12345"
ADAPT-Agents v3.5 Platform (layers, top to bottom):
- Entry points: WebSockets (Real-Time), Webhooks (Callbacks), REST API (FastAPI)
- Integration Layer: Slack, JIRA, PagerDuty
- Streaming Agent Orchestrator
  - Phase 1: Parallel Diagnostic Agents: [Log] [Metrics] [Changes] [Topology]
  - Phase 2: Hypothesis Generation (RAG): [Similar Incidents] → [LLM] → [Hypotheses]
  - Phase 3: Remediation Planning (RAG): [Past Solutions] → [LLM] → [Actions]
- Intelligence Layer: ChromaDB (Vectors), sentence-transformers (Embeddings)
- Visualization Layer: [Graphs] [Timelines] [Dashboards] [Metrics] → Cytoscape, D3.js, Plotly, Chart.js
- Persistence Layer: [SQLite] [Redis Cache] [Vector DB]
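The architecture describes three orchestrator phases: diagnostic agents run in parallel, then hypothesis generation, then remediation planning. The sketch below shows one way that shape can be expressed with `asyncio.gather`. The `run_agent` coroutine, the sleep-based delays, and the result dictionaries are placeholders invented for illustration; the real orchestrator in `chains/` may be structured differently.

```python
import asyncio


async def run_agent(name, delay):
    """Stand-in for a diagnostic agent; sleeps to simulate I/O-bound work."""
    await asyncio.sleep(delay)
    return {"agent": name, "findings": [f"{name}-finding"]}


async def analyze_incident():
    # Phase 1: run the four diagnostic agents concurrently.
    # gather() returns results in the order the coroutines were passed in.
    diagnostics = await asyncio.gather(
        run_agent("log", 0.05),
        run_agent("metrics", 0.05),
        run_agent("changes", 0.05),
        run_agent("topology", 0.05),
    )
    # Phase 2: hypothesis generation consumes all Phase 1 findings.
    evidence = [finding for d in diagnostics for finding in d["findings"]]
    hypothesis = {"root_cause": "placeholder", "evidence": evidence}
    # Phase 3: remediation planning depends on the hypothesis.
    plan = {"actions": [f"investigate {item}" for item in evidence]}
    return {"diagnostics": diagnostics, "hypothesis": hypothesis, "plan": plan}
```

Calling `asyncio.run(analyze_incident())` runs all three phases. Because the Phase 1 agents wait concurrently, that phase takes roughly as long as its slowest agent instead of the sum of all four.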
| Agent | Purpose | LLM-Powered | Capabilities |
|---|---|---|---|
| LogAnalyzerAgent | Error pattern detection | โ | Pattern matching, cascade detection, severity scoring |
| MetricsAnalyzerAgent | Anomaly detection | โ | Statistical analysis, threshold violations, correlations |
| ChangeCorrelatorAgent | Change-incident correlation | โ | Temporal correlation, risk scoring, deployment analysis |
| TopologyInferenceAgent | Service dependency mapping | โ | Dependency graphs, impact analysis, blast radius |
| HypothesisGeneratorAgent | Root cause synthesis | โ | Multi-agent fusion, RAG enhancement, confidence scoring |
| RemediationPlannerAgent | Action planning | โ | Prioritization, time estimation, validation tests |
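The table lists blast radius among the TopologyInferenceAgent capabilities. One common way to compute it, sketched here under the assumption that the topology is a mapping from each service to the services it depends on, is a breadth-first search over the reversed dependency edges. The function name and input format are illustrative and not taken from the project.

```python
from collections import deque
from typing import Dict, List, Set


def blast_radius(dependencies: Dict[str, List[str]], failed: str) -> Set[str]:
    """Return every service that directly or transitively depends on `failed`."""
    # Invert the edges: for each service, record which services depend on it.
    dependents: Dict[str, Set[str]] = {}
    for service, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    impacted: Set[str] = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for upstream in dependents.get(current, ()):
            if upstream not in impacted:
                impacted.add(upstream)
                queue.append(upstream)
    # With cyclic dependencies the search can reach the failed service again.
    impacted.discard(failed)
    return impacted
```

For a topology where `web` depends on `api` and `api` depends on `db`, a `db` failure would report both `api` and `web` as impacted.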
All agents support:
- Async/await execution
- Result caching (Redis/Memory)
- Structured output (Pydantic models)
- RAG enhancement with historical context
- PII filtering for sensitive data
- Prometheus metrics tracking
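RAG enhancement relies on retrieving similar historical incidents. The project uses ChromaDB and sentence-transformers for this. The sketch below is not that implementation: it only illustrates the underlying idea (cosine similarity, a threshold, and a top-n cut) in plain Python with toy vectors. The parameter names `n_results` and `similarity_threshold` mirror the API examples in this README; everything else is assumed.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def find_similar(
    query: Sequence[float],
    history: Dict[str, Sequence[float]],
    n_results: int = 5,
    similarity_threshold: float = 0.7,
) -> List[Tuple[str, float]]:
    """Return up to n_results (incident_id, score) pairs, best match first."""
    scored = [(iid, cosine_similarity(query, vec)) for iid, vec in history.items()]
    matches = [item for item in scored if item[1] >= similarity_threshold]
    matches.sort(key=lambda item: item[1], reverse=True)
    return matches[:n_results]
```

In practice the vectors would be embeddings of incident descriptions, and a vector database would replace the linear scan.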
- POST /analyze - Start RCA analysis (async)
- GET /analyze/{id} - Get analysis results
- GET /agents - List all available agents
- POST /agents/{name}/execute - Run specific agent

- ws://host/ws/analysis/{id} - Live updates for specific analysis
- ws://host/ws/broadcast - System-wide event stream
- ws://host/ws/agent/{name} - Agent-specific updates

- POST /api/v1/webhooks - Create webhook subscription
- GET /api/v1/webhooks - List webhooks
- DELETE /api/v1/webhooks/{id} - Delete webhook
- GET /api/v1/webhooks/{id}/deliveries - Delivery history

- POST /api/v1/knowledge-base/incidents - Store incident
- POST /api/v1/knowledge-base/search/similar-incidents - Find similar incidents
- GET /api/v1/knowledge-base/incidents/{id}/insights - Get insights
- GET /api/v1/knowledge-base/stats - Database statistics

- POST /api/v1/integrations/slack - Configure Slack
- POST /api/v1/integrations/jira - Configure JIRA
- POST /api/v1/integrations/pagerduty - Configure PagerDuty
- POST /api/v1/integrations/notify/incident - Send alerts
- POST /api/v1/integrations/notify/rca-complete - Send RCA summary

- POST /api/v1/visualizations/root-cause-graph - Generate dependency graph
- POST /api/v1/visualizations/timeline - Generate incident timeline
- POST /api/v1/visualizations/metrics-dashboard - Generate metrics dashboard
- POST /api/v1/visualizations/complete-dashboard - Generate all visualizations
Full API documentation: http://localhost:8000/docs
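Since POST /analyze returns a job ID immediately and GET /analyze/{id} returns the results, a client that does not use WebSockets has to poll. The helper below is a small sketch of such a loop. The `status` field and its `completed` / `failed` values are assumptions about the response shape, and `fetch_status` is a caller-supplied function (for example, a wrapper around an HTTP GET with the X-API-Key header), so the sketch itself makes no network calls.

```python
import time


def wait_for_analysis(
    fetch_status,
    analysis_id,
    interval_seconds=1.0,
    timeout_seconds=60.0,
    sleep=time.sleep,
    clock=time.monotonic,
):
    """Poll fetch_status(analysis_id) until the analysis finishes or times out.

    fetch_status must return a dict; a "status" key with the value
    "completed" or "failed" is assumed to mark a finished analysis.
    """
    deadline = clock() + timeout_seconds
    while True:
        payload = fetch_status(analysis_id)
        if payload.get("status") in ("completed", "failed"):
            return payload
        if clock() >= deadline:
            raise TimeoutError(f"analysis {analysis_id} did not finish in time")
        sleep(interval_seconds)
```

For live progress, the WebSocket endpoints listed above avoid polling altogether.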
# Server-side: Streaming orchestrator automatically sends updates
from chains.streaming_orchestrator import StreamingOrchestrator
orchestrator = StreamingOrchestrator(
websocket_manager=ws_manager,
analysis_id=analysis_id
)
# Automatically streams:
# - Agent start/complete events
# - Individual findings as discovered
# - Phase transitions
# - Final results
# Automatic: Every successful analysis is stored in ChromaDB
# Future analyses get historical context automatically
# Manual similarity search:
from rag import SimilaritySearchService
# Assumption: the service can be constructed with defaults; adjust to your setup.
similarity_search = SimilaritySearchService()
similar = similarity_search.find_similar_incidents(
    query_incident=current_incident,
    n_results=5,
    similarity_threshold=0.7
)
# Returns: up to 5 similar past incidents (similarity >= 0.7) with their RCA solutions
# Configure once, use everywhere
from integrations import IntegrationManager
manager = IntegrationManager()
# Slack
manager.register_slack(integration_id, api_key, webhook_url)
# JIRA
manager.register_jira(integration_id, api_key, jira_url, username, token, project_key)
# PagerDuty
manager.register_pagerduty(integration_id, api_key, pd_api_key, integration_key)
# Notify all configured integrations
await manager.notify_incident(incident_id, incident_data, api_key)
# Generate root cause dependency graph
from visualization import RootCauseGraphGenerator
graph_gen = RootCauseGraphGenerator()
graph_data = graph_gen.generate_from_rca(rca_results)
# Export in multiple formats:
cytoscape_format = graph_data["cytoscape"] # For web rendering
d3_format = graph_data["d3"] # For force-directed graph
graphml_format = graph_data["graphml"] # For analysis tools
dot_format = graph_data["dot"] # For Graphviz
- Agent Guide - Detailed guide for each agent
- Orchestration Patterns - Chain composition strategies
- LLM Integration - Using OpenAI, Anthropic, etc.
- Real-Time Streaming - WebSocket-based live updates
- Webhook Management - Event-driven callbacks
- RAG & Historical Learning - ChromaDB + semantic search
- Enterprise Integrations - Slack, JIRA, PagerDuty
- Interactive Visualizations - Graphs, timelines, dashboards
- REST API - Complete endpoint documentation
- WebSocket API - Real-time streaming protocols
- Pydantic Models - Data structures and validation
- Architecture - System design and extensibility
- Performance Tuning - Optimization strategies
- Security - Authentication, PII filtering, audit logs
- Deployment - Docker, Kubernetes, cloud platforms
ADAPT-Agents/
├── agents/ # 6 specialized diagnostic agents
├── chains/ # Orchestrators (sync, async, streaming)
├── schemas/ # Pydantic models and base classes
├── llm/ # LLM provider integrations
├── utils/ # Caching, logging, metrics, PII filtering
├── api/ # FastAPI server + routes
│   ├── server.py # Main FastAPI app (v3.5)
│   ├── websocket_routes.py # Real-time streaming endpoints
│   ├── webhook_routes.py # Webhook management
│   ├── knowledge_base_routes.py # RAG endpoints
│   ├── integrations_routes.py # Enterprise integrations
│   └── visualization_routes.py # Interactive visualizations
├── rag/ # RAG & vector database
│   ├── vector_db_manager.py # ChromaDB persistence
│   ├── incident_embeddings.py # Sentence-BERT embeddings
│   ├── similarity_search.py # Semantic search
│   └── rag_enhancer.py # LLM prompt enhancement
├── integrations/ # Enterprise connectors
│   ├── slack.py # Slack integration
│   ├── jira.py # JIRA integration
│   ├── pagerduty.py # PagerDuty integration
│   └── integration_manager.py # Unified manager
├── visualization/ # Interactive charts & graphs
│   ├── root_cause_graph.py # Dependency graphs
│   ├── timeline_chart.py # Incident timelines
│   └── metrics_dashboard.py # Metrics visualization
├── cli/ # Command-line interface
├── config/ # Configuration management
├── examples/ # Usage examples
├── tests/ # Test suite (80%+ coverage)
├── docs/ # Documentation
└── docker/ # Docker & Kubernetes configs
# Run all tests
pytest
# With coverage
pytest --cov=. --cov-report=html
# Specific test file
pytest tests/unit/test_async_orchestrator.py
# Integration tests
pytest tests/integration/
# Format code
black .
# Sort imports
isort .
# Lint
flake8 .
# Type checking
mypy .
# Start complete stack
docker-compose up -d
# Services included:
# - API server (port 8000)
# - Redis cache (port 6379)
# - Prometheus metrics (port 9090)
# - Grafana dashboards (port 3000)
# - ChromaDB vector database (embedded)
# Deploy to Kubernetes
kubectl apply -f k8s/
# Includes:
# - API deployment (3 replicas)
# - Redis StatefulSet
# - Prometheus monitoring
# - Ingress configuration
- AWS: ECS Fargate + ElastiCache + RDS
- GCP: Cloud Run + Memorystore + Cloud SQL
- Azure: Container Apps + Redis + PostgreSQL
See deployment guide for detailed instructions.
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
- Fork the repository
- Create feature branch (git checkout -b feature/amazing-feature)
- Commit changes (git commit -m 'Add amazing feature')
- Push to branch (git push origin feature/amazing-feature)
- Open Pull Request

To add a new agent:
- Extend BaseAgent or AsyncBaseAgent
- Implement execute() or execute_async() method
- Create prompt template in prompts/
- Add tests in tests/unit/
- Update documentation
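The steps for adding an agent can be sketched as follows. The real `AsyncBaseAgent` lives in the project and its interface is not shown in this README, so the example defines a stand-in base class with a single abstract `execute_async()` method. `DiskUsageAgent`, its threshold, and the `disk_usage_percent` input key are all hypothetical.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict


class AsyncBaseAgent(ABC):
    """Stand-in for the project's base class; the real interface may differ."""

    name = "base"

    @abstractmethod
    async def execute_async(self, incident_data: Dict[str, Any]) -> Dict[str, Any]:
        """Analyze the incident and return structured findings."""


class DiskUsageAgent(AsyncBaseAgent):
    """Hypothetical agent that flags hosts whose disk usage exceeds a threshold."""

    name = "disk_usage"

    def __init__(self, threshold_percent: float = 90.0) -> None:
        self.threshold_percent = threshold_percent

    async def execute_async(self, incident_data: Dict[str, Any]) -> Dict[str, Any]:
        usage = incident_data.get("disk_usage_percent", {})
        # Keep only hosts at or above the threshold, in a stable order.
        findings = [
            {"host": host, "usage_percent": pct}
            for host, pct in sorted(usage.items())
            if pct >= self.threshold_percent
        ]
        return {"agent": self.name, "findings": findings}
```

A unit test in `tests/unit/` could then run the agent with `asyncio.run(DiskUsageAgent().execute_async(sample_incident))` and assert on the returned findings.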
MIT License - see LICENSE for details.
If you find ADAPT-Agents useful, please consider starring the repository!
- Documentation: docs/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Discord: Join our community
Built with:
- FastAPI - Modern web framework
- ChromaDB - Vector database
- sentence-transformers - Embeddings
- NetworkX - Graph analysis
- Pydantic - Data validation
- OpenAI / Anthropic - LLM providers
Upcoming in v3.6:
- ML-based anomaly detection with Prophet/ARIMA
- Predictive analytics for incident prevention
- Multi-tenancy support
- Mobile-friendly dashboards
- Bi-directional integration sync
See ROADMAP.md for the complete roadmap.
| Feature | DataDog | ADAPT-Agents |
|---|---|---|
| RCA Analysis | โ | โ |
| Real-Time Streaming | โ | โ |
| AI/ML Learning | Limited | RAG + ChromaDB |
| Integrations | 500+ | Slack/JIRA/PD + extensible |
| Cost | $15-31/host/mo | FREE |
| Self-Hosted | โ | โ |
| Feature | Grafana | ADAPT-Agents |
|---|---|---|
| Dashboards | โ | โ |
| Alerting | โ | โ via integrations |
| RCA Automation | Plugins | โ Native |
| AI-Powered | โ | โ |
| Cost | Free | FREE |
| Feature | PagerDuty | ADAPT-Agents |
|---|---|---|
| Incident Management | โ | โ via integrations |
| RCA Automation | AIOps | โ Native + RAG |
| Cost | $21-51/user/mo | FREE |
| Customizable | Limited | Full control |
Winner: ADAPT-Agents offers enterprise features at zero cost with unique AI learning capabilities.
Built with ❤️ by the open-source community. Join us in revolutionizing incident management!