🚀 Production-grade multi-agent AI system for incident analysis, root cause detection, and knowledge extraction across internal engineering data
- 🤖 Multi-Agent Reasoning System using LangGraph
- 🔍 Hybrid Retrieval (ChromaDB + Web Search)
- 🎯 Cross-Encoder Reranking for high precision
- 🧠 Context-Aware Retrieval using ParentDocumentRetriever
- 🔁 Validation Layer to reduce hallucinations
- ⚡ Async FastAPI Backend with real-time status tracking
Engineering teams struggle with:
- Scattered incident data (Jira, Slack, Docs)
- Slow root cause analysis
- Lack of contextual reasoning
👉 MEKA solves this by:
- Aggregating knowledge sources
- Running multi-agent reasoning workflows
- Producing structured, validated answers
User Query
↓
Planner Agent (Query Decomposition)
↓
Retriever Agent
├── Vector DB (ChromaDB)
└── Web Search (DuckDuckGo)
↓
Reranker Agent (Cross-Encoder)
↓
Summarizer Agent (LLM - Ollama)
↓
Validator Agent (Hallucination Check)
↓
Final Answer + Reasoning Trace
| Agent | Responsibility |
|---|---|
| Planner | Query decomposition + routing |
| Retriever | Hybrid search (vector + web) |
| Reranker | Context scoring using cross-encoder |
| Summarizer | Generate final answer |
| Validator | Ensure factual alignment |
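The flow above can be sketched as a chain of functions that pass a shared state dictionary from one agent to the next. This is a minimal illustration in plain Python, not the project's LangGraph graph: every function body, the `State` alias, and the `PIPELINE` list are hypothetical stand-ins chosen only to show how the five roles hand off work and how a reasoning trace can accumulate.

```python
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]  # hypothetical shared state passed between agents


def planner(state: State) -> State:
    # Stand-in decomposition: split the query on " and " into sub-queries.
    parts = state["query"].split(" and ")
    state["sub_queries"] = [p.strip() for p in parts if p.strip()]
    return state


def retriever(state: State) -> State:
    # Stand-in for hybrid search: one fake document per sub-query.
    state["documents"] = [f"doc about {q}" for q in state["sub_queries"]]
    return state


def reranker(state: State) -> State:
    # Stand-in for cross-encoder scoring: order documents by length.
    state["documents"] = sorted(state["documents"], key=len)
    return state


def summarizer(state: State) -> State:
    # Stand-in for the LLM call: join the documents into one answer.
    state["answer"] = " | ".join(state["documents"])
    return state


def validator(state: State) -> State:
    # Stand-in grounding check: every document must appear in the answer.
    state["validated"] = all(d in state["answer"] for d in state["documents"])
    return state


PIPELINE: List[Tuple[str, Callable[[State], State]]] = [
    ("planner", planner),
    ("retriever", retriever),
    ("reranker", reranker),
    ("summarizer", summarizer),
    ("validator", validator),
]


def run_pipeline(query: str) -> State:
    state: State = {"query": query, "trace": []}
    for name, agent in PIPELINE:
        state = agent(state)
        state["trace"].append(name)  # record which agent ran, in order
    return state


result = run_pipeline("Kubernetes outages and OAuth token leakage")
print(result["trace"])
print(result["answer"])
```

Keeping each agent as a function over one state object makes it straightforward to map each step onto a graph node later, and the `trace` list is one simple way to expose the execution order to a client.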
- Vector DB: ChromaDB (local persistence)
- Embeddings: BAAI/bge-small-en-v1.5
- Retriever: ParentDocumentRetriever (deep context)
- Web Search: DuckDuckGo integration
👉 Combines internal + external knowledge
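As a rough sketch of how internal and external results might be combined, the example below merges hits from two search callables and drops near-identical duplicates. The `Hit` class, `hybrid_retrieve`, and both fake search functions are assumptions made for illustration; they do not call ChromaDB or DuckDuckGo.

```python
from dataclasses import dataclass
from typing import Callable, List

SearchFn = Callable[[str, int], List[str]]  # (query, k) -> passages


@dataclass(frozen=True)
class Hit:
    text: str
    source: str  # "vector" or "web"


def hybrid_retrieve(query: str, vector_search: SearchFn,
                    web_search: SearchFn, k: int = 3) -> List[Hit]:
    # Internal results come first so they win when a duplicate appears.
    hits = [Hit(t, "vector") for t in vector_search(query, k)]
    hits += [Hit(t, "web") for t in web_search(query, k)]

    seen = set()
    unique: List[Hit] = []
    for hit in hits:
        key = hit.text.strip().lower()  # case-insensitive duplicate check
        if key not in seen:
            seen.add(key)
            unique.append(hit)
    return unique


# Hypothetical stand-ins for the real vector store and web search.
def fake_vector_search(query: str, k: int) -> List[str]:
    return ["Pod crash loop caused by bad ConfigMap",
            "OAuth token rotated late"][:k]


def fake_web_search(query: str, k: int) -> List[str]:
    return ["pod crash loop caused by bad configmap",
            "Kubernetes misconfiguration guide"][:k]


merged = hybrid_retrieve("kubernetes misconfiguration",
                         fake_vector_search, fake_web_search)
for hit in merged:
    print(hit.source, "->", hit.text)
```

Tagging each hit with its source keeps provenance available to later stages, for example if the validator should trust internal documents more than web results.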
- Cross-Encoder (ms-marco-MiniLM-L-6-v2)
  - Rescores retrieved candidates to improve top-k precision
- Validation Layer
  - Detects hallucination risk
  - Ensures the answer is grounded in the retrieved context
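The two steps above can be illustrated with a small sketch: rerank candidates by a pairwise score, then check how much of the answer is supported by the kept context. The token-overlap scorer is a deliberately simple placeholder and an assumption of this example; in a real setup `score_pair` would wrap the cross-encoder model, and the `threshold` value is arbitrary.

```python
import re
from typing import Callable, List, Sequence, Set


def tokens(text: str) -> Set[str]:
    # Lowercased alphanumeric tokens.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query: str, passage: str) -> float:
    # Placeholder scorer: fraction of query tokens found in the passage.
    q = tokens(query)
    return len(q & tokens(passage)) / len(q) if q else 0.0


def rerank(query: str, passages: Sequence[str],
           score_pair: Callable[[str, str], float] = overlap_score,
           top_k: int = 2) -> List[str]:
    # Score every (query, passage) pair, then keep the best top_k.
    scored = [(score_pair(query, p), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]


def is_grounded(answer: str, context: Sequence[str],
                threshold: float = 0.8) -> bool:
    # Grounded if enough of the answer's tokens occur in the context.
    answer_tokens = tokens(answer)
    if not answer_tokens:
        return False
    context_tokens: Set[str] = set()
    for passage in context:
        context_tokens |= tokens(passage)
    return len(answer_tokens & context_tokens) / len(answer_tokens) >= threshold


candidates = [
    "Slack thread about lunch plans",
    "Kubernetes outage traced to ingress misconfiguration",
    "Outage postmortem for the payments database",
]
top = rerank("kubernetes outage misconfiguration", candidates)
print(top)
print(is_grounded("Outage traced to ingress misconfiguration", top))
print(is_grounded("The outage was caused by an expired TLS certificate", top))
```

A lexical check like this only catches answers that drift far from the retrieved text; a stricter validator would need a model-based comparison.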
| Layer | Technology |
|---|---|
| Orchestration | LangGraph |
| Backend | FastAPI (Async) |
| Frontend | React + Vite |
| Vector DB | ChromaDB |
| LLM | Ollama (Llama3 / Mistral) |
| Embeddings | BGE-small |
| Search | DuckDuckGo |
- Submit query asynchronously
- Track agent execution + reasoning trace
- Retrieve past queries
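The three operations above could be backed by something like the following sketch, which uses only `asyncio` and an in-memory dictionary. The function names (`submit_query`, `get_status`, `list_history`), the `JOBS` store, and the stubbed agent loop are all hypothetical; in the actual backend these would sit behind FastAPI routes, which are not shown here.

```python
import asyncio
import uuid
from typing import Dict, List

JOBS: Dict[str, dict] = {}  # hypothetical in-memory job store


async def run_agents(job_id: str, query: str) -> None:
    job = JOBS[job_id]
    job["status"] = "running"
    for agent in ("planner", "retriever", "reranker", "summarizer", "validator"):
        await asyncio.sleep(0)  # placeholder for the real agent work
        job["trace"].append(agent)
    job["answer"] = f"(stub answer for: {query})"
    job["status"] = "done"


async def submit_query(query: str) -> str:
    # Register the job and start processing in the background.
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"query": query, "status": "queued",
                    "trace": [], "answer": None}
    # Keep a reference to the task so it is not garbage collected.
    JOBS[job_id]["task"] = asyncio.create_task(run_agents(job_id, query))
    return job_id


def get_status(job_id: str) -> dict:
    job = JOBS[job_id]
    return {"status": job["status"], "trace": list(job["trace"])}


def list_history() -> List[str]:
    return [job["query"] for job in JOBS.values()]


async def demo():
    job_id = await submit_query("Identify incidents related to OAuth token leakage")
    early = get_status(job_id)      # polled right after submission
    await JOBS[job_id]["task"]      # wait for the background work
    return early, get_status(job_id)


early, final = asyncio.run(demo())
print(early["status"], "->", final["status"])
print(final["trace"])
print(list_history())
```

Returning a job id immediately and letting the client poll for status is one common way to keep long-running agent workflows from blocking the request.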
- ChromaDB
  - Lightweight, no external infra required
- ParentDocumentRetriever
  - Preserves long-context understanding
- Cross-Encoder
  - Higher precision vs bi-encoder
- Local LLM (Ollama)
  - Privacy-focused
  - No external API dependency
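To make the parent-document idea concrete, here is a small sketch of the general technique: match the query against small child chunks, but return the full parent document each chunk came from. It is a plain-Python illustration and does not use the actual ParentDocumentRetriever class; the word-overlap matching, the chunk size, and the `INC-1` / `INC-2` identifiers are assumptions for this example.

```python
from typing import Dict, List, Tuple

Chunk = Tuple[str, str]  # (parent_id, child_text)


def split_into_children(parent_id: str, text: str, size: int = 8) -> List[Chunk]:
    # Cut the parent into consecutive chunks of `size` words.
    words = text.split()
    return [(parent_id, " ".join(words[i:i + size]))
            for i in range(0, len(words), size)]


def build_index(parents: Dict[str, str], size: int = 8) -> List[Chunk]:
    index: List[Chunk] = []
    for parent_id, text in parents.items():
        index.extend(split_into_children(parent_id, text, size))
    return index


def retrieve_parents(query: str, index: List[Chunk],
                     parents: Dict[str, str], k: int = 1) -> List[str]:
    # Rank child chunks by word overlap with the query (a stand-in
    # for embedding similarity), then map the best chunks to parents.
    q = set(query.lower().split())
    ranked = sorted(index,
                    key=lambda item: len(q & set(item[1].lower().split())),
                    reverse=True)
    parent_ids: List[str] = []
    for parent_id, _ in ranked:
        if parent_id not in parent_ids:
            parent_ids.append(parent_id)
        if len(parent_ids) == k:
            break
    return [parents[pid] for pid in parent_ids]


# Hypothetical incident documents.
parents = {
    "INC-1": ("The checkout service returned errors for an hour. "
              "Root cause was an expired OAuth token in the gateway config. "
              "The token was rotated and alerts were added."),
    "INC-2": ("A Kubernetes node pool ran out of memory. "
              "Pods were evicted until the autoscaler limit was raised."),
}
index = build_index(parents)
print(retrieve_parents("expired oauth token", index, parents))
```

Small chunks tend to match a query more sharply, while returning the parent gives the summarizer the surrounding context, which is the trade the design decision above refers to.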
- Higher latency due to reranking
- Local LLM performance depends on available hardware
- Sequential scoring impacts speed
👉 Optimized for accuracy over speed
- Generated synthetic data:
  - Jira incidents
  - Slack threads
  - Confluence docs
- Example queries:
  - Summarize all Kubernetes outages caused by misconfiguration
  - Identify incidents related to OAuth token leakage
backend/
frontend/
agents/
scripts/
Raj Kalash Tiwari
GitHub: https://github.com/rjkalash

- ✅ Advanced multi-agent RAG system
- ⚡ Designed for enterprise knowledge systems

⭐ Star this repo if you found it useful!