A reference architecture for an Agentic RAG project. It separates:
- domain – pure, testable functions (business logic)
- ports – small interfaces/abstractions the domain depends on
- adapters – concrete implementations (LangChain, FAISS, CrewAI, etc.)
- app – orchestration/use-cases
- container – a single composition root for dependency injection (DI)
This keeps the core independent from frameworks and vendors, improves testability, and makes swapping LLMs/vector stores/tooling trivial.
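The domain/ports/adapters split above can be sketched with a tiny, self-contained example. The names below (`Retriever`, `answer_context`, `InMemoryRetriever`) are illustrative, not the project's actual classes — the real interfaces live under `src/agenticrag/ports/`:

```python
# Minimal sketch of the ports/adapters split (hypothetical names).
from typing import Protocol


class Retriever(Protocol):
    """Port: the abstraction the domain depends on."""
    def search(self, query: str, k: int) -> list[str]: ...


def answer_context(query: str, retriever: Retriever, k: int = 2) -> list[str]:
    """Domain: pure logic that knows only the port, never a concrete vendor."""
    return retriever.search(query, k)


class InMemoryRetriever:
    """Adapter: a concrete implementation (a FAISS/LangChain adapter has the same shape)."""
    def __init__(self, docs: list[str]) -> None:
        self._docs = docs

    def search(self, query: str, k: int) -> list[str]:
        # Naive keyword match stands in for vector similarity.
        hits = [d for d in self._docs if query.lower() in d.lower()]
        return hits[:k]


retriever = InMemoryRetriever([
    "RAG combines retrieval and generation.",
    "FAISS is a vector store.",
])
print(answer_context("RAG", retriever))  # the domain never imports the adapter
```

Swapping FAISS for PGVector means writing a new adapter with the same `search` signature; `answer_context` is untouched.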
- Python 3.10–3.12
- macOS/Linux/WSL recommended
- (Optional) VS Code + the Python extension if you like running notebooks or the debugger
```bash
# 1) Create and activate a virtual environment
python -m venv .venv && source .venv/bin/activate   # or use conda/mamba

# 2) Install the project in editable mode + dev tools
pip install -e ".[dev]"
```

Create a `.env` at the repo root (development convenience). In production/staging, set env vars in the runtime (12‑Factor).
```ini
GROQ_API_KEY=sk-...
GROQ_MODEL=llama-3.3-70b-versatile
GOOGLE_API_KEY=...
GEMINI_MODEL=gemini-2.0-flash
SERPER_API_KEY=...
```

The code reads env via `pydantic-settings` (v2). The values aren't exported back to the OS environment; providers that require real env vars are mirrored by the container.
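To make the "mirrored by the container" point concrete, here is a standard-library-only sketch of the idea (the project itself loads settings with `pydantic-settings`; `mirror_keys` and the dict below are illustrative, not the project's API):

```python
# Illustrative sketch: a composition root can copy selected settings into
# os.environ for SDKs that only read real environment variables.
import os


def mirror_keys(settings: dict[str, str], required: tuple[str, ...]) -> None:
    """Copy selected settings into os.environ without overwriting existing vars."""
    for key in required:
        if key in settings and key not in os.environ:
            os.environ[key] = settings[key]


settings = {"GROQ_API_KEY": "sk-...", "GROQ_MODEL": "llama-3.3-70b-versatile"}
mirror_keys(settings, required=("GROQ_API_KEY",))
```

This keeps `.env` values out of the OS environment by default while still satisfying SDKs that insist on `os.environ`.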
There’s a ready‑to‑run Jupyter demo:
`demo_notebook.ipynb`
It demonstrates:
- building the container
- ingesting a local PDF
- query rewriting for retrieval
- similarity‑thresholded FAISS retrieval
- judge + web fallback
- printing Answer and Sources
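The steps the notebook walks through can be sketched as one control-flow function. Everything here is a stand-in with hypothetical names — the real pipeline wires LLM, FAISS, and CrewAI adapters through the container rather than taking plain callables:

```python
# Sketch of the demo's control flow with injected stand-in callables.
from typing import Callable


def answer(
    question: str,
    rewrite: Callable[[str], str],                        # query rewriting for retrieval
    retrieve: Callable[[str], list[tuple[str, float]]],   # (chunk, distance) pairs
    judge: Callable[[str, list[str]], bool],              # LLM judge: chunks sufficient?
    web_fallback: Callable[[str], str],
    generate: Callable[[str, list[str]], str],
    max_distance: float = 0.8,                            # FAISS distance: lower is better
) -> str:
    q = rewrite(question)
    chunks = [c for c, d in retrieve(q) if d <= max_distance]
    if chunks and judge(question, chunks):
        return generate(question, chunks)          # local index was relevant
    return generate(question, [web_fallback(question)])  # fall back to the web
```

With stub lambdas for each step, a close local match flows through `generate` directly, while an empty or rejected retrieval takes the `web_fallback` branch.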
Run a question through the pipeline (uses local index when relevant, otherwise web fallback):
```bash
python -m cli.rag "What is Agentic RAG?"
```

(Outputs the final answer and, when web fallback is used, a Sources list.)
```python
from agenticrag.container import build_container
from agenticrag.app.build_index import ingest_pdfs_to_index

c = build_container()
indexed = ingest_pdfs_to_index([
    "/absolute/path/to/your.pdf",
], c["vs"])
print("chunks indexed:", indexed)
```

If you don't want to re‑ingest every run, persist the vector store:
```python
# after ingest
a = c["vs"]._vs
if a is not None:
    a.save_local("var/faiss")

# later, to load it (requires the same embeddings object)
from langchain_community.vectorstores import FAISS

emb = c["vs"]._emb.lc  # the underlying LangChain Embeddings
c["vs"]._vs = FAISS.load_local(
    "var/faiss", embeddings=emb, allow_dangerous_deserialization=True
)
```

The project uses `FAISS.from_texts(..., embedding=<Embeddings>)` and `add_texts(...)`; direct `FAISS(...)` construction expects low‑level internals and is not used.
```
src/agenticrag/
  domain/       # pure, testable functions
  ports/        # tiny interfaces (abstractions) used by domain
  adapters/     # concrete tech (LangChain, FAISS, CrewAI, etc.)
  app/          # orchestrated use-cases (RAG flow, ingestion)
  container.py  # dependency wiring (composition root)
src/cli/rag.py  # CLI entrypoint (prints Answer + Sources)
```
- Hexagonal (Ports & Adapters): core depends on abstractions, not frameworks. Adapters plug concrete tech. Swap Groq→Gemini or FAISS→PGVector without touching domain.
- SOLID + Functional: small pure functions inside domain; interfaces at boundaries; single composition root for DI.
- 12‑Factor Config: configuration via env vars; `.env` only for local dev; no secrets in code.
- Composability: domain functions are easy to wrap into LCEL/Runnables later (parallel branches, streaming, async) without refactoring core logic.
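The composability claim rests on domain functions being plain callables. As a hedged illustration (the names below are made up; LCEL wrapping itself is not shown), plain function composition already works today, and the same functions can later be dropped into Runnables unchanged:

```python
# Because domain functions are framework-free callables, they compose with
# nothing but functools.reduce; wrapping them in LCEL Runnables later does
# not require touching their bodies. Names here are illustrative.
from functools import reduce
from typing import Callable


def pipe(*steps: Callable[[str], str]) -> Callable[[str], str]:
    """Left-to-right composition of single-argument functions."""
    return lambda x: reduce(lambda acc, f: f(acc), steps, x)


normalize = str.strip
lowercase = str.lower
rewrite = pipe(normalize, lowercase)

print(rewrite("  What Is Agentic RAG?  "))  # prints: what is agentic rag?
```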
- Missing API keys: Ensure `.env` exists at the repo root and contains the values above. When running from a terminal, only OS env vars apply; our container mirrors keys into `os.environ` for providers that require it.
- "ModuleNotFoundError" after install: Verify you're using the repo's `.venv` interpreter and that you installed with `pip install -e ".[dev]"`.
- CrewAI tools: the signatures used here are `SerperDevTool.run(search_query=...)` and `ScrapeWebsiteTool(website_url=...).run()`.
- FAISS scores: `similarity_search_with_score` returns distances (lower is better). The app uses a similarity threshold and an LLM "judge" before web fallback.
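The lower-is-better point trips people up, so here is the thresholding idea in isolation (the function name and the distance values are made up for illustration):

```python
# FAISS's similarity_search_with_score returns (doc, distance) pairs where a
# LOWER distance means MORE similar, so the threshold keeps dist <= cutoff,
# not dist >= cutoff as a similarity score would.
def accept(pairs: list[tuple[str, float]], cutoff: float) -> list[str]:
    return [doc for doc, dist in pairs if dist <= cutoff]


pairs = [("chunk A", 0.21), ("chunk B", 0.95)]
print(accept(pairs, cutoff=0.5))  # prints: ['chunk A']
```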
```bash
pytest -q
```

MIT (or your preferred license). Update as needed.