A minimal Retrieval-Augmented Generation (RAG) system built with Django + Qdrant + Gemini + Cohere.
This app lets users upload documents (PDF/DOCX) or paste text, indexes them into Qdrant with embeddings, and provides a chat-like interface to query across documents with reranking and grounding verification.
Hosted at: Render
- First screen loads without console errors.
- You can upload files, ask questions, and see retrieved chunks, reranked results, and grounded answers with citations.
Flow:
- Upload Text/File → Extract text → Chunk into 800-token chunks with 80-token overlap (see the indexing sketch after this list).
- Embeddings → Generated using Google Gemini (768-dim vectors).
- Vector DB → Stored in Qdrant with COSINE similarity.
- Retriever → Top-10 chunks retrieved.
- Reranker → Cohere Rerank API trims to top-3.
- LLM Answer → Gemini generates grounded response.
- Verification → Grounding check to reject hallucinations.
- UI → Tailwind + minimal JS.
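
In code terms, the indexing half of this flow looks roughly like the sketch below. It is a minimal illustration rather than the app's exact implementation: chunking splits on whitespace as a stand-in for real tokenization, `index_document` is a hypothetical helper, and the embedding model name (`models/text-embedding-004`, which returns 768-dim vectors) is an assumption since only "Gemini" is specified. It assumes recent `qdrant-client` and `google-generativeai` packages.

```python
import os
import uuid

import google.generativeai as genai
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
qdrant = QdrantClient(url=os.environ["QDRANT_URL"], api_key=os.environ["QDRANT_API_KEY"])
COLLECTION = os.environ.get("QDRANT_COLLECTION", "rag_demo")


def chunk_text(text: str, size: int = 800, overlap: int = 80) -> list[str]:
    """Naive whitespace chunking: 800-token chunks with 80-token overlap."""
    tokens = text.split()
    chunks, start = [], 0
    while start < len(tokens):
        chunks.append(" ".join(tokens[start:start + size]))
        start += size - overlap
    return chunks


def index_document(doc_id: str, text: str) -> None:
    """Hypothetical helper: embed each chunk with Gemini and upsert into Qdrant."""
    # Create the collection once, with 768-dim vectors and cosine distance.
    if not qdrant.collection_exists(COLLECTION):
        qdrant.create_collection(
            collection_name=COLLECTION,
            vectors_config=VectorParams(size=768, distance=Distance.COSINE),
        )
    points = []
    for i, chunk in enumerate(chunk_text(text)):
        emb = genai.embed_content(
            model="models/text-embedding-004",  # assumed Gemini embedding model (768-dim)
            content=chunk,
            task_type="retrieval_document",
        )["embedding"]
        points.append(PointStruct(
            id=str(uuid.uuid4()),
            vector=emb,
            payload={"doc_id": doc_id, "chunk_index": i, "text": chunk},
        ))
    qdrant.upsert(collection_name=COLLECTION, points=points)
```

The upload view would then call something like `index_document(doc_id, extracted_text)` once text extraction finishes.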
- Chunk size: 800 tokens
- Chunk overlap: 80 tokens
- Retriever: Qdrant, top_k = 10
- Reranker: Cohere, top_k = 3 (see the query-time sketch after this list)
- LLM provider: Gemini (for embeddings + generation)
- Vector dimensions: 768
- Distance metric: Cosine
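
At query time these settings plug together roughly as below: embed the question, pull the top-10 candidates from Qdrant, rerank down to 3 with Cohere, answer with Gemini, then run a grounding check before returning. This is a sketch only: `answer` is a hypothetical helper, the model names (`rerank-english-v3.0`, `gemini-1.5-flash`) are assumptions, and the YES/NO verification prompt is a simplified stand-in for the app's grounding step.

```python
import os

import cohere
import google.generativeai as genai
from qdrant_client import QdrantClient

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
co = cohere.Client(os.environ["COHERE_API_KEY"])
qdrant = QdrantClient(url=os.environ["QDRANT_URL"], api_key=os.environ["QDRANT_API_KEY"])
COLLECTION = os.environ.get("QDRANT_COLLECTION", "rag_demo")


def answer(question: str) -> str:
    """Hypothetical helper: retrieve, rerank, generate, and verify grounding."""
    # 1. Embed the query (model name is an assumption).
    q_emb = genai.embed_content(
        model="models/text-embedding-004",
        content=question,
        task_type="retrieval_query",
    )["embedding"]

    # 2. Retrieve the top-10 candidate chunks from Qdrant.
    hits = qdrant.search(collection_name=COLLECTION, query_vector=q_emb, limit=10)
    if not hits:
        return "No documents indexed yet."
    candidates = [h.payload["text"] for h in hits]

    # 3. Rerank with Cohere and keep the top-3.
    reranked = co.rerank(
        model="rerank-english-v3.0",  # assumed rerank model
        query=question,
        documents=candidates,
        top_n=3,
    )
    context = [candidates[r.index] for r in reranked.results]

    # 4. Ask Gemini for an answer restricted to the retrieved context, with citations.
    model = genai.GenerativeModel("gemini-1.5-flash")  # assumed generation model
    prompt = (
        "Answer strictly from the numbered context below and cite chunk numbers.\n\n"
        + "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(context))
        + f"\n\nQuestion: {question}"
    )
    draft = model.generate_content(prompt).text

    # 5. Grounding check: reject the draft if the model cannot confirm it is supported.
    verdict = model.generate_content(
        "Context:\n" + "\n\n".join(context)
        + f"\n\nAnswer:\n{draft}\n\n"
        + "Is every claim in the answer supported by the context? Reply YES or NO."
    ).text.strip().upper()
    return draft if verdict.startswith("YES") else (
        "I could not find a grounded answer in the uploaded documents."
    )
```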
```bash
git clone https://github.com/<your-username>/<your-repo>.git
cd <your-repo>
pip install -r requirements.txt
```

Copy `.env.example` to `.env` and set your keys:

```
QDRANT_URL=...
QDRANT_API_KEY=...
QDRANT_COLLECTION=rag_demo
GEMINI_API_KEY=...
COHERE_API_KEY=...
DJANGO_SECRET_KEY=...
```

Run the dev server and visit http://localhost:8000:

```bash
python manage.py runserver
```
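
Inside Django, these values might be loaded in `settings.py`. The snippet below is a sketch only and assumes `python-dotenv` is installed; the actual project may read the environment differently.

```python
import os
from pathlib import Path

from dotenv import load_dotenv

BASE_DIR = Path(__file__).resolve().parent.parent
load_dotenv(BASE_DIR / ".env")  # populate os.environ from the .env file

SECRET_KEY = os.environ["DJANGO_SECRET_KEY"]
QDRANT_URL = os.environ["QDRANT_URL"]
QDRANT_API_KEY = os.environ["QDRANT_API_KEY"]
QDRANT_COLLECTION = os.environ.get("QDRANT_COLLECTION", "rag_demo")
GEMINI_API_KEY = os.environ["GEMINI_API_KEY"]
COHERE_API_KEY = os.environ["COHERE_API_KEY"]
```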
- Upload the repo to PythonAnywhere.
- Set the env vars in the Web > WSGI config file (see the sketch below).
- Run migrations and reload the app.
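
A minimal sketch of that WSGI config file follows. The project path and the `<your_project>.settings` module are placeholders, and the key values stay whatever you put in `.env`.

```python
import os
import sys

# Make the project importable (placeholder path).
path = '/home/<your-username>/<your-repo>'
if path not in sys.path:
    sys.path.insert(0, path)

os.environ['DJANGO_SETTINGS_MODULE'] = '<your_project>.settings'

# Same keys as in .env.
os.environ['QDRANT_URL'] = '...'
os.environ['QDRANT_API_KEY'] = '...'
os.environ['QDRANT_COLLECTION'] = 'rag_demo'
os.environ['GEMINI_API_KEY'] = '...'
os.environ['COHERE_API_KEY'] = '...'
os.environ['DJANGO_SECRET_KEY'] = '...'

from django.core.wsgi import get_wsgi_application
application = get_wsgi_application()
```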
- Live Demo: Python Demo Link
- GitHub Repo: Repo Link
- Resume: Google Drive Link
