A RAG (Retrieval-Augmented Generation) system designed to help HR professionals analyze CVs and answer questions about candidates efficiently.
This project was created to dive deeper into RAG architecture and its practical applications in document analysis.
The original concept was to build an HR assistant using:
- Streamlit for the user interface
- LangChain for orchestrating the RAG pipeline
- ChromaDB as the vector database
- Ollama for local LLM deployment
- PDF processing to extract and analyze CV content
The system would allow HR professionals to upload CV PDFs and ask natural language questions about candidates.
Initial stack: Streamlit + LangChain + ChromaDB + Ollama (Gemma 2B)
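As a rough illustration of how the first iteration talked to a local Gemma 2B model: Ollama exposes a small REST API on `localhost:11434`, and the app can POST a prompt assembled from retrieved CV text. This is a minimal sketch, not the project's actual code; the prompt layout and helper names are my own.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(question: str, context: str) -> dict:
    # Prompt layout is illustrative; a real app would use a tuned template.
    return {
        "model": "gemma:2b",
        "prompt": f"Context from CV:\n{context}\n\nQuestion: {question}\nAnswer:",
        "stream": False,  # ask for one complete JSON response instead of a stream
    }

def ask_local(question: str, context: str) -> str:
    # Requires a running Ollama server with the model pulled (`ollama pull gemma:2b`).
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_request(question, context)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]
```

Running a 2B-parameter model this way keeps all CV data on the local machine, which is exactly where the hardware limitations below came from.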
Challenges encountered:
- Hardware limitations with local LLM deployment
- Accuracy issues with Gemma 2B model
- Inconsistent responses to HR questions
- Performance bottlenecks in document processing
Revised stack: Streamlit + PyPDF2 + Groq + Qwen
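The move to Groq mostly changes the inference call: Groq serves hosted models behind an OpenAI-compatible chat-completions endpoint, so the app builds a messages payload instead of running a model locally. A hedged sketch follows; the model name and prompt wording are placeholders (check Groq's current model list), not the project's actual configuration.

```python
import json
import urllib.request

GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"  # OpenAI-compatible endpoint

def build_payload(question: str, cv_text: str, model: str = "qwen-2.5-32b") -> dict:
    # Model ID is an assumption; Groq's available models change over time.
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "You are an HR assistant. Answer only from the CV excerpt provided."},
            {"role": "user",
             "content": f"CV excerpt:\n{cv_text}\n\nQuestion: {question}"},
        ],
        "temperature": 0.2,  # low temperature for factual Q&A over documents
    }

def ask_groq(question: str, cv_text: str, api_key: str) -> str:
    # Requires a Groq API key; the request shape mirrors the OpenAI chat API.
    req = urllib.request.Request(
        GROQ_URL,
        data=json.dumps(build_payload(question, cv_text)).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the endpoint is OpenAI-compatible, swapping models is a one-line change to the payload.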
Improvements achieved:
- Better accuracy with Groq's infrastructure
- Faster response times
- More reliable Q&A capabilities
Status: ongoing
Through this project, I gained hands-on understanding of the RAG stack layers:
- Groq: High-performance LLM inference
- AWS/Google Cloud: Scalable infrastructure
- LangSmith: Monitoring and debugging
- Qwen: Advanced reasoning capabilities
- Gemma 2B: Lightweight open model for local inference
- LangChain: RAG orchestration
- LlamaIndex: Document indexing
- Chroma: Open-source vector DB
- Ollama: Local model serving, including embedding models
- OpenAI: High-quality embeddings
- Document Ingestion: PDF processing and chunking strategies
- Embedding Generation: Converting text to vector representations
- Vector Storage: Efficient similarity search and retrieval
- Query Processing: Natural language question understanding
- Context Retrieval: Finding relevant document sections
- Answer Generation: LLM-based response synthesis
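The six steps above can be sketched end to end in plain Python. This toy version stands in for the real components: a bag-of-words counter plays the role of an embedding model, a list plays the role of the vector store, and the final step only assembles the prompt an LLM would receive. All names here are illustrative.

```python
import math
import re
from collections import Counter

def chunk(text: str, size: int = 40) -> list[str]:
    # Ingestion: naive fixed-size chunking by words. Real systems split
    # on structure (sections, sentences) and add overlap.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    # Embedding generation: a toy bag-of-words count vector standing in
    # for a real embedding model (Ollama, OpenAI, etc.).
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Vector storage/search: cosine similarity, the usual retrieval metric.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Query processing + context retrieval: rank chunks against the question.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    # Answer generation: the retrieved chunks become the LLM's context.
    joined = "\n---\n".join(context)
    return f"Answer from the CV excerpts below.\n{joined}\n\nQuestion: {query}"

cv = ("Jane Doe. Senior Python developer with five years of Django "
      "experience. Led a data engineering team. Speaks fluent German.")
top = retrieve("Does the candidate know Python?", chunk(cv, size=8))
```

Swapping the toy `embed` for a real embedding model and the list for Chroma turns this sketch into the actual architecture.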
- Hardware requirements for local vs cloud LLM deployment
- Model selection impact on accuracy and speed
- Chunking strategies for optimal retrieval
- Prompt engineering for HR-specific queries
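Two of these lessons are easy to make concrete. For chunking, a sliding window with overlap keeps facts that straddle a boundary (a date range and the job title it belongs to) inside at least one chunk; for prompt engineering, constraining the model to the retrieved excerpts reduces invented candidate details. Both snippets below are illustrative sketches, with parameter values chosen for demonstration.

```python
def chunk_with_overlap(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    # Each chunk repeats the last `overlap` characters of the previous
    # one, so a fact cut at a chunk boundary still appears whole somewhere.
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

# A hypothetical HR-specific template: scope the model to the retrieved
# excerpts and give it an explicit "don't know" path.
HR_PROMPT = (
    "You are an HR assistant. Answer strictly from the CV excerpts.\n"
    "If the CV does not contain the answer, say so.\n\n"
    "CV excerpts:\n{context}\n\nQuestion: {question}"
)
```

Tuning `size` and `overlap` against real CVs was part of the experimentation; there is no universal best value.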
- Local vs Cloud: Trade-offs between privacy and performance
- Model size: Balance between capability and resource usage
- Vector database: Choosing the right storage solution