A RAG (Retrieval-Augmented Generation) system designed to help HR professionals analyze CVs and answer questions about candidates efficiently.
This project was created to dive deeper into RAG architecture and its practical applications in document analysis.
The original concept was to build an HR assistant using:
- Streamlit for the user interface
- LangChain for orchestrating the RAG pipeline
- ChromaDB as the vector database
- Ollama for local LLM deployment
- PDF processing to extract and analyze CV content
The system would allow HR professionals to upload CV PDFs and ask natural language questions about candidates.
Initial stack: Streamlit + LangChain + ChromaDB + Ollama (Gemma 2B)
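As a rough illustration of how the first iteration talked to a local Gemma 2B model: Ollama exposes a small REST API on `localhost:11434`, and the app can POST a prompt assembled from retrieved CV text. This is a minimal sketch, not the project's actual code; the prompt layout and helper names are my own.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(question: str, context: str) -> dict:
    # Prompt layout is illustrative; a real app would use a tuned template.
    return {
        "model": "gemma:2b",
        "prompt": f"Context from CV:\n{context}\n\nQuestion: {question}\nAnswer:",
        "stream": False,  # ask for one complete JSON response instead of a stream
    }

def ask_local(question: str, context: str) -> str:
    # Requires a running Ollama server with the model pulled (`ollama pull gemma:2b`).
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_request(question, context)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]
```

Running a 2B-parameter model this way keeps all CV data on the local machine, which is exactly where the hardware limitations below came from.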
Challenges encountered:
- Hardware limitations with local LLM deployment
- Accuracy issues with Gemma 2B model
- Inconsistent responses to HR questions
- Performance bottlenecks in document processing
Revised stack: Streamlit + PyPDF2 + Groq + Qwen
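The move to Groq mostly changes the inference call: Groq serves hosted models behind an OpenAI-compatible chat-completions endpoint, so the app builds a messages payload instead of running a model locally. A hedged sketch follows; the model name and prompt wording are placeholders (check Groq's current model list), not the project's actual configuration.

```python
import json
import urllib.request

GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"  # OpenAI-compatible endpoint

def build_payload(question: str, cv_text: str, model: str = "qwen-2.5-32b") -> dict:
    # Model ID is an assumption; Groq's available models change over time.
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "You are an HR assistant. Answer only from the CV excerpt provided."},
            {"role": "user",
             "content": f"CV excerpt:\n{cv_text}\n\nQuestion: {question}"},
        ],
        "temperature": 0.2,  # low temperature for factual Q&A over documents
    }

def ask_groq(question: str, cv_text: str, api_key: str) -> str:
    # Requires a Groq API key; the request shape mirrors the OpenAI chat API.
    req = urllib.request.Request(
        GROQ_URL,
        data=json.dumps(build_payload(question, cv_text)).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the endpoint is OpenAI-compatible, swapping models is a one-line change to the payload.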
Improvements achieved:
- Better accuracy with Groq's infrastructure
- Faster response times
- More reliable Q&A capabilities
Status: ongoing
Through this project, I gained hands-on understanding of the RAG stack layers:
- Groq: High-performance LLM inference
- AWS/Google Cloud: Scalable infrastructure
- LangSmith: Monitoring and debugging
- Qwen: Advanced reasoning capabilities
- Gemma 2B: Lightweight open model for local inference
- LangChain: RAG orchestration
- LlamaIndex: Document indexing
- Chroma: Open-source vector DB
- Ollama: Local model serving, including embedding models
- OpenAI: High-quality embeddings
- Document Ingestion: PDF processing and chunking strategies
- Embedding Generation: Converting text to vector representations
- Vector Storage: Efficient similarity search and retrieval
- Query Processing: Natural language question understanding
- Context Retrieval: Finding relevant document sections
- Answer Generation: LLM-based response synthesis
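The six steps above can be sketched end to end in plain Python. This toy version stands in for the real components: a bag-of-words counter plays the role of an embedding model, a list plays the role of the vector store, and the final step only assembles the prompt an LLM would receive. All names here are illustrative.

```python
import math
import re
from collections import Counter

def chunk(text: str, size: int = 40) -> list[str]:
    # Ingestion: naive fixed-size chunking by words. Real systems split
    # on structure (sections, sentences) and add overlap.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    # Embedding generation: a toy bag-of-words count vector standing in
    # for a real embedding model (Ollama, OpenAI, etc.).
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Vector storage/search: cosine similarity, the usual retrieval metric.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Query processing + context retrieval: rank chunks against the question.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    # Answer generation: the retrieved chunks become the LLM's context.
    joined = "\n---\n".join(context)
    return f"Answer from the CV excerpts below.\n{joined}\n\nQuestion: {query}"

cv = ("Jane Doe. Senior Python developer with five years of Django "
      "experience. Led a data engineering team. Speaks fluent German.")
top = retrieve("Does the candidate know Python?", chunk(cv, size=8))
```

Swapping the toy `embed` for a real embedding model and the list for Chroma turns this sketch into the actual architecture.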
- Hardware requirements for local vs cloud LLM deployment
- Model selection impact on accuracy and speed
- Chunking strategies for optimal retrieval
- Prompt engineering for HR-specific queries
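Two of these lessons are easy to make concrete. For chunking, a sliding window with overlap keeps facts that straddle a boundary (a date range and the job title it belongs to) inside at least one chunk; for prompt engineering, constraining the model to the retrieved excerpts reduces invented candidate details. Both snippets below are illustrative sketches, with parameter values chosen for demonstration.

```python
def chunk_with_overlap(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    # Each chunk repeats the last `overlap` characters of the previous
    # one, so a fact cut at a chunk boundary still appears whole somewhere.
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

# A hypothetical HR-specific template: scope the model to the retrieved
# excerpts and give it an explicit "don't know" path.
HR_PROMPT = (
    "You are an HR assistant. Answer strictly from the CV excerpts.\n"
    "If the CV does not contain the answer, say so.\n\n"
    "CV excerpts:\n{context}\n\nQuestion: {question}"
)
```

Tuning `size` and `overlap` against real CVs was part of the experimentation; there is no universal best value.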
- Local vs Cloud: Trade-offs between privacy and performance
- Model size: Balance between capability and resource usage
- Vector database: Choosing the right storage solution