
# 📄 CV RAG System - HR Assistant

A RAG (Retrieval-Augmented Generation) system designed to help HR professionals analyze CVs and answer questions about candidates efficiently.

## 🎯 Project Overview

This project was created to dive deeper into RAG (Retrieval-Augmented Generation) architecture and its practical applications in document analysis.

### Initial Idea

The original concept was to build an HR assistant using:

  • Streamlit for the user interface
  • LangChain for orchestrating the RAG pipeline
  • ChromaDB as the vector database
  • Ollama for local LLM deployment
  • PDF processing to extract and analyze CV content

The system would allow HR professionals to upload CV PDFs and ask natural language questions about candidates.

## 🛠️ Technical Journey

### First Approach (Local Setup)

**Streamlit + LangChain + ChromaDB + Ollama (Gemma 2B)**

Challenges encountered:

  • Hardware limitations with local LLM deployment
  • Accuracy issues with Gemma 2B model
  • Inconsistent responses to HR questions
  • Performance bottlenecks in document processing

### Alternative Solution (Cloud-based)

**Streamlit + PyPDF2 + Groq + Qwen**

Improvements achieved:

  • Higher answer accuracy with the Qwen model served via Groq
  • Faster response times on Groq's inference infrastructure
  • More reliable question answering over CV content
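
The cloud pipeline's core step is assembling extracted CV text and the HR question into a single prompt before calling the Groq-hosted model. A minimal sketch of that step (the function name and prompt wording here are illustrative, not from the repository; the actual API call is omitted):

```python
# Hypothetical prompt-assembly helper: combines CV excerpts extracted
# with PyPDF2 and the HR question into one prompt string that would be
# sent to a Groq-hosted chat model. Names and wording are illustrative.

def build_hr_prompt(question: str, cv_chunks: list[str]) -> str:
    """Combine retrieved CV excerpts with the HR question."""
    context = "\n\n---\n\n".join(cv_chunks)
    return (
        "You are an HR assistant. Answer using only the CV excerpts below.\n\n"
        f"CV excerpts:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_hr_prompt(
    "Which candidate has Python experience?",
    ["Jane Doe - 5 years of Python and Django.",
     "John Roe - Java backend developer."],
)
print(prompt.splitlines()[0])
```

Keeping prompt assembly separate from the API call makes it easy to swap Groq for another provider or to unit-test the prompt without network access.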

### Alternatives and Implementation

Evaluation of further alternatives and implementation details is ongoing.

### RAG Architecture Deep Dive

Through this project, I gained hands-on understanding of the RAG stack layers:

**Level 0 - Deployment**

  • Groq: High-performance LLM inference
  • AWS/Google Cloud: Scalable infrastructure

**Level 1 - Evaluation**

  • LangSmith: Monitoring and debugging

**Level 2 - LLMs**

  • Qwen: Advanced reasoning capabilities
  • Gemma 2B: Lightweight model for local deployment

**Level 3 - Framework**

  • LangChain: RAG orchestration
  • LlamaIndex: Document indexing

**Level 4 - Vector Database**

  • Chroma: Open-source vector DB

**Level 5 - Embedding**

  • Ollama: Local embedding models
  • OpenAI: High-quality embeddings

## 📚 Key Learnings

### RAG Pipeline Understanding

  1. **Document Ingestion**: PDF processing and chunking strategies
  2. **Embedding Generation**: Converting text to vector representations
  3. **Vector Storage**: Efficient similarity search and retrieval
  4. **Query Processing**: Natural language question understanding
  5. **Context Retrieval**: Finding relevant document sections
  6. **Answer Generation**: LLM-based response synthesis
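
The six stages above can be sketched end to end in a few lines. This toy version uses a bag-of-words "embedding" and cosine similarity in place of learned embeddings, and stops before the LLM call; everything here is illustrative, not code from the repository:

```python
# Minimal, self-contained walk through the six RAG stages with a toy
# bag-of-words embedding. A real system would use learned embeddings
# (e.g. via Ollama or OpenAI) and an LLM for the final step.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Stage 2, Embedding Generation (toy): bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Stage 1, Document Ingestion: pretend these chunks came from parsed CV PDFs
chunks = [
    "Jane Doe: 5 years of Python, built ML pipelines",
    "John Roe: Java backend, Spring, microservices",
]

# Stage 3, Vector Storage: a simple in-memory index of (chunk, vector) pairs
index = [(c, embed(c)) for c in chunks]

# Stage 4, Query Processing: embed the natural-language question
query = "Who has Python experience?"
q_vec = embed(query)

# Stage 5, Context Retrieval: rank chunks by similarity to the query
best = max(index, key=lambda item: cosine(q_vec, item[1]))[0]

# Stage 6, Answer Generation: a real system would now pass `best`
# to an LLM as context; here we just show the retrieved chunk
print(best)
```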

### Performance Optimization

  • Hardware requirements for local vs cloud LLM deployment
  • Model selection impact on accuracy and speed
  • Chunking strategies for optimal retrieval
  • Prompt engineering for HR-specific queries
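
As an example of a chunking strategy, a fixed-size window with overlap keeps sentences that straddle a boundary intact in at least one chunk. The sizes below are arbitrary examples, not tuned values from this project:

```python
# Illustrative fixed-size chunker with overlap. Chunk size and overlap
# are example values; real values should be tuned for the embedding
# model and the typical CV layout.
def chunk_text(text: str, size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping windows so content near a chunk
    boundary still appears whole in at least one chunk."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("x" * 1200, size=500, overlap=100)
print(len(chunks), [len(c) for c in chunks])
```

Larger overlap improves retrieval recall at the cost of more vectors to store and search.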

### Infrastructure Decisions

  • Local vs Cloud: Trade-offs between privacy and performance
  • Model size: Balance between capability and resource usage
  • Vector database: Choosing the right storage solution
