mwheeler235/pdf-llm-rag-chat

1. Resume RAG Evaluation System - Streamlit App

Do you want to evaluate a resume against a job description and a company description from, say, LinkedIn? This app lets you assume the role of a recruiter at that company and returns a detailed explanation of candidate fit based on the uploaded resume.

This is a Streamlit web application for evaluating resumes using RAG (Retrieval-Augmented Generation) and RAGAS evaluation metrics.

Try the Resume RAG Evaluation System – Access the live Streamlit app for resume evaluation. You will need your own OpenAI API Key.

Features

📄 Document Processing

  • Supports PDF and DOCX resume files
  • Vector database creation using FAISS and OpenAIEmbeddings
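Before a resume can be embedded and indexed, its text is usually split into overlapping chunks. This README does not show that step, so the following is only a minimal standard-library sketch of the idea. The function name `split_into_chunks` and the default chunk size and overlap are assumptions for illustration, not the app's actual settings. In the app itself, embedding and indexing are handled by OpenAIEmbeddings and FAISS.

```python
from __future__ import annotations


def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows of `chunk_size` that overlap by `overlap`.

    Hypothetical helper: the sizes are illustrative defaults, not the app's values.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.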

👤 Candidate Evaluation

  • Input company descriptions and job requirements
  • Generate structured candidate evaluation reports
  • Uses faithful RAG prompting, which restricts answers to the retrieved context, to reduce hallucination
  • Multi-query retrieval for broader coverage of the resume

📊 RAGAS Evaluation

  • Custom question and ground truth input
  • Comprehensive evaluation using 4 key metrics:
    • Faithfulness: Answer grounding in context
    • Answer Relevancy: Relevance to the question
    • Answer Correctness: Accuracy compared to ground truth
    • Semantic Similarity: Semantic similarity to expected answer
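Of the four metrics, Semantic Similarity is the simplest to illustrate: the generated answer and the ground truth are each embedded, and the two vectors are compared, typically with cosine similarity. The sketch below shows only that comparison on hand-made toy vectors. In the app, the embeddings would come from an embedding model and RAGAS would do the scoring; the function name here is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)


# Toy "embeddings" (assumed values, not real model output).
answer_vec = [3.0, 4.0]
ground_truth_vec = [3.0, 4.0]
unrelated_vec = [-4.0, 3.0]
```

With these toy vectors, the identical pair scores 1.0 and the perpendicular pair scores 0.0.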

LangChain Architecture

Two RAG Chains are created for the two separate Streamlit components.

  • Evaluation Chain to evaluate the input resume
  • RAGAS Chain to answer custom user questions with stricter, more faithful answer generation

Evaluation LangChain Flow:

  1. Input: Question about candidate evaluation
  2. Retrieval: Multi-query retriever gets relevant resume chunks
  3. Context Injection: Retrieved content becomes {context}, original question becomes {question}
  4. Prompt Template: Fills the evaluation template with context and question
  5. LLM Processing: GPT-3.5-turbo generates structured evaluation
  6. Output Parsing: Converts response to clean string
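The six steps above can be sketched in plain Python to make the data flow concrete. This is not the app's LangChain code: the template wording, the function names, and the `retrieve` and `llm` callables are stand-ins assumed for illustration. In a real multi-query retriever an LLM usually generates the query variants; here they are simply passed in.

```python
# Hypothetical template; the app's actual evaluation prompt is not shown in this README.
EVAL_TEMPLATE = (
    "Use only the resume excerpts below to evaluate the candidate.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
)


def multi_query_retrieve(queries, retrieve):
    """Run every query variant and return the unique union of chunks, in first-seen order."""
    seen = set()
    merged = []
    for query in queries:
        for chunk in retrieve(query):
            if chunk not in seen:
                seen.add(chunk)
                merged.append(chunk)
    return merged


def run_evaluation_chain(question, query_variants, retrieve, llm):
    """Retrieval -> context injection -> prompt template -> LLM -> string output."""
    chunks = multi_query_retrieve([question, *query_variants], retrieve)
    prompt = EVAL_TEMPLATE.format(context="\n\n".join(chunks), question=question)
    return llm(prompt).strip()  # output parsing: return a clean string
```

Because `retrieve` and `llm` are ordinary callables, the flow can be exercised with stubs before wiring in a vector store or a model.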

RAGAS LangChain Flow:

  1. Input: Specific question for RAGAS evaluation
  2. Retrieval: Gets top 12 similar resume chunks
  3. Context Injection: Retrieved content + question into template
  4. Strict Prompting: Forces LLM to only use provided context
  5. LLM Processing: Generates faithful, grounded answer
  6. Output: Clean string for RAGAS metric evaluation
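Steps 2 through 4 of this flow can be illustrated with a small standard-library sketch: keep the 12 highest-scoring chunks, then place them in a prompt that tells the model to answer only from that context. The prompt wording, the function names, and the `(score, text)` input shape are assumptions for illustration; the app's real retriever and prompt are not shown in this README.

```python
import heapq

# Hypothetical strict prompt; the wording is an assumption, not the app's template.
STRICT_TEMPLATE = (
    "Answer the question using ONLY the context below. "
    "If the context does not contain the answer, say you do not know.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def top_k_chunks(scored_chunks, k=12):
    """Return the texts of the k highest-scoring (score, text) pairs, best first."""
    best = heapq.nlargest(k, scored_chunks, key=lambda pair: pair[0])
    return [text for _, text in best]


def build_strict_prompt(question, scored_chunks, k=12):
    """Inject the top-k chunks and the question into the strict template."""
    context = "\n\n".join(top_k_chunks(scored_chunks, k))
    return STRICT_TEMPLATE.format(context=context, question=question)
```

The resulting prompt string would then be sent to the LLM, and the answer, together with the retrieved chunks, passed to RAGAS for scoring.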

2. PDF Analysis Introduction

Let's build a chat solution for interfacing with PDFs.

What's in the "Big Beautiful Bill Act"?

This Act is easy to find with a quick Google search (https://www.congress.gov/bill/119th-congress/house-bill/1/text). Given that the document is quite long and that I'd rather wear my data science hat than my legal expert hat, let's leverage some applications and libraries to have a conversation about this Act.

Tech

  1. LangChain: A framework for building applications on top of large language models, with components for prompts, retrievers, and chains.

  2. Ollama: A platform for running large language models locally on our machine, so that we avoid paying for cloud-based services to access the LLMs and keep our data local.

  3. OllamaEmbeddings (nomic-embed-text): LangChain's embedding wrapper for Ollama, used here with Nomic's nomic-embed-text model to generate embeddings locally. Embeddings are dense vector representations of text that capture semantic meaning, enabling tasks like clustering, similarity search, and visualization.

  4. ChromaDB: An open-source vector database designed for storing and querying embeddings.

  5. Llama 3.1 LLM: A large language model created by Meta.

  6. Ragas: A library of tools for evaluating Large Language Model (LLM) applications.

Check out my Medium Blog!

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Mateo Wheeler

About

Streamlit, LangChain, OpenAI, FAISS, Ollama, ChromaDB, Llama 3.1 for PDF RAG Chat Interaction
