Upload any PDF and have a real conversation with it. Ask questions, get instant answers, and follow up naturally, all powered by a RAG architecture and Groq's free Llama 3.3-70B model.
- PDF Upload - Upload any textbook, notes, resume, or research paper
- Conversational Memory - Ask follow-up questions naturally, just like ChatGPT
- RAG Architecture - Only relevant chunks sent to the AI, saving API calls and cost
- Beautiful Dark UI - WhatsApp-style chat bubbles with a modern dark theme
- Lightning Fast - Powered by Groq's ultra-fast inference engine
- 100% Free - HuggingFace embeddings run locally, Groq free tier for the LLM
- Secure - API keys never exposed, stored locally in `.env`
- Smart Chunking - PDF split into 500-char chunks for precise retrieval
User: What are the skills mentioned in this document?
Buddy: The document mentions Python, Machine Learning, NLP, and LangChain...
User: Tell me more about the first one
Buddy: Regarding Python, the document highlights... (follow-up works!)
User: What projects are listed?
Buddy: The document mentions the following projects...
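Conversational memory of this kind is typically implemented by resending the full chat history with every request, so a follow-up like "the first one" can be resolved. The sketch below is a hypothetical, dependency-free illustration of that idea (the real app wires this through LangChain and Groq); `ask` and the stub `echo` model are illustrative names, not the actual codebase.

```python
# Every turn is appended to a shared history list, and the whole history
# is passed to the model on each call, so earlier turns stay in context.
history = []

def ask(question, llm):
    history.append({"role": "user", "content": question})
    answer = llm(history)  # the real app sends this to Groq via LangChain
    history.append({"role": "assistant", "content": answer})
    return answer

# Stub "LLM" that just reports how many messages it received.
echo = lambda msgs: f"seen {len(msgs)} messages"

ask("What skills are listed?", echo)
ask("Tell me more about the first one", echo)  # sees both earlier turns
```

Because the model receives the accumulated list rather than a single question, follow-ups work without any special handling.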
View Live Demo
| Technology | Purpose |
|---|---|
| Python 3.8+ | Core programming language |
| Streamlit | Web framework and UI |
| Groq API | AI model inference (Llama 3.3-70B) |
| LangChain | RAG pipeline orchestration |
| HuggingFace Embeddings | Local text embeddings (free, no API needed) |
| ChromaDB | Local vector database for chunk storage |
| python-dotenv | Environment variable management |
PDF Uploaded
  ↓
Split into 500-char chunks
  ↓
Each chunk converted to vector embeddings (HuggingFace, runs locally)
  ↓
Stored in ChromaDB (local vector database)
  ↓
User asks a question
  ↓
Top 3 most relevant chunks retrieved
  ↓
Chunks + conversation history sent to Groq (Llama 3.3-70B)
  ↓
Answer returned with full conversational memory
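The retrieval steps above can be sketched in plain Python. This toy version substitutes a bag-of-words count vector and cosine similarity for the real HuggingFace embeddings and ChromaDB, but the shape of the pipeline (chunk the document, embed each chunk, embed the question, take the top-k matches) is the same; all function names here are illustrative, not the app's actual code.

```python
import math
from collections import Counter

def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping character chunks (mirrors the app's settings)."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, chunks, k=3):
    """Return the k chunks most similar to the question (the 'Top 3' step)."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

doc = "Python powers the backend. " * 40 + "LangChain orchestrates retrieval. " * 40
chunks = chunk_text(doc)
top = retrieve("What orchestrates retrieval?", chunks, k=3)
```

Only `top` (plus the chat history) would then be sent to the LLM, which is why RAG saves tokens compared with sending the whole PDF.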
- Python 3.8 or higher
- pip (Python package manager)
- A free Groq API key
git clone https://github.com/smebad/Smart-Study-Buddy.git
cd smart-study-buddy

Windows (Git Bash):

python -m venv venv
source venv/Scripts/activate

Mac/Linux:

python3 -m venv venv
source venv/bin/activate

Install the dependencies:

pip install streamlit langchain langchain-community langchain-groq chromadb sentence-transformers pypdf python-dotenv langchain-text-splitters langchain-core --timeout 300

Create a .env file in the root folder:

GROQ_API_KEY=your_actual_groq_api_key_here

How to get your free Groq API key:
- Visit console.groq.com
- Sign up (no credit card required)
- Navigate to "API Keys"
- Click "Create API Key"
- Copy and paste into your `.env` file
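For reference, `load_dotenv()` from python-dotenv essentially reads `KEY=VALUE` lines from `.env` into the process environment. A minimal stdlib-only stand-in (illustrative only, not the app's actual code) looks like this:

```python
import os

def load_env(path=".env"):
    """Minimal stand-in for python-dotenv's load_dotenv(): read KEY=VALUE
    lines from a file and export them into os.environ."""
    try:
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, _, value = line.partition("=")
                    # setdefault: a variable already set in the shell wins
                    os.environ.setdefault(key.strip(), value.strip())
    except FileNotFoundError:
        pass  # no .env file: rely on variables already in the environment

load_env()
api_key = os.environ.get("GROQ_API_KEY")
```

This is why the key never appears in the source code or the browser: it lives only in the local `.env` file and the process environment.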
streamlit run app.py

Drag and drop any PDF into the sidebar uploader: textbooks, notes, research papers, resumes, anything!
The app splits your PDF into chunks and indexes them locally. The first run downloads the embedding model (~80MB, one time only). After that it's instant.
Type your question in the chat box and get instant answers from your document. Ask follow-up questions naturally; the app remembers the full conversation context!
smart-study-buddy/
│
├── app.py              # Main Streamlit application
├── .env                # API key
├── .gitignore          # Git ignore rules
├── README.md           # Project documentation
└── assets/
    ├── screenshot1.png # App UI screenshot
    └── screenshot2.png # Chat in action screenshot
Edit app.py to switch to a different Groq model:
llm = ChatGroq(
model_name="llama-3.3-70b-versatile", # Default (best quality)
# Alternatives:
# model_name="llama-3.1-8b-instant" # Faster, lighter
# model_name="mixtral-8x7b-32768" # Good alternative
)

Edit the splitter settings in app.py to control how the PDF is split:
splitter = RecursiveCharacterTextSplitter(
chunk_size=500, # Increase for more context per chunk
chunk_overlap=50 # Increase for better continuity between chunks
)

Contributions are welcome! Here's how you can help:
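To build intuition for these two knobs, here is a rough character-level approximation of the splitter (the real `RecursiveCharacterTextSplitter` also prefers natural separators like paragraphs and sentences before falling back to raw characters; `split_chars` is an illustrative name, not the library's API):

```python
def split_chars(text, chunk_size=500, chunk_overlap=50):
    """Rough approximation: fixed-size character windows that each
    repeat the last chunk_overlap characters of the previous window."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "Sentence one. Sentence two. " * 100  # 2800 characters

small = split_chars(doc, chunk_size=500, chunk_overlap=50)    # more, finer chunks
large = split_chars(doc, chunk_size=1000, chunk_overlap=100)  # fewer, broader chunks

# Larger chunks -> fewer pieces, more context per retrieved chunk;
# overlap means a sentence cut at a boundary still appears whole in
# the neighbouring chunk.
```

So increasing `chunk_size` trades retrieval precision for context, and increasing `chunk_overlap` trades a little index size for continuity across chunk boundaries.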
- Fork the repository
- Create a feature branch
git checkout -b feature/AmazingFeature
- Commit your changes
git commit -m 'Add some AmazingFeature'
- Push to the branch
git push origin feature/AmazingFeature
- Open a Pull Request
- Rate Limits: Groq free tier has daily request limits
- PDF Size: Very large PDFs (100+ pages) may take longer to index
- Scanned PDFs: Image-based/scanned PDFs are not supported; text-based PDFs only
- Session Reset: Refreshing the browser clears the chat history and requires re-uploading the PDF
Future enhancements planned:
- Support multiple PDFs at once
- Persistent chat history across sessions
- Show which page/chunk the answer came from
- Highlight relevant text in the PDF viewer
- Multi-language support
- Mobile-optimized layout
- User authentication for saved sessions
- Export full chat history as PDF
- Light/dark mode toggle
- Support for DOCX and TXT files
This project is licensed under the MIT License.
MIT License
Copyright (c) 2026 Syed Muhammad Ebad
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Syed Muhammad Ebad
- LinkedIn: linkedin.com/in/syed-ebad-ml
- GitHub: @smebad
- Email: mohammdedbad1@hotmail.com
This project was built to solve a real problem: reading through long documents to find specific information is slow and frustrating. With RAG and AI, anyone can upload a document and instantly have a conversation with it, making studying, research, and document review dramatically faster.
If you found this project helpful:
- Star this repository
- Report bugs via Issues
- Suggest features via Issues
- Share with others who might find it useful
If you encounter any issues or have questions:
- Open an issue: GitHub Issues
- Email me: mohammdedbad1@hotmail.com
- Connect on LinkedIn: Syed Ebad
pdf-qa rag langchain streamlit groq llama python chromadb huggingface generative-ai study-tool document-ai chatbot conversational-ai nlp
Made with ❤️ by Syed Ebad

