📚 StudyBuddy RAG Assistant

An AI-powered study companion that uses Retrieval-Augmented Generation (RAG) to help students learn from their documents. Upload your study materials and get intelligent answers with source citations and personalized study tips.

Features

Document Processing: Upload and process PDF, TXT, and Markdown files
AI-Powered Q&A: Ask questions about your study materials and get contextual answers
Source Citations: Get references to specific documents and content snippets
💡 Study Tips: Receive study recommendations based on your questions
FastAPI Backend: RESTful API built with FastAPI
Web Frontend: Simple HTML frontend
Vector Search: Document retrieval using ChromaDB

Architecture

studybuddy-rag-assistant/
├── src/studybuddy/           # Main package
│   ├── main.py              # FastAPI application
│   ├── config.py            # Configuration settings
│   ├── models/              # Pydantic models
│   ├── core/                # RAG engine logic
│   ├── api/                 # API routes and dependencies
│   └── utils/               # Utility functions
├── documents/               # Upload your study materials here
├── vector_db/              # ChromaDB vector storage
├── frontend.html           # Web interface
└── pyproject.toml          # Python package configuration

Quick Start

Prerequisites

Python 3.9+
Poetry (for dependency management)
OpenAI API key

Installation

Clone the repository

git clone <your-repo-url>
cd studybuddy-rag-assistant

Install dependencies
```
poetry install
```

Set up environment variables

# Create .env file
echo "OPENAI_API_KEY=your_openai_api_key_here" > .env

Add your study materials

# Place your PDF, TXT, or MD files in the documents/ folder
cp your_study_materials.pdf documents/

Run the application

# Start the development server
poetry run dev

# Or run directly
poetry run uvicorn studybuddy.main:app --reload --host 0.0.0.0 --port 8000

Access the application
- API Documentation: http://localhost:8000/docs
- Web Interface: Open frontend.html in your browser
- Health Check: http://localhost:8000/api/v1/health

Usage

Web Interface

Open frontend.html in your web browser
Type your question in the chat interface
Get AI-powered answers with source citations and study tips

API Endpoints

Chat with StudyBuddy

curl -X POST "http://localhost:8000/api/v1/chat" \
     -H "Content-Type: application/json" \
     -d '{
       "question": "What are the main concepts in machine learning?",
       "include_sources": true,
       "max_sources": 3
     }'
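The same chat call can be made from Python. The following is a minimal sketch using only the standard library; it assumes the server is running on the default host and port shown above, and `build_chat_request` and `ask` are hypothetical helper names, not part of the StudyBuddy package.

```python
import json
import urllib.request

# Assumption: the development server is listening on localhost:8000.
BASE_URL = "http://localhost:8000/api/v1"


def build_chat_request(question, include_sources=True, max_sources=3, base_url=BASE_URL):
    """Build the POST request for the chat endpoint (hypothetical helper)."""
    payload = {
        "question": question,
        "include_sources": include_sources,
        "max_sources": max_sources,
    }
    return urllib.request.Request(
        f"{base_url}/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, **kwargs):
    """Send the question and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(build_chat_request(question, **kwargs)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `ask("What are the main concepts in machine learning?")` returns the parsed JSON body; the exact response fields are defined by the models in `models/responses.py`.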

Upload Documents

curl -X POST "http://localhost:8000/api/v1/upload" \
     -F "file=@your_document.pdf"
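For scripted uploads, the `-F "file=@..."` form can be reproduced in Python. This is a rough sketch with the standard library only: it assembles a multipart/form-data body by hand with a single field named `file`, matching the curl call. `build_upload_request` is a hypothetical helper, and the URL assumes the default host and port.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

# Assumption: the development server is listening on localhost:8000.
UPLOAD_URL = "http://localhost:8000/api/v1/upload"


def build_upload_request(path, url=UPLOAD_URL):
    """Build a multipart/form-data POST with one form field named "file"."""
    path = Path(path)
    boundary = uuid.uuid4().hex  # random separator between form parts
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url,
        data=head + path.read_bytes() + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends the file. If the `requests` library is available, `requests.post(url, files={"file": open(path, "rb")})` does the same with less code.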

Health Check

curl "http://localhost:8000/api/v1/health"

Configuration

Configure the application by setting environment variables or modifying src/studybuddy/config.py:

| Variable | Default | Description |
|---|---|---|
| `STUDYBUDDY_OPENAI_API_KEY` | - | Your OpenAI API key (required) |
| `STUDYBUDDY_OPENAI_MODEL` | `gpt-4o-mini` | OpenAI model to use |
| `STUDYBUDDY_CHUNK_SIZE` | `1000` | Document chunk size for processing |
| `STUDYBUDDY_MAX_SOURCES` | `3` | Maximum source documents to return |
| `STUDYBUDDY_DEBUG` | `false` | Enable debug mode |
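To illustrate how these variables map to settings, here is a standard-library sketch that reads them with the `STUDYBUDDY_` prefix and the defaults from the table. The project's actual `config.py` uses Pydantic settings; the `Settings` dataclass and `load_settings` function below are illustrative stand-ins, and the accepted spellings for the debug flag are an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    openai_model: str = "gpt-4o-mini"
    chunk_size: int = 1000
    max_sources: int = 3
    debug: bool = False


def load_settings(env=None, prefix="STUDYBUDDY_"):
    """Read the documented variables, falling back to their defaults."""
    env = os.environ if env is None else env
    key = env.get(f"{prefix}OPENAI_API_KEY")
    if not key:
        raise ValueError(f"{prefix}OPENAI_API_KEY is required")
    return Settings(
        openai_api_key=key,
        openai_model=env.get(f"{prefix}OPENAI_MODEL", "gpt-4o-mini"),
        chunk_size=int(env.get(f"{prefix}CHUNK_SIZE", "1000")),
        max_sources=int(env.get(f"{prefix}MAX_SOURCES", "3")),
        # Assumption: "1", "true", and "yes" (any case) all enable debug mode.
        debug=env.get(f"{prefix}DEBUG", "false").strip().lower() in {"1", "true", "yes"},
    )
```

Accepting an explicit `env` mapping keeps the function easy to exercise without touching the real process environment.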

📁 Supported File Types

PDF (.pdf) - Research papers, textbooks, lecture notes
Text (.txt) - Plain text documents
Markdown (.md) - Formatted notes and documentation

How It Works

Document Processing: Your documents are split into chunks and converted into vector embeddings
Vector Storage: Embeddings are stored in ChromaDB for efficient similarity search
Question Processing: When you ask a question, the system finds the most relevant document chunks
Answer Generation: OpenAI's GPT model generates contextual answers based on retrieved content
Study Tips: Additional AI-generated study recommendations are provided
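The retrieval half of this pipeline can be sketched in a few lines of plain Python. This is a toy illustration, not the project's implementation: word counts stand in for the learned embeddings, a sorted list stands in for ChromaDB, and all function names are hypothetical. The chunking, similarity ranking, and prompt assembly steps follow the same order as the list above.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=1000):
    """Split text into consecutive chunks of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def embed(text):
    """Toy embedding: a word-count vector (the real app uses a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, max_sources=3):
    """Return the max_sources chunks most similar to the question, best first."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda chunk: cosine_similarity(query, embed(chunk)), reverse=True)
    return ranked[:max_sources]


def build_prompt(question, sources):
    """Assemble the numbered context that would be sent to the language model."""
    context = "\n\n".join(f"[{i}] {source}" for i, source in enumerate(sources, start=1))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {question}"
```

Numbering the sources in the prompt is one simple way to let the model's answer point back to specific chunks, which is the basis for source citations.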

🔧 Development

Project Structure

src/studybuddy/
├── __init__.py
├── main.py                 # FastAPI app with lifespan management
├── config.py              # Pydantic settings with environment variables
├── models/
│   ├── requests.py        # ChatRequest, DocumentUploadRequest
│   └── responses.py       # ChatResponse, SourceDocument, etc.
├── core/
│   └── rag_engine.py      # StudyBuddyRAG class with core logic
├── api/
│   ├── dependencies.py    # FastAPI dependency injection
│   └── routes/
│       ├── health.py      # Health check endpoints
│       ├── chat.py        # Chat endpoints
│       └── documents.py   # Document upload endpoints
└── utils/
    └── document_processor.py

Key Components

StudyBuddyRAG: Core RAG engine handling document processing and question answering
FastAPI App: REST API with automatic OpenAPI documentation
Pydantic Models: Type-safe request/response models
ChromaDB: Vector database for document embeddings
LangChain: Framework for building the RAG pipeline

Running Tests

# Run tests (when implemented)
poetry run pytest

# Code formatting
poetry run black src/
poetry run isort src/

# Linting
poetry run flake8 src/

🛠️ Troubleshooting

Common Issues

OpenAI API Key Error
```
ValueError: OPENAI_API_KEY environment variable is required
```
Solution: Set your OpenAI API key in the .env file or as an environment variable. (The author of this repo sets it as a global environment variable, so no .env file is needed.)

Document Processing Fails
```
Error processing document.pdf: [Errno 2] No such file or directory
```
Solution: Ensure the document is in the documents/ directory and has a supported file extension.

Vector Database Issues
```
ChromaDB connection error
```
Solution: Clear the vector_db/ directory and restart the application.
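When a document fails to process, a quick pre-flight check can narrow down the cause. The sketch below is a hypothetical helper, not part of the StudyBuddy package: it verifies that a file exists, has one of the supported extensions, and sits inside the documents/ directory. Treating extensions case-insensitively is an assumption.

```python
from pathlib import Path

# Extensions listed under "Supported File Types".
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".md"}


def check_document(path, documents_dir="documents"):
    """Return a list of problems that would stop a file from being processed."""
    path = Path(path)
    problems = []
    if not path.is_file():
        problems.append(f"{path} does not exist or is not a file")
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension {path.suffix!r}")
    # Compare resolved paths so relative paths and symlinks are handled.
    if Path(documents_dir).resolve() not in path.resolve().parents:
        problems.append(f"{path} is not inside {documents_dir}/")
    return problems
```

An empty list means the file passes all three checks; otherwise each entry names one thing to fix.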

Performance Tips

Chunk Size: Adjust chunk_size in config for your document types (larger for academic papers, smaller for notes)
Model Selection: Use gpt-4o-mini for cost efficiency or gpt-4 for better quality
Document Organization: Group related documents by subject for better retrieval

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Built with the following resources

FastAPI for the web framework
LangChain for RAG implementation
ChromaDB for vector storage
OpenAI for language model capabilities

Happy Studying! 📚✨
