A modern, AI study assistant that runs completely offline
Smart conversation management • Context-aware responses • Minimalist web UI
EDITH is your personal AI study assistant that helps you make sense of your notes using local LLMs. She features a modern interface with conversation management, intelligent query classification, and context-aware responses. Best of all? She runs completely offline using LLaMA 3.1.
✨ Modern Web Interface
- Modern landing page with welcoming design
- Conversation management (create, save, switch, delete)
- Clean, animated UI with expandable sidebar
- Real-time typing indicators and status updates
🧠 Intelligent AI Assistant
- Context-aware responses that reference previous messages
- Automatic classification between knowledge queries and casual chat
- RAG (Retrieval-Augmented Generation) for note-based answers
- Conversational mode for general questions
📚 Powerful Note Processing
- Multi-format support (PDF, DOCX, images with OCR, text files)
- Drag-and-drop or multi-file upload
- Automatic text chunking and embedding generation
- Vector database storage with Pinecone for fast retrieval
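The chunking step above can be sketched as a simple sliding window with overlap, so that a sentence split at a chunk boundary still appears whole in the neighboring chunk. This is a minimal illustration, not EDITH's actual `text_chunker.py`; the chunk size and overlap values are assumptions.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping chunks so context isn't lost at boundaries."""
    chunks = []
    start = 0
    while start < len(text):
        end = start + chunk_size
        chunks.append(text[start:end])
        if end >= len(text):
            break
        start = end - overlap  # back up so adjacent chunks share some text
    return chunks

# A 1200-character document yields three chunks of 500, 500, and 300 chars
chunks = chunk_text("x" * 1200, chunk_size=500, overlap=50)
```

Each chunk would then be embedded and stored in the vector database for retrieval.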
🔒 Privacy First
- 100% local LLM execution via Ollama
- No data sent to external servers
- Your notes stay on your machine
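"100% local" works because Ollama serves the model over a plain HTTP API on your own machine (localhost:11434 by default), so every request stays local. A minimal sketch of calling it directly, assuming a standard Ollama install (the function name `ask_local_llm` is just illustrative):

```python
import json
import urllib.request

def ask_local_llm(prompt, model="llama3.1:8b-instruct-q4_K_M"):
    """Send a prompt to the local Ollama server; nothing leaves the machine."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",  # Ollama's default local endpoint
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Running this requires Ollama to be up with the model pulled (see the setup steps below).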
```bash
# Download Ollama from: https://ollama.ai/download
# Then pull LLaMA 3.1:
ollama pull llama3.1:8b-instruct-q4_K_M
```

```bash
# Clone the repository
git clone https://github.com/ChrisDanielW/EDITH.git
cd EDITH

# Install Python dependencies
pip install -r requirements.txt
```

```bash
# Create a .env and copy the contents of "env-example.txt" (provided in the root directory) into it
# Then edit the .env and add your Pinecone API key
notepad .env
```

Get a free Pinecone API key at pinecone.io

```bash
# Start the web UI and API server
python start_ui.py
```

Open your browser to http://localhost:5000 and start chatting!
1. **Upload Your Notes**
   - Click the 📎 Upload button in the sidebar
   - Select or drag-and-drop your documents (PDF, DOCX, TXT, images)
   - EDITH will process and index them automatically
2. **Start a Conversation**
   - Type your first message on the landing page
   - A new numbered conversation will be created automatically
   - Ask questions about your notes or just chat casually
Asking About Notes:
You: What is polymorphism in OOP?
EDITH: [Searches your notes and provides detailed explanation]
Casual Conversation:
You: Hey, how's it going?
EDITH: [Responds naturally without searching notes]
Follow-up Questions:
You: Can you explain that in more detail?
EDITH: [References previous conversation context]
- New Conversation: Click the ➕ button (appears when in a conversation)
- Switch Conversations: Click any conversation in the left sidebar
- Delete Conversation: Click the × button on any conversation
- Return to Landing: Click the hamburger menu (☰) to collapse the sidebar
Frontend:
- Vanilla HTML, CSS, JavaScript
- LocalStorage for conversation persistence
- Modern animated UI with responsive design
Backend:
- Flask REST API
- Python 3.8+
- Ollama for LLM execution
AI/ML:
- LLaMA 3.1 (8B Instruct, 4-bit quantized)
- Sentence Transformers for embeddings
- Pinecone vector database
- RAG architecture for context retrieval
- Document Upload → Text extraction & chunking → Embedding generation → Store in Pinecone
- User Query → Classify (knowledge vs. casual) → Retrieve relevant chunks (if knowledge) → Generate answer with conversation context
- Conversation History → Last 3 exchanges sent with each query → Context-aware responses
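The query flow above can be sketched end to end. This is a simplified outline, not the code in `rag_service.py`; the `classify`, `retrieve`, and `generate` callables are hypothetical stand-ins for the real components:

```python
def answer(query, history, classify, retrieve, generate):
    """Sketch of the query flow: classify, optionally retrieve note chunks,
    then generate with recent conversation context."""
    mode = classify(query)                        # "knowledge" or "casual"
    chunks = retrieve(query) if mode == "knowledge" else []
    recent = history[-6:]                         # last 3 exchanges (6 messages)
    prompt = "\n".join(recent + chunks + ["User: " + query])
    return generate(prompt, mode)

# Wire it up with trivial stand-ins to see the flow:
reply = answer(
    "What is polymorphism?",
    history=["User: hi", "EDITH: hello!"],
    classify=lambda q: "knowledge" if q.endswith("?") else "casual",
    retrieve=lambda q: ["[note chunk about polymorphism]"],
    generate=lambda prompt, mode: f"({mode}) answer based on:\n{prompt}",
)
```

The real pipeline swaps in the LLM-backed classifier, the Pinecone retriever, and the Ollama-hosted model for those lambdas.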
```
EDITH/
├── ui/                          # Web interface
│   ├── index.html               # Main HTML
│   ├── styles.css               # Styling
│   └── app.js                   # Frontend logic
├── src/
│   ├── main.py                  # Core EDITH class
│   ├── api/
│   │   └── app.py               # Flask API server
│   ├── models/
│   │   └── llama_client.py      # LLM interface
│   ├── services/
│   │   ├── rag_service.py       # RAG pipeline
│   │   ├── vector_store.py      # Pinecone integration
│   │   ├── note_analyzer.py     # Document analysis
│   │   └── summarizer.py        # Summarization
│   ├── utils/
│   │   ├── document_loader.py   # File loading
│   │   ├── text_chunker.py      # Smart chunking
│   │   ├── embeddings.py        # Embedding generation
│   │   └── query_classifier.py  # Query classification
│   └── config/
│       └── settings.py          # Configuration
├── start_ui.py                  # Launch script
├── requirements.txt             # Python dependencies
└── README.md                    # This file
```
- RAG Mode: 500 tokens (detailed educational responses)
- Conversational Mode: 350 tokens (natural chat)
- Fallback Mode: 400 tokens (general knowledge)
Edit `src/config/settings.py` to change models:

```python
# Current default
MODEL_NAME = "llama3.1:8b-instruct-q4_K_M"

# For more powerful responses (slower, needs more RAM)
MODEL_NAME = "llama3.1:70b-instruct-q4_K_M"
```

EDITH uses Pinecone with these settings:

- Top K: 3 most relevant chunks
- Similarity Threshold: 0.7
- Max Context: 2000 characters
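Those three settings compose naturally into one filtering step on the matches a vector query returns. A minimal sketch (not EDITH's `vector_store.py`; it assumes matches arrive sorted by descending score, as Pinecone returns them):

```python
def select_context(matches, top_k=3, threshold=0.7, max_chars=2000):
    """Keep the top-k matches above the similarity threshold,
    then trim the combined context to the character budget."""
    kept = [m for m in matches if m["score"] >= threshold][:top_k]
    context = "\n\n".join(m["text"] for m in kept)
    return context[:max_chars]

matches = [
    {"score": 0.92, "text": "chunk A"},
    {"score": 0.81, "text": "chunk B"},
    {"score": 0.75, "text": "chunk C"},
    {"score": 0.72, "text": "chunk D"},  # dropped: beyond top_k
    {"score": 0.40, "text": "chunk E"},  # dropped: below threshold
]
context = select_context(matches)
```

The threshold filters out weak matches before the top-k cap, so a query with no relevant notes yields an empty context rather than three bad chunks.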
- Persistent storage in browser localStorage
- Numbered conversations (1, 2, 3...)
- Auto-save after every message
- Landing page shows on startup
- Automatic classification of user intent
- Knowledge queries → RAG mode (searches notes)
- Casual queries → Conversational mode (direct chat)
- Hybrid queries → RAG with conversational tone
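To illustrate the idea of intent routing, here is a deliberately crude keyword heuristic. EDITH's real classifier lives in `query_classifier.py` and may work very differently; the cue words below are assumptions for the sketch:

```python
# Question-like openers that usually signal a knowledge query (illustrative list)
KNOWLEDGE_CUES = ("what", "why", "how", "explain", "define", "describe")

def classify_query(query):
    """Crude heuristic: question-like phrasing suggests a knowledge query;
    everything else is treated as casual chat."""
    q = query.lower().strip()
    if q.endswith("?") and any(q.startswith(cue) for cue in KNOWLEDGE_CUES):
        return "knowledge"
    if any(word in q for word in ("explain", "define", "summarize")):
        return "knowledge"
    return "casual"
```

So "What is polymorphism in OOP?" routes to RAG mode, while "Hey, how's it going?" stays conversational.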
- Sends last 6 messages (3 exchanges) with each query
- References previous conversation naturally
- Maintains conversation flow across messages
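Selecting that six-message window is a one-line slice; the formatting into prompt lines is equally small. A sketch under the assumption that messages are stored as role/content dicts (the actual storage format in EDITH may differ):

```python
def context_window(history, max_exchanges=3):
    """Format the most recent exchanges (user + assistant pairs) as prompt lines."""
    recent = history[-(max_exchanges * 2):]  # 3 exchanges = 6 messages
    return [f"{m['role']}: {m['content']}" for m in recent]

# Eight alternating messages; only the last six survive the window
history = [
    {"role": "You" if i % 2 == 0 else "EDITH", "content": f"msg {i}"}
    for i in range(8)
]
window = context_window(history)
```

Capping the window keeps prompts small enough for fast local inference while still letting follow-ups like "explain that in more detail" resolve "that".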
Contributions are welcome! Areas for improvement:
- Export conversations to PDF/text
- Search within conversations
- Custom system prompts per conversation
- Markdown rendering in responses
- Code syntax highlighting
- Voice input/output
See EDITH in action - uploading documents, managing conversations, and answering questions:
Click the link above to download and watch the demo
MIT License - See LICENSE for details
- LLaMA 3.1 by Meta AI
- Ollama for easy local LLM deployment
- Pinecone for vector database
- Sentence Transformers for embeddings
Made for students who want to study smarter (and also in the hopes of possibly getting an internship)



