A sophisticated Streamlit application that allows users to upload PDF files and interact with them using natural language queries, powered by binary quantization for efficient vector storage and retrieval.
- Multi-PDF Upload: Upload one or multiple PDF files simultaneously
- Binary Quantization: Compact embedding storage using 1-bit quantization
- Interactive Chat: Natural language conversation with your PDFs
- Response Time Tracking: Real-time performance metrics in milliseconds
- PDF Preview: File details including page count and size
- Vector Database: Milvus-powered semantic search
- Advanced LLM: Groq integration for fast response generation
- Frontend: Streamlit
- Embeddings: OpenAI text-embedding-3-small
- Vector Database: Milvus with HAMMING distance
- LLM: Groq (moonshotai/kimi-k2-instruct)
- PDF Processing: PyPDF2
- Binary Quantization: NumPy-based optimization
boost-rag-with-binary-quantization/
├── streamlit_main.py          # Main Streamlit application
├── embedding.py               # Binary quantization embedding logic
├── retriever_llm_index.py     # Retrieval and LLM integration
├── requirements.txt           # Python dependencies
├── run_app.sh                 # Application launcher script
├── .env.example               # Environment variables template
├── docker-compose.yml         # Docker configuration
└── docs/                      # PDF documents directory
    └── llm.pdf
# Clone or navigate to the project directory
cd boost-rag-with-binary-quantization
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt

Create a .env file based on .env.example:

cp .env.example .env

Edit .env and add your API keys:

OPENAI_API_KEY=your_openai_api_key_here
GROQ_API_KEY=your_groq_api_key_here

Run the launcher script:

./run_app.sh

Or start Streamlit directly:

streamlit run streamlit_main.py

The application will open in your browser at http://localhost:8501
- Use the sidebar to upload one or multiple PDF files
- View file details in the PDF Preview section
- See the number of text chunks extracted from each file
- Click the "Create Embeddings" button in the sidebar
- Wait for the binary quantization process to complete
- The system will create a Milvus vector database with your content
- Use the chat interface in the main area
- Ask questions about your uploaded PDFs
- View response times for each interaction
- Clear chat history when needed
- Text Extraction: PDFs are processed and split into chunks
- Float32 Embeddings: Generated using OpenAI's text-embedding-3-small
- Binary Conversion: Float values > 0 become 1, others become 0
- Byte Packing: Binary vectors are packed into bytes for storage
- Milvus Storage: Stored with HAMMING distance indexing
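The binarization and byte-packing steps can be sketched with NumPy (a minimal illustration; the function name below is hypothetical, not the project's actual API):

```python
import numpy as np

def binarize_embedding(embedding):
    """Quantize a float embedding: values > 0 become 1, others 0,
    then pack the bits into bytes for binary vector storage."""
    vec = np.asarray(embedding, dtype=np.float32)
    bits = (vec > 0).astype(np.uint8)   # keep only the sign information
    return np.packbits(bits).tobytes()  # 8 dimensions per byte

# A 1536-dim embedding (text-embedding-3-small) packs into 192 bytes
packed = binarize_embedding(np.random.randn(1536))
print(len(packed))  # 192
```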
- Storage Efficiency: 32x reduction in storage space
- Query Speed: Faster similarity search with binary operations
- Memory Usage: Significantly reduced RAM requirements
- Scalability: Better performance with large document collections
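The 32x figure follows from replacing a 32-bit float per dimension with a single bit, and similarity under HAMMING distance reduces to XOR plus a popcount (illustrative numbers for text-embedding-3-small):

```python
import numpy as np

dims = 1536                      # text-embedding-3-small dimensionality
float32_bytes = dims * 4         # 6144 bytes per float vector
binary_bytes = dims // 8         # 192 bytes per packed binary vector
print(float32_bytes // binary_bytes)  # 32x storage reduction

# Hamming distance between two packed vectors: XOR, then count set bits
a = np.packbits(np.random.randint(0, 2, dims).astype(np.uint8))
b = np.packbits(np.random.randint(0, 2, dims).astype(np.uint8))
hamming = int(np.unpackbits(a ^ b).sum())
print(hamming)  # number of differing bits, in [0, 1536]
```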
embedding_model = OpenAIEmbedding(model="text-embedding-3-small")

llm = Groq(
    model="moonshotai/kimi-k2-instruct",
    api_key=os.environ.get("GROQ_API_KEY"),
    temperature=0.5,
    max_tokens=1000
)

search_params = {"metric_type": "HAMMING"}
limit = 5  # Number of retrieved documents

Run with Docker Compose:

docker-compose up -d

The application tracks and displays:
- Response Time: LLM generation time in milliseconds
- Embedding Creation: Progress and completion status
- File Processing: Upload and parsing status
- Vector Search: Retrieval performance
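Millisecond response times like these can be captured with a small wrapper around time.perf_counter (an illustrative helper, not code from the project):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed milliseconds) for display in the UI."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms

result, ms = timed(sum, range(1_000_000))
print(f"sum computed in {ms:.2f} ms")
```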
Missing API Keys
- Ensure the .env file exists with valid API keys
- Check OpenAI and Groq API key formats

PDF Processing Errors
- Verify PDF files are not corrupted
- Check file size limitations
- Ensure PDFs are text-extractable (not image-only scans)

Vector Database Issues
- Delete milvus_data.db and recreate the embeddings
- Check disk space availability
- Verify Milvus dependencies

Performance Issues
- Reduce the number of retrieved documents (the limit parameter)
- Use smaller PDF files for testing
- Monitor system memory usage
- streamlit_main.py: Main application with UI components
- Binary quantization functions: Embedding conversion logic
- Vector store management: Milvus collection handling
- Chat interface: Message history and response generation
- extract_text_from_pdf(): PDF text extraction
- create_binary_embeddings(): Embedding quantization
- setup_vector_store(): Milvus database setup
- retrieve_context(): Semantic search
- generate_response(): LLM interaction
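These functions compose into one query path: retrieve relevant chunks, then hand them to the LLM. A toy end-to-end sketch (the function bodies here are stand-ins; only the names retrieve_context and generate_response come from the project):

```python
def retrieve_context(question, chunks):
    # Stand-in retrieval: pick the chunk sharing the most words with the
    # question (the real app does Hamming-distance search in Milvus).
    words = set(question.lower().split())
    return max(chunks, key=lambda chunk: len(words & set(chunk.lower().split())))

def generate_response(question, context):
    # Stand-in for the Groq LLM call: just echo the grounding context.
    return f"Based on the documents: {context}"

chunks = ["Binary quantization packs sign bits into bytes",
          "Streamlit renders the chat interface"]
question = "How does quantization work?"
answer = generate_response(question, retrieve_context(question, chunks))
print(answer)
```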
OpenAI API Key: For text embeddings
- Get from: https://platform.openai.com/api-keys
- Used for: the text-embedding-3-small model

Groq API Key: For LLM inference
- Get from: https://console.groq.com/keys
- Used for: the moonshotai/kimi-k2-instruct model
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
This project is open source. Please check the license file for details.
For issues and questions:
- Check the troubleshooting section
- Review the code documentation
- Open an issue on the repository
Happy chatting with your PDFs!