This project uses a Retrieval-Augmented Generation (RAG) model to answer questions based on a custom corpus created from PDF and DOCX files.
- Clone the repository.
- Install dependencies:
  pip install -r requirements.txt
- Run the extraction:
  sh scripts/run_extraction.sh
- Generate answers using the RAG model.
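
To illustrate what the final step involves, here is a minimal sketch of the retrieval half of a RAG pipeline. It is not this project's implementation: the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), the sample chunks, and the bag-of-words similarity are all assumptions chosen so the example runs with the standard library alone. A real setup would likely use learned embeddings and pass the resulting prompt to a generator model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q_vec = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine(q_vec, Counter(tokenize(chunk))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Combine the retrieved chunks and the question into one prompt string."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical corpus chunks standing in for text extracted from PDF/DOCX files.
chunks = [
    "Invoices must be submitted by the fifth day of each month.",
    "The office is closed on public holidays.",
    "Travel expenses require manager approval before booking.",
]

question = "When must invoices be submitted?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

The prompt printed at the end is what would be handed to the generator. Keeping retrieval and prompt construction in separate functions makes it easy to swap the similarity measure without touching the rest.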
- data/: Contains input files.
- corpus/: Stores processed corpus data.
- src/: Source code.
- models/: Trained models.
- notebooks/: Jupyter notebooks for experimentation.
- tests/: Unit tests.
- scripts/: Helper scripts.
- requirements.txt: Dependencies list.