Version: 2.0.0
Author: Spiros Chatzigeorgiou
Production-ready Retrieval-Augmented Generation (RAG) system with hybrid retrieval, Self-RAG agent workflows, cross-encoder reranking, and comprehensive benchmarking.
- Python 3.11+
- Docker & Docker Compose
- 16GB+ RAM recommended
- API keys: Google AI, OpenAI (optional: Voyage AI)
# Clone repository
git clone <repository-url>
cd ReRag
# Create virtual environment
python -m venv venv
source venv/bin/activate # Linux/Mac
# venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
# Configure API keys
cp .env_example .env
# Edit .env and add your API keys:
# GOOGLE_API_KEY=your_key_here
# OPENAI_API_KEY=your_key_here

# Start Qdrant
docker-compose up -d
# Verify it's running
curl http://localhost:6333/healthz
# You can inspect the ingestion results in Qdrant's Web UI at:
http://localhost:6333/dashboard#/collections

# First, download the dataset (see the scripts folder)
# Ingest documents (requires dataset - see Data Ingestion section)
python bin/ingest.py ingest --config pipelines/configs/datasets/stackoverflow_hybrid.yml
# Run agent in interactive mode
python main.py
# Run agent with single query
python main.py --query "What are Python best practices?"
# Run Self-RAG mode (with iterative refinement)
python main.py --mode self-rag --query "Explain how asyncio works"

Ingest documents into the vector database:
# Basic ingestion from config
python bin/ingest.py ingest --config pipelines/configs/datasets/stackoverflow_hybrid.yml
# Test with dry run (no upload)
python bin/ingest.py ingest --config my_config.yml --dry-run --max-docs 100
# Check ingestion status
python bin/ingest.py status
# Cleanup canary collections
python bin/ingest.py cleanup

Configuration File Format (pipelines/configs/datasets/*.yml):

dataset:
  name: "my_dataset"
  adapter: "stackoverflow"  # or full path: "pipelines.adapters.custom.MyAdapter"
  path: "datasets/sosum/data"

embedding:
  strategy: "hybrid"  # or "dense" or "sparse"
  dense:
    provider: "google"
    model: "text-embedding-004"
  sparse:
    provider: "sparse"
    model: "Qdrant/bm25"

qdrant:
  collection: "my_collection"
  host: "localhost"
  port: 6333
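The snippet below is a minimal sketch of sanity-checking such a config with PyYAML before running an ingestion. It assumes the field names shown in the example above; the project's own loading logic lives in config/config_loader.py and the ingestion pipeline.

```python
# Minimal sanity check for an ingestion config (field names assumed from the example above).
import yaml

with open("pipelines/configs/datasets/stackoverflow_hybrid.yml") as f:
    cfg = yaml.safe_load(f)

assert cfg["embedding"]["strategy"] in {"dense", "sparse", "hybrid"}
print(cfg["dataset"]["adapter"], "->", cfg["qdrant"]["collection"])
```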
Test retrieval pipelines before using them in agents:

# Use any retrieval configuration
python bin/retrieval_pipeline.py \
    --config pipelines/configs/retrieval/basic_dense.yml \
    --query "How to handle Python exceptions?" \
    --top-k 5

Run the RAG agent with two available modes:
# Standard RAG mode (single-pass)
python main.py --query "Explain Python decorators"
# Self-RAG mode (iterative refinement with verification)
python main.py --mode self-rag --query "How does asyncio work?"
# Interactive chat
python main.py
# or
python main.py --mode self-rag

Run evaluation experiments:
# Run experiment with output directory
python -m benchmarks.experiment1 --output-dir results/exp1
# Run 2D grid optimization for hybrid search parameters
python -m benchmarks.optimize_2d_grid_alpha_rrfk \
    --scenario-yaml benchmark_scenarios/your_scenario.yml \
    --dataset-path datasets/sosum/data \
    --n-folds 5 \
    --output-dir results/optimization
# Generate ground truth for evaluation
python -m benchmarks.generate_ground_truth \
    --queries-file queries.json \
    --output-file ground_truth.json

See benchmarks/README.md for detailed documentation.
Modular RAG system with three main subsystems:
┌──────────────────────────────────────────────────────────────┐
│                          RAG System                          │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  INGESTION            RETRIEVAL            AGENT             │
│                                                              │
│  Documents            Vector Search        LangGraph         │
│  Chunking             Reranking            Response Gen      │
│  Embedding            Filtering            Verification      │
│      │                    │                    │             │
│      └─────────────────  Qdrant  ──────────────┘             │
│                                                              │
│  BENCHMARKS: Evaluation & Optimization                       │
└──────────────────────────────────────────────────────────────┘
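As a taste of the ingestion side shown above, the sketch below chunks a document with LangChain's recursive text splitter (LangChain is listed under Technologies). The chunk size and overlap are illustrative values, not the project's configured settings.

```python
# Generic chunking sketch; chunk_size/chunk_overlap are illustrative, not the project's defaults.
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text("A long StackOverflow answer about asyncio ... " * 100)
print(len(chunks), "chunks;", chunks[0][:60])
```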
| Component | Purpose | Documentation |
|---|---|---|
| pipelines/ | Data ingestion & processing | README |
| components/ | Retrieval pipeline (filters, rerankers) | README |
| embedding/ | Multi-provider embeddings | README |
| retrievers/ | Dense/sparse/hybrid search | README |
| agent/ | LangGraph workflows (Standard + Self-RAG) | README |
| database/ | Qdrant vector database interface | README |
| benchmarks/ | Evaluation framework | README |
| config/ | Configuration system | - |
# Clone repository
git clone <repository-url>
cd Thesis
# Create virtual environment (Python 3.11+ required)
python -m venv venv
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt

# Create environment file
cp .env_example .env

Edit .env and add your API keys:
# Required
GOOGLE_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
# Optional
VOYAGE_API_KEY=your_key_here

# Start Qdrant using Docker
docker-compose up -d
# Verify it's running
curl http://localhost:6333/healthz
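If you prefer checking from Python, here is a minimal connectivity sketch with the qdrant-client package, assuming the default host and port from docker-compose.yml:

```python
# Minimal connectivity check against the local Qdrant started above.
from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)
print(client.get_collections())  # lists existing collections (empty on a fresh instance)
```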
Thesis/
├── readme.md                        # This file
├── main.py                          # Agent entry point (Standard & Self-RAG modes)
├── config.yml                       # Main configuration file
├── docker-compose.yml               # Qdrant database setup
├── requirements.txt                 # Python dependencies
│
├── agent/                           # LangGraph agent workflows
│   ├── graph_refined.py             # Standard RAG workflow
│   ├── graph_self_rag.py            # Self-RAG workflow (iterative refinement)
│   ├── schema.py                    # State definitions
│   └── nodes/                       # Agent nodes (retriever, generator, grader)
│
├── pipelines/                       # Data ingestion
│   ├── adapters/                    # Dataset adapters (StackOverflow, custom)
│   ├── ingest/                      # Ingestion pipeline core
│   ├── eval/                        # Retrieval evaluator
│   └── configs/                     # Dataset configurations
│       └── datasets/                # Per-dataset configs
│
├── components/                      # Retrieval pipeline components
│   ├── retrieval_pipeline.py        # Pipeline orchestration
│   ├── rerankers.py                 # CrossEncoder, Semantic, ColBERT, MultiStage
│   ├── filters.py                   # Tag, duplicate, relevance filters
│   └── post_processors.py           # Result enhancement & limiting
│
├── retrievers/                      # Core retrieval implementations
│   ├── dense_retriever.py           # Dense/sparse/hybrid retrieval
│   └── base.py                      # Abstract interfaces
│
├── embedding/                       # Embedding providers
│   ├── factory.py                   # Provider factory
│   ├── providers/                   # Google, OpenAI, Voyage, HuggingFace
│   └── base_embedder.py             # Abstract interfaces
│
├── database/                        # Vector database
│   ├── qdrant_controller.py         # Qdrant integration
│   └── base.py                      # Abstract interfaces
│
├── config/                          # Configuration system
│   ├── config_loader.py             # YAML config loader
│   └── llm_factory.py               # LLM provider factory
│
├── benchmarks/                      # Evaluation framework
│   ├── experiment1.py               # Main experiment runner
│   ├── optimize_2d_grid_alpha_rrfk.py  # Grid search optimization
│   ├── llm_as_judge_eval.py         # LLM-based evaluation
│   ├── generate_ground_truth.py     # Ground truth generation
│   ├── benchmarks_runner.py         # Core benchmark runner
│   ├── benchmarks_metrics.py        # Metrics (Recall, Precision, MRR, NDCG)
│   ├── report_generator.py          # Report generation (used by experiments)
│   └── statistical_analyzer.py      # Statistical analysis
│
├── bin/                             # CLI tools
│   ├── ingest.py                    # Ingestion CLI
│   ├── retrieval_pipeline.py        # Retrieval testing CLI
│   ├── qdrant_inspector.py          # Database inspection
│   └── switch_agent_config.py       # Config switcher
│
├── logs/                            # Application logs
│   ├── agent.log                    # Main agent log
│   ├── ingestion.log                # Ingestion log
│   └── utils/logger.py              # Custom logger
│
└── tests/                           # Test suite
    ├── test_self_rag_integration.py # Self-RAG integration tests
    └── [other test files]
Main Config (config.yml):
- System-wide settings
- Loaded by config/config_loader.py
Pipeline Configs (pipelines/configs/):
- datasets/ - Dataset-specific configs (ingestion)
- retrieval/ - Retrieval pipeline configs
Example: Ingestion Config
dataset:
  name: "stackoverflow"
  adapter: "stackoverflow"  # or full path
  path: "datasets/sosum/data"

embedding:
  strategy: "hybrid"  # dense, sparse, or hybrid
  dense:
    provider: "google"
    model: "text-embedding-004"
  sparse:
    provider: "sparse"
    model: "Qdrant/bm25"

qdrant:
  collection: "my_collection"
  host: "localhost"
  port: 6333

| Variable | Description | Required |
|---|---|---|
| GOOGLE_API_KEY | Google AI API key | Yes |
| OPENAI_API_KEY | OpenAI API key | Yes |
| VOYAGE_API_KEY | Voyage AI API key | No |
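A quick way to confirm the required Google key is picked up is to embed a test string with the text-embedding-004 model used in the example configs. This sketch assumes the google-generativeai package is installed; the project's own wrappers live in embedding/providers/.

```python
# Smoke test for GOOGLE_API_KEY using the dense model from the example configs.
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
result = genai.embed_content(model="models/text-embedding-004", content="hello world")
print(len(result["embedding"]))  # 768-dimensional vector
```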
- Create an adapter class:

# pipelines/adapters/my_adapter.py
from typing import List

from pipelines.contracts import BaseAdapter, Document

class MyAdapter(BaseAdapter):
    def load_documents(self) -> List[Document]:
        # Load your data and return it as Document objects
        documents: List[Document] = []
        return documents

- Use it in a config:

dataset:
  adapter: "pipelines.adapters.my_adapter.MyAdapter"
  path: "path/to/data"
Implement in components/rerankers.py or components/advanced_rerankers.py:
from typing import List

from components.rerankers import BaseReranker

class MyReranker(BaseReranker):
    def rerank(self, query: str, results: List[SearchResult]) -> List[SearchResult]:
        # Your reranking logic
        return reranked_results
- Create a node in agent/nodes/:

from agent.schema import AgentState

def my_node(state: AgentState) -> AgentState:
    # Process state
    return state

- Add it to the graph in agent/graph_refined.py or agent/graph_self_rag.py (a wiring sketch follows below)
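A minimal wiring sketch with LangGraph's StateGraph, assuming the my_node function from the step above. The module path agent.nodes.my_node and the one-node layout are illustrative, not the actual structure of graph_refined.py, which wires retriever, generator, and grader nodes.

```python
# Illustrative wiring of a custom node into a LangGraph workflow.
from langgraph.graph import END, StateGraph

from agent.schema import AgentState
from agent.nodes.my_node import my_node  # hypothetical module from the step above

graph = StateGraph(AgentState)
graph.add_node("my_node", my_node)
graph.set_entry_point("my_node")
graph.add_edge("my_node", END)
app = graph.compile()
```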
- Dense Retrieval: Semantic search using embeddings (Google, OpenAI, Voyage, HuggingFace)
- Sparse Retrieval: BM25-style keyword matching (Qdrant/bm25, SPLADE)
- Hybrid Retrieval: Combines dense + sparse with RRF (Reciprocal Rank Fusion)
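Since RRF is the glue of the hybrid mode, here is a minimal sketch of the fusion step in plain Python with hypothetical document ids; the constant k plays the role of the RRF-k parameter tuned by the benchmark grid search.

```python
# Reciprocal Rank Fusion over two ranked result lists (hypothetical ids).
from collections import defaultdict

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: defaultdict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["d3", "d1", "d7"]    # e.g. semantic search results
sparse = ["d1", "d9", "d3"]   # e.g. BM25 results
print(rrf_fuse([dense, sparse]))  # documents found by both lists rank first
```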
- Cross-Encoder: ms-marco-MiniLM-L-6-v2 (default); see the sketch after this list
- Semantic: Sentence transformers for semantic similarity
- ColBERT: Token-level contextual matching
- Multi-Stage: Cascading rerankers for efficiency
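For intuition on the default reranker, a standalone sketch using the sentence-transformers CrossEncoder with the same ms-marco checkpoint; the query and passages are made up, and the project's wrapper in components/rerankers.py additionally handles SearchResult objects and configuration.

```python
# Score (query, passage) pairs with a cross-encoder and sort by score.
from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
query = "How to handle Python exceptions?"
passages = [
    "Wrap the risky call in a try/except block and handle specific exceptions.",
    "Python lists are mutable sequences that support slicing.",
]
scores = model.predict([(query, p) for p in passages])
for score, passage in sorted(zip(scores, passages), reverse=True):
    print(f"{score:.3f}  {passage}")
```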
- Standard RAG: Single-pass retrieval β generation
- Self-RAG: Iterative refinement with hallucination detection and context verification
- Metrics: Recall@K, Precision@K, MRR, NDCG@K (see the sketch after this list)
- Optimization: Grid search for hybrid parameters (alpha, RRF-k)
- LLM-as-Judge: Automated quality evaluation (faithfulness, relevance, helpfulness)
- Statistical Analysis: Cross-validation, significance testing
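To make the ranking metrics concrete, here is a minimal single-query sketch of MRR and NDCG@K with binary relevance; benchmarks/benchmarks_metrics.py is the authoritative implementation, and the ids below are illustrative.

```python
# MRR and NDCG@K for one query with binary relevance judgments.
import math

def mrr(ranked_ids: list[str], relevant: set[str]) -> float:
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked_ids: list[str], relevant: set[str], k: int) -> float:
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, doc_id in enumerate(ranked_ids[:k], start=1)
              if doc_id in relevant)
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, min(k, len(relevant)) + 1))
    return dcg / ideal if ideal else 0.0

print(mrr(["d2", "d5", "d1"], {"d1"}))           # 0.333... (first hit at rank 3)
print(ndcg_at_k(["d2", "d5", "d1"], {"d1"}, 3))  # 0.5
```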
# Self-RAG integration tests
pytest tests/test_self_rag_integration.py -v
# All tests
pytest tests/ -v

See components/LOGGING_GUIDE.md for how to verify rerankers and filters are working correctly via logs.
| Tool | Purpose | Example |
|---|---|---|
| bin/ingest.py | Ingest datasets | python bin/ingest.py ingest --config my_config.yml |
| bin/retrieval_pipeline.py | Test retrieval | python bin/retrieval_pipeline.py --config config.yml --query "test" |
| bin/qdrant_inspector.py | Inspect database | python bin/qdrant_inspector.py list |
| bin/switch_agent_config.py | Switch configs | python bin/switch_agent_config.py |
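If you prefer Python over the inspector CLI, qdrant-client can answer the same basic questions; a minimal sketch, with "my_collection" as a placeholder name:

```python
# Peek at a collection directly; "my_collection" is a placeholder name.
from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)
print(client.count(collection_name="my_collection", exact=True))
points, _ = client.scroll(collection_name="my_collection", limit=3)
print(points)  # first few stored points with their payloads
```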
Minimum:
- Python 3.11+
- 8GB RAM
- 10GB storage
Recommended:
- 16GB+ RAM
- SSD storage
- 4+ CPU cores
- Main README: This file
- Components: components/README.md - Retrieval pipeline components
- Pipelines: pipelines/README.md - Data ingestion system
- Benchmarks: benchmarks/README.md - Evaluation framework
- Agent: agent/README.md - LangGraph workflows
- CLI Reference: CLI_REFERENCE.md - Command-line tools
- Logging Guide: components/LOGGING_GUIDE.md - Verify components work
- LangGraph: Agent workflow orchestration
- Qdrant: Vector database
- LangChain: Document processing
- Sentence Transformers: Embeddings and reranking
- Pydantic: Data validation
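As a small illustration of the kind of validation Pydantic provides here, a sketch with hypothetical field names (not the project's actual models):

```python
# Hypothetical settings model showing Pydantic-style validation of config values.
from pydantic import BaseModel, Field

class QdrantSettings(BaseModel):
    collection: str
    host: str = "localhost"
    port: int = Field(default=6333, ge=1, le=65535)

print(QdrantSettings(collection="my_collection"))
```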
Author: Spiros Chatzigeorgiou
Email: spyrchat@ece.auth.gr
Built for production RAG workflows with hybrid retrieval, advanced reranking, and comprehensive evaluation.