DocuMindRAG is a multi-tenant (org-scoped) PDF RAG backend with a built-in HITL (Human-in-the-Loop) correction layer.
It lets users upload PDFs, ask questions against their organization’s document corpus, and (for privileged roles) submit “corrections” as patches that override incorrect chunks so the system self-heals over time.
This repository contains a FastAPI backend + Postgres (pgvector) persistence + LangChain/LangGraph RAG pipeline.
- Upload PDFs to an organization workspace.
- PDFs are split into text chunks and stored in Postgres alongside vector embeddings (pgvector).
- Users ask questions via a chat endpoint.
- The RAG pipeline:
  - contextualizes the user question using chat history
  - retrieves relevant context from the org’s vector store
  - generates an answer with an LLM
- Responses include source objects (chunks and/or patches) with IDs for traceability.
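The split-and-embed step can be illustrated with a minimal character-window splitter. The repo actually uses LangChain's loader/splitter, so the function name and size defaults below are illustrative assumptions, not the project's real code:

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Naive character-window splitter illustrating the chunking step.

    chunk_size/overlap values are assumed defaults; each resulting chunk
    would then be embedded and stored as a document_chunks row.
    """
    chunks = []
    start = 0
    step = chunk_size - overlap  # overlap preserves context across boundaries
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

In practice a structure-aware splitter (like LangChain's recursive splitter) is preferable, since it tries to break on paragraph and sentence boundaries rather than mid-word.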
Privileged users (Admin/Senior) can create patches which act as corrected versions of content:
- Chunk-specific patch: correct a specific PDF chunk
- Patch-of-a-patch: create a new patch that supersedes a previous patch (old patch is deactivated)
- Org-global patch: add general corrections not attached to a single chunk (still retrieved by similarity)
During retrieval:
- Active patches are retrieved by similarity for the org.
- Document chunks that already have an active patch are excluded from chunk retrieval.
- This is the core “self-healing” mechanism: the system prefers corrected content and avoids returning the known-bad chunk.
There is no real auth (JWT/OAuth/password login) implemented yet. Instead, the API identifies the “current user” via:
X-Test-Email: <email>
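A minimal sketch of how this header can be resolved to a role (hypothetical names and an in-memory stand-in for the DB; the real logic is a FastAPI dependency in `app/api/deps.py` backed by the `users` table):

```python
# Stand-in for the seeded users table; illustrative only.
SEEDED_USERS = {
    "admin@documind.com": "admin",
    "senior@documind.com": "senior",
    "junior@documind.com": "viewer",
}

def resolve_user(headers: dict) -> str:
    """Map the X-Test-Email header to a role, as the deps layer does."""
    email = headers.get("X-Test-Email")
    if email not in SEEDED_USERS:
        raise PermissionError("Unknown test user")
    return SEEDED_USERS[email]
```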
On startup, the app seeds a default organization and 3 test users (if they don’t already exist):
- `admin@documind.com` (role `admin`)
- `senior@documind.com` (role `senior`)
- `junior@documind.com` (role `viewer`)
The data model supports organizations (organizations table), but this repo does not currently expose endpoints for users to create orgs or invite members. The org is created in startup seeding and users are assigned via the DB.
Roles are defined as:
- `admin`
- `senior`
- `viewer` (used for “junior” in the seeded test accounts)
Enforced permissions:
- Any role (viewer/senior/admin):
  - upload PDFs
  - list org documents
  - chat/query (RAG)
  - view own chat sessions and messages
- Admin only:
  - delete documents
- Admin or Senior:
  - create patches (HITL corrections)
  - list patches
  - activate/deactivate patches (rollback)
All document/chunk/patch queries are org-scoped by org_id.
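The role checks above can be pictured as a simple permission table (hypothetical helper; the actual guards are FastAPI dependencies in `app/api/deps.py`):

```python
# Illustrative RBAC table; action names are assumptions, not the repo's.
PERMISSIONS = {
    "upload_pdf": {"viewer", "senior", "admin"},
    "list_documents": {"viewer", "senior", "admin"},
    "chat": {"viewer", "senior", "admin"},
    "delete_document": {"admin"},
    "create_patch": {"senior", "admin"},
    "toggle_patch": {"senior", "admin"},
}

def require(role: str, action: str) -> None:
    """Raise if the role is not allowed to perform the action."""
    if role not in PERMISSIONS.get(action, set()):
        raise PermissionError(f"role {role!r} may not {action}")
```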
- FastAPI: HTTP API layer (`app/main.py`, `app/api/routes/*`)
- SQLAlchemy: ORM and sessions (`app/db/*`)
- Postgres + pgvector: storage + ANN-style similarity ordering
- LangChain + LangGraph:
  - question contextualization
  - retrieval
  - answer generation
- Upload: `POST /api/v1/upload`
  - PDF is loaded (PyPDFLoader) → split into chunks → each chunk embedded → stored as `document_chunks`
- Ask a question (chat): `POST /api/v1/chat`
  - LangGraph flow:
    - contextualize the question (uses the last 10 messages from the chat session)
    - retrieve top-k patches + top-k chunks (org-scoped)
    - generate an answer with the LLM (`gpt-4o` as currently coded)
- Correct the knowledge (HITL): `POST /api/v1/patches` (admin/senior)
  - Store corrected text as a `chunk_patches` row with an embedding
  - Retrieval will now:
    - return patch content (preferred)
    - exclude the patched chunk from chunk retrieval results
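The chat flow can be sketched as three plain functions (the repo implements it as a LangGraph state machine in `app/rag/chain.py`; all names here are illustrative):

```python
def contextualize(question: str, history: list[str]) -> str:
    """Stage 1: rewrite the question into a standalone form.

    In the real pipeline an LLM rewrites it using up to the last 10 chat
    messages; this placeholder only tags the question for illustration.
    """
    if not history:
        return question
    return f"(in context of {len(history)} prior messages) {question}"

def answer(question: str, history: list[str], retrieve, generate) -> str:
    """Stages 2-3: retrieve org-scoped context, then generate an answer."""
    standalone = contextualize(question, history)
    context = retrieve(standalone)        # top-k patches + top-k chunks
    return generate(standalone, context)  # LLM call (gpt-4o as coded)
```

Passing `retrieve` and `generate` as callables mirrors how the graph nodes are composable and testable in isolation.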
At retrieval time:
- Patch retrieval:
  - selects `chunk_patches` where `org_id == <org>` and `is_active == true`
  - orders by cosine distance to the query embedding
- Chunk retrieval:
  - selects `document_chunks` where `org_id == <org>`
  - excludes chunks that have any active patch (`exists` subquery)
  - orders by cosine distance to the query embedding
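The chunk-exclusion rule can be illustrated in pure Python (in the repo this is an `exists` subquery with pgvector cosine ordering; the tuple shapes here are only a sketch of the logic):

```python
def retrieve_chunks(chunks, patches, k=4):
    """Return the k nearest chunks that do NOT have an active patch.

    chunks:  list of (chunk_id, distance_to_query)
    patches: list of (patch_id, chunk_id, is_active)
    """
    patched = {chunk_id for _, chunk_id, active in patches if active}
    eligible = [c for c in chunks if c[0] not in patched]
    # lower cosine distance = more similar, so sort ascending
    return sorted(eligible, key=lambda c: c[1])[:k]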
The final retrieved context is a concatenation of:
- top-k patches (if any)
- top-k unpatched chunks
- `app/main.py`: FastAPI app, router registration, DB table creation, and seed users/org
- `app/api/deps.py`: header-based auth + RBAC guards
- `app/api/schemas.py`: Pydantic request/response models (chat + patches)
- `app/api/routes/docs.py`: PDF upload endpoint
- `app/api/routes/documents.py`: list/delete documents
- `app/api/routes/chat.py`: chat sessions + messages + RAG chat endpoint
- `app/api/routes/admin.py`: patch/HITL endpoints
- `app/core/config.py`: settings + env var loading
- `app/core/security.py`: currently an empty placeholder
- `app/db/models.py`: SQLAlchemy models (orgs/users/docs/chunks/patches/chats/messages)
- `app/db/session.py`: engine + session factory
- `app/rag/ingestion.py`: PDF loading + chunking + embedding + DB persistence
- `app/rag/retrieval.py`: patch + chunk retrieval (pgvector cosine ordering)
- `app/rag/chain.py`: LangGraph state machine used by the chat endpoint
Main tables:
- `organizations`
- `users` (belongs to an org)
- `documents` (belongs to an org)
- `document_chunks` (belongs to an org; stores `Vector(1536)` embeddings)
- `chunk_patches` (belongs to an org; stores `Vector(1536)` embeddings; supports rollback via `is_active`)
- `chat_sessions` (belongs to a user)
- `messages` (belongs to a chat session)
Notes:
- Embedding columns are `Vector(1536)`. Ensure your embedding model outputs 1536-dimensional vectors.
- Similarity ordering uses pgvector cosine distance.
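For reference, pgvector's cosine distance (the `<=>` operator) is 1 minus cosine similarity, so lower means more similar; a pure-Python equivalent:

```python
import math

def cosine_distance(a, b):
    """Equivalent of pgvector's <=> operator: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)
```

Identical directions give distance 0; orthogonal vectors give 1, which is why retrieval orders ascending by this value.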
Settings are defined in `app/core/config.py` and can be provided via environment variables or a `.env` file.
- `OPENAI_API_KEY`
- `POSTGRES_USER`
- `POSTGRES_PASSWORD`
- `POSTGRES_SERVER`
- `POSTGRES_PORT`
- `POSTGRES_DB`
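These Postgres variables are typically assembled into a SQLAlchemy connection URL; a hypothetical sketch (the exact attribute names and driver string in `app/core/config.py` may differ):

```python
import os

def database_url() -> str:
    """Build a Postgres connection URL from the environment variables above."""
    return (
        f"postgresql://{os.environ['POSTGRES_USER']}:"
        f"{os.environ['POSTGRES_PASSWORD']}@"
        f"{os.environ['POSTGRES_SERVER']}:"
        f"{os.environ['POSTGRES_PORT']}/"
        f"{os.environ['POSTGRES_DB']}"
    )
```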
Example `.env`:

```env
OPENAI_API_KEY=your_openai_key_here
POSTGRES_USER=user
POSTGRES_PASSWORD=password
POSTGRES_SERVER=localhost
POSTGRES_PORT=5432
POSTGRES_DB=documind
```

- Python 3.11+
- Postgres with pgvector enabled
- An OpenAI API key
```bash
python -m venv venv
# Windows PowerShell:
.\venv\Scripts\Activate.ps1

pip install -r requirements.txt
```

You need a Postgres instance with the pgvector extension available.
One simple approach is a pgvector-enabled Postgres container:
```bash
docker run --name documind-postgres -d \
  -e POSTGRES_USER=user \
  -e POSTGRES_PASSWORD=password \
  -e POSTGRES_DB=documind \
  -p 5432:5432 \
  ankane/pgvector
```

Windows PowerShell variant:
```powershell
docker run --name documind-postgres -d `
  -e POSTGRES_USER=user `
  -e POSTGRES_PASSWORD=password `
  -e POSTGRES_DB=documind `
  -p 5432:5432 `
  ankane/pgvector
```

Then, enable the extension (run once):

```sql
CREATE EXTENSION IF NOT EXISTS vector;
```

Run the API:

```bash
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

Open:
- API root: `http://localhost:8000/`
- Health: `http://localhost:8000/health`
This repo includes a Dockerfile for the API, but does not include a docker-compose.yml. You must provide Postgres separately.
```bash
docker build -t documind-rag .

docker run --rm -p 8000:8000 \
  -e OPENAI_API_KEY=your_openai_key_here \
  -e POSTGRES_USER=user \
  -e POSTGRES_PASSWORD=password \
  -e POSTGRES_SERVER=host.docker.internal \
  -e POSTGRES_PORT=5432 \
  -e POSTGRES_DB=documind \
  documind-rag
```

Windows PowerShell variant:
```powershell
docker run --rm -p 8000:8000 `
  -e OPENAI_API_KEY=your_openai_key_here `
  -e POSTGRES_USER=user `
  -e POSTGRES_PASSWORD=password `
  -e POSTGRES_SERVER=host.docker.internal `
  -e POSTGRES_PORT=5432 `
  -e POSTGRES_DB=documind `
  documind-rag
```

If Postgres is running in a container, set `POSTGRES_SERVER` to that container’s network name on a shared Docker network.
All routes are mounted under /api/v1.
Add the header:

`X-Test-Email: admin@documind.com` (or the seeded senior/viewer emails)
Windows note:
- In PowerShell, `curl` may be an alias for `Invoke-WebRequest`. Use `curl.exe` explicitly, or use `Invoke-RestMethod`/`Invoke-WebRequest`.
`POST /api/v1/upload`

- Form field: `file` (must end with `.pdf`)
Example:
```bash
curl -X POST "http://localhost:8000/api/v1/upload" \
  -H "X-Test-Email: admin@documind.com" \
  -F "file=@./some.pdf"
```

`GET /api/v1/documents`

```bash
curl "http://localhost:8000/api/v1/documents" \
  -H "X-Test-Email: admin@documind.com"
```

`DELETE /api/v1/documents/{document_id}`
`POST /api/v1/chat`

Body:

- `message`: string
- `chat_id`: optional int (continue an existing chat)
```bash
curl -X POST "http://localhost:8000/api/v1/chat" \
  -H "Content-Type: application/json" \
  -H "X-Test-Email: junior@documind.com" \
  -d "{\"message\":\"What does this PDF say about pricing?\"}"
```

Response includes:

- `response`: the answer
- `sources`: list of context items (patches and/or chunks) with IDs
- `source_summary`: a short UX-oriented string like “From Patch #17”
`GET /api/v1/chats`

`GET /api/v1/chats/{chat_id}/messages`

`POST /api/v1/patches`
Rules (as implemented):

- Provide either `original_chunk_id` or `patch_id` (not both).
- If `patch_id` is provided, the referenced patch is deactivated and the new patch applies to the same original chunk.
- If neither is provided, an org-global patch is created.
```bash
curl -X POST "http://localhost:8000/api/v1/patches" \
  -H "Content-Type: application/json" \
  -H "X-Test-Email: senior@documind.com" \
  -d "{\"content\":\"Corrected value is 12.5%, not 15%.\", \"original_chunk_id\": 42}"
```

`GET /api/v1/patches?chunk_id=<optional>&active_only=<true|false>`
`PATCH /api/v1/patches/{patch_id}/deactivate`

`PATCH /api/v1/patches/{patch_id}/activate`
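The patch-of-a-patch rule can be sketched with an in-memory model (hypothetical; the repo operates on `chunk_patches` rows via SQLAlchemy):

```python
def supersede(patches: dict, old_patch_id: int, new_patch_id: int, content: str) -> None:
    """Deactivate an old patch and create a replacement for the same chunk.

    patches maps patch_id -> {"content", "original_chunk_id", "is_active"};
    the new patch inherits the old patch's target chunk.
    """
    old = patches[old_patch_id]
    old["is_active"] = False  # rollback remains possible by re-activating
    patches[new_patch_id] = {
        "content": content,
        "original_chunk_id": old["original_chunk_id"],
        "is_active": True,
    }
```

Because deactivation only flips `is_active`, the full correction history stays queryable, which is what makes the activate/deactivate endpoints a cheap rollback mechanism.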
- PDFs are ingested into `document_chunks` with embeddings.
- Humans submit corrections as `chunk_patches` with embeddings.
- Retrieval combines:
  - similarity search over active patches (preferred, corrected truth)
  - similarity search over chunks, excluding “patched” chunks (avoids known-bad content)
- The chat endpoint returns sources including patch IDs and chunk IDs, enabling UI patterns like:
  - “show me the evidence”
  - “correct this chunk” → create a patch for that chunk
  - “rollback correction” → deactivate the patch
This repo is an MVP backend. For production:
- Replace `X-Test-Email` header auth with real authentication (JWT/OAuth2) and persistent user management.
- Add org creation/invite flows and enforce org membership on signup.
- Add migrations (Alembic is listed but not currently used).
- Add background ingestion (queue) for large PDFs.
- Add rate limiting, request logging, and secrets management.
- Add tests around RBAC and retrieval “patch overrides chunk” invariants.