Drexter-07/DocumindRAG-backend

DocuMind RAG (DocuMindRAG)

DocuMindRAG is a multi-tenant (org-scoped) PDF RAG backend with a built-in HITL (Human-in-the-Loop) correction layer.

It lets users upload PDFs, ask questions against their organization’s document corpus, and (for privileged roles) submit “corrections” as patches that override incorrect chunks so the system self-heals over time.

This repository contains a FastAPI backend + Postgres (pgvector) persistence + LangChain/LangGraph RAG pipeline.


What the product does

PDF → searchable knowledge base

  • Upload PDFs to an organization workspace.
  • PDFs are split into text chunks and stored in Postgres alongside vector embeddings (pgvector).

Chat / Query (RAG)

  • Users ask questions via a chat endpoint.
  • The RAG pipeline:
    • contextualizes the user question using chat history
    • retrieves relevant context from the org’s vector store
    • generates an answer with an LLM
  • Responses include source objects (chunks and/or patches) with IDs for traceability.

HITL “self-healing” corrections (patches)

Privileged users (Admin/Senior) can create patches which act as corrected versions of content:

  • Chunk-specific patch: correct a specific PDF chunk
  • Patch-of-a-patch: create a new patch that supersedes a previous patch (old patch is deactivated)
  • Org-global patch: add general corrections not attached to a single chunk (still retrieved by similarity)

During retrieval:

  • Active patches are retrieved by similarity for the org.
  • Document chunks that already have an active patch are excluded from chunk retrieval.
    • This is the core “self-healing” mechanism: the system prefers corrected content and avoids returning the known-bad chunk.

Current implementation notes (important)

Authentication is a dev stub (header-based)

There is no real auth (JWT/OAuth/password login) implemented yet. Instead, the API identifies the “current user” via:

  • X-Test-Email: <email>

On startup, the app seeds a default organization and 3 test users (if they don’t already exist):

  • admin@documind.com (role admin)
  • senior@documind.com (role senior)
  • junior@documind.com (role viewer)

Organization self-signup / org creation API

The data model supports organizations (organizations table), but this repo does not currently expose endpoints for users to create orgs or invite members. The org is created in startup seeding and users are assigned via the DB.
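The header-based stub can be pictured as a plain dictionary lookup keyed on the header value. This is an illustrative sketch only: the function name, the in-memory user table, and the plain ValueError stand in for the repo's actual deps.py code, which raises FastAPI HTTPExceptions.

```python
# Illustrative sketch of the header-based dev auth; names are
# hypothetical, not copied from app/api/deps.py.

SEEDED_USERS = {
    "admin@documind.com": {"role": "admin", "org_id": 1},
    "senior@documind.com": {"role": "senior", "org_id": 1},
    "junior@documind.com": {"role": "viewer", "org_id": 1},
}

def get_current_user(headers: dict) -> dict:
    """Resolve the 'current user' from the X-Test-Email header."""
    email = headers.get("X-Test-Email")
    user = SEEDED_USERS.get(email)
    if user is None:
        # The real dependency raises HTTPException(401) here.
        raise ValueError("401: unknown or missing X-Test-Email")
    return {"email": email, **user}
```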


Role-Based Access Control (RBAC)

Roles are defined as:

  • admin
  • senior
  • viewer (used for “junior” in the seeded test accounts)

Enforced permissions:

  • Any role (viewer/senior/admin):
    • upload PDFs
    • list org documents
    • chat/query (RAG)
    • view own chat sessions and messages
  • Admin only:
    • delete documents
  • Admin or Senior:
    • create patches (HITL corrections)
    • list patches
    • activate/deactivate patches (rollback)

All document/chunk/patch queries are org-scoped by org_id.
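The permission matrix above boils down to checking a user's role against an allowed set per endpoint. A minimal sketch, assuming the real guards in app/api/deps.py raise HTTPException(403) where this uses PermissionError:

```python
# Hypothetical RBAC guard; the role sets mirror the matrix above.

ADMIN_ONLY = {"admin"}
PATCH_ROLES = {"admin", "senior"}
ANY_ROLE = {"admin", "senior", "viewer"}

def require_role(user_role: str, allowed: set) -> None:
    """Raise if the user's role is not in the endpoint's allowed set."""
    if user_role not in allowed:
        raise PermissionError(
            f"403: role '{user_role}' not in {sorted(allowed)}"
        )

# e.g. delete-document is admin-only, patch creation is admin-or-senior:
require_role("admin", ADMIN_ONLY)
require_role("senior", PATCH_ROLES)
```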


Architecture (high-level)

Components

  • FastAPI: HTTP API layer (app/main.py, app/api/routes/*)
  • SQLAlchemy: ORM and sessions (app/db/*)
  • Postgres + pgvector: storage + ANN-style similarity ordering
  • LangChain + LangGraph:
    • question contextualization
    • retrieval
    • answer generation

Data flow

  1. Upload
  • POST /api/v1/upload
  • PDF is loaded (PyPDFLoader) → split into chunks → each chunk embedded → stored as document_chunks
  2. Ask a question (chat)
  • POST /api/v1/chat
  • LangGraph flow:
    • contextualize question (uses last 10 messages from the chat session)
    • retrieve top-k patches + top-k chunks (org-scoped)
    • generate answer with LLM (gpt-4o as currently coded)
  3. Correct the knowledge (HITL)
  • POST /api/v1/patches (admin/senior)
  • Store corrected text as a chunk_patches row with embedding
  • Retrieval will now:
    • return patch content (preferred)
    • exclude the patched chunk from chunk retrieval results
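The "split into chunks" step in the upload flow can be illustrated with a naive fixed-size splitter with overlap. The repo actually uses a LangChain text splitter; the size and overlap values here are arbitrary placeholders, not the configured ones.

```python
def split_into_chunks(text: str, size: int = 40, overlap: int = 10) -> list:
    """Naive fixed-size splitter with overlap, illustrating the ingestion
    step. The repo uses LangChain's splitter; parameters here are made up."""
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + size]
        if chunk:
            chunks.append(chunk)
        if start + size >= len(text):
            break  # last window already covered the tail
    return chunks
```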

“Self-healing” retrieval logic (core HITL behavior)

At retrieval time:

  • Patch retrieval:
    • selects chunk_patches where org_id == <org> and is_active == true
    • orders by cosine distance to query embedding
  • Chunk retrieval:
    • selects document_chunks where org_id == <org>
    • excludes chunks that have any active patch (exists subquery)
    • orders by cosine distance to query embedding

The final retrieved context is a concatenation of:

  • top-k patches (if any)
  • top-k unpatched chunks
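The patch-over-chunk preference can be simulated in plain Python. The real queries are SQLAlchemy over pgvector; this sketch hand-computes cosine distance and uses invented record shapes to show the two rules (active patches ranked by similarity, patched chunks excluded):

```python
import math

def cosine_distance(a, b):
    """pgvector-style cosine distance: 1 - cos(a, b)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def retrieve(query_emb, chunks, patches, k=2):
    """chunks: [{id, emb}]; patches: [{id, chunk_id, emb, active}],
    where chunk_id is None for org-global patches."""
    active = [p for p in patches if p["active"]]
    patched_ids = {p["chunk_id"] for p in active if p["chunk_id"] is not None}
    top_patches = sorted(
        active, key=lambda p: cosine_distance(query_emb, p["emb"])
    )[:k]
    # Exclude known-bad chunks that already have an active patch.
    candidates = [c for c in chunks if c["id"] not in patched_ids]
    top_chunks = sorted(
        candidates, key=lambda c: cosine_distance(query_emb, c["emb"])
    )[:k]
    return top_patches, top_chunks
```

Deactivating a patch (rollback) removes it from `active`, so the original chunk becomes retrievable again with no other state change.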

Repository structure

app/

  • main.py: FastAPI app, router registration, DB table creation, and seed users/org
  • api/
    • deps.py: header-based auth + RBAC guards
    • schemas.py: Pydantic request/response models (chat + patches)
    • routes/
      • docs.py: PDF upload endpoint
      • documents.py: list/delete documents
      • chat.py: chat sessions + messages + RAG chat endpoint
      • admin.py: patch/HITL endpoints
  • core/
    • config.py: settings + env var loading
    • security.py: currently empty placeholder
  • db/
    • models.py: SQLAlchemy models (orgs/users/docs/chunks/patches/chats/messages)
    • session.py: engine + session factory
  • rag/
    • ingestion.py: PDF loading + chunking + embedding + DB persistence
    • retrieval.py: patch + chunk retrieval (pgvector cosine ordering)
    • chain.py: LangGraph state machine used by the chat endpoint

Database schema (tables)

Main tables:

  • organizations
  • users (belongs to an org)
  • documents (belongs to an org)
  • document_chunks (belongs to an org; stores Vector(1536) embeddings)
  • chunk_patches (belongs to an org; stores Vector(1536) embeddings; supports rollback via is_active)
  • chat_sessions (belongs to a user)
  • messages (belongs to a chat session)

Notes:

  • Embedding columns are Vector(1536). Ensure your embedding model outputs 1536-d vectors.
  • Similarity ordering uses pgvector cosine distance.
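A dimension mismatch only surfaces as a Postgres error at insert time, so it can help to fail fast in application code. A minimal guard sketch (illustrative, not repo code):

```python
EXPECTED_DIM = 1536  # must match the Vector(1536) columns

def validate_embedding(vec: list) -> list:
    """Reject embeddings whose dimensionality doesn't match the schema,
    so a swapped embedding model fails fast instead of inside Postgres."""
    if len(vec) != EXPECTED_DIM:
        raise ValueError(
            f"embedding has {len(vec)} dims, expected {EXPECTED_DIM}"
        )
    return vec
```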

Configuration

Settings are defined in app/core/config.py and can be provided via environment variables or a .env file.

Required (for real usage)

  • OPENAI_API_KEY

Database

  • POSTGRES_USER
  • POSTGRES_PASSWORD
  • POSTGRES_SERVER
  • POSTGRES_PORT
  • POSTGRES_DB

Example .env

OPENAI_API_KEY=your_openai_key_here

POSTGRES_USER=user
POSTGRES_PASSWORD=password
POSTGRES_SERVER=localhost
POSTGRES_PORT=5432
POSTGRES_DB=documind
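These variables are typically assembled into a single SQLAlchemy connection URL. The exact string config.py builds is not reproduced here; the driver suffix below is an assumption, but the shape is:

```python
import os

# Values match the example .env above; the psycopg2 driver name
# is an assumption, not confirmed from config.py.
os.environ.update({
    "POSTGRES_USER": "user",
    "POSTGRES_PASSWORD": "password",
    "POSTGRES_SERVER": "localhost",
    "POSTGRES_PORT": "5432",
    "POSTGRES_DB": "documind",
})

def database_url() -> str:
    """Assemble a SQLAlchemy DSN from the POSTGRES_* env vars."""
    e = os.environ
    return (
        f"postgresql+psycopg2://{e['POSTGRES_USER']}:{e['POSTGRES_PASSWORD']}"
        f"@{e['POSTGRES_SERVER']}:{e['POSTGRES_PORT']}/{e['POSTGRES_DB']}"
    )
```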

Running locally (recommended for development)

Prerequisites

  • Python 3.11+
  • Postgres with pgvector enabled
  • An OpenAI API key

1) Create and activate a virtual environment

python -m venv venv
# Windows PowerShell:
.\venv\Scripts\Activate.ps1

2) Install dependencies

pip install -r requirements.txt

3) Start Postgres with pgvector

You need a Postgres instance with the pgvector extension available.

One simple approach is a pgvector-enabled Postgres container:

docker run --name documind-postgres -d \
  -e POSTGRES_USER=user \
  -e POSTGRES_PASSWORD=password \
  -e POSTGRES_DB=documind \
  -p 5432:5432 \
  ankane/pgvector

Windows PowerShell variant:

docker run --name documind-postgres -d `
  -e POSTGRES_USER=user `
  -e POSTGRES_PASSWORD=password `
  -e POSTGRES_DB=documind `
  -p 5432:5432 `
  ankane/pgvector

Then, enable the extension (run once):

CREATE EXTENSION IF NOT EXISTS vector;

4) Run the API

uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Open:

  • API root: http://localhost:8000/
  • Health: http://localhost:8000/health

Running with Docker (API container only)

This repo includes a Dockerfile for the API, but does not include a docker-compose.yml. You must provide Postgres separately.

Build

docker build -t documind-rag .

Run

docker run --rm -p 8000:8000 \
  -e OPENAI_API_KEY=your_openai_key_here \
  -e POSTGRES_USER=user \
  -e POSTGRES_PASSWORD=password \
  -e POSTGRES_SERVER=host.docker.internal \
  -e POSTGRES_PORT=5432 \
  -e POSTGRES_DB=documind \
  documind-rag

Windows PowerShell variant:

docker run --rm -p 8000:8000 `
  -e OPENAI_API_KEY=your_openai_key_here `
  -e POSTGRES_USER=user `
  -e POSTGRES_PASSWORD=password `
  -e POSTGRES_SERVER=host.docker.internal `
  -e POSTGRES_PORT=5432 `
  -e POSTGRES_DB=documind `
  documind-rag

If Postgres is running in a container, set POSTGRES_SERVER to that container’s network name on a shared Docker network.


API reference (implemented endpoints)

All routes are mounted under /api/v1.

Auth header (required for almost everything)

Add:

  • X-Test-Email: admin@documind.com (or the seeded senior/viewer emails)

Windows note:

  • In PowerShell, curl may be an alias for Invoke-WebRequest. Use curl.exe explicitly, or use Invoke-RestMethod / Invoke-WebRequest.

Upload a PDF

  • POST /api/v1/upload
  • Form field: file (must end with .pdf)

Example:

curl -X POST "http://localhost:8000/api/v1/upload" \
  -H "X-Test-Email: admin@documind.com" \
  -F "file=@./some.pdf"

List documents (org-scoped)

  • GET /api/v1/documents

curl "http://localhost:8000/api/v1/documents" \
  -H "X-Test-Email: admin@documind.com"

Delete a document (admin-only)

  • DELETE /api/v1/documents/{document_id}

Chat (RAG query)

  • POST /api/v1/chat

Body:

  • message: string
  • chat_id: optional int (continue an existing chat)

curl -X POST "http://localhost:8000/api/v1/chat" \
  -H "Content-Type: application/json" \
  -H "X-Test-Email: junior@documind.com" \
  -d "{\"message\":\"What does this PDF say about pricing?\"}"

Response includes:

  • response: the answer
  • sources: list of context items (patches and/or chunks) with IDs
  • source_summary: short UX-oriented string like “From Patch #17”
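The summary string can be derived from the sources list roughly as below; the format is guessed from the "From Patch #17" example and the `type`/`id` field names are assumptions, not taken from the repo's schemas:

```python
def source_summary(sources: list) -> str:
    """Build a short UX string ('From Patch #17, Chunk #42') from the
    retrieved sources. Field names here are hypothetical."""
    labels = []
    for s in sources:
        kind = "Patch" if s["type"] == "patch" else "Chunk"
        labels.append(f"{kind} #{s['id']}")
    return "From " + ", ".join(labels) if labels else "No sources"
```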

List my chats

  • GET /api/v1/chats

Get messages for a chat

  • GET /api/v1/chats/{chat_id}/messages

Create a patch (admin/senior only)

  • POST /api/v1/patches

Rules (as implemented):

  • Provide either original_chunk_id or patch_id (not both).
  • If patch_id is provided, the referenced patch is deactivated and the new patch applies to the same original chunk.
  • If neither is provided, an org-global patch is created.

curl -X POST "http://localhost:8000/api/v1/patches" \
  -H "Content-Type: application/json" \
  -H "X-Test-Email: senior@documind.com" \
  -d "{\"content\":\"Corrected value is 12.5%, not 15%.\", \"original_chunk_id\": 42}"
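The mutually-exclusive target rule can be sketched as a validation helper. This is illustrative; the real check lives in the patches route and would raise an HTTP 4xx rather than ValueError:

```python
def resolve_patch_target(original_chunk_id=None, patch_id=None) -> str:
    """Enforce 'either original_chunk_id or patch_id, not both' and
    report which kind of patch will be created."""
    if original_chunk_id is not None and patch_id is not None:
        raise ValueError("provide either original_chunk_id or patch_id, not both")
    if patch_id is not None:
        return "supersede-patch"   # the referenced patch is deactivated
    if original_chunk_id is not None:
        return "chunk-patch"       # corrects one specific PDF chunk
    return "org-global-patch"      # retrieved by similarity only
```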

List patches (admin/senior only)

  • GET /api/v1/patches?chunk_id=<optional>&active_only=<true|false>

Deactivate (rollback) a patch (admin/senior only)

  • PATCH /api/v1/patches/{patch_id}/deactivate

Reactivate a patch (admin/senior only)

  • PATCH /api/v1/patches/{patch_id}/activate

How HITL integrates with PDF RAG (technical summary)

  • PDFs are ingested into document_chunks with embeddings.
  • Humans submit corrections as chunk_patches with embeddings.
  • Retrieval combines:
    • similarity search over active patches (preferred, corrected truth)
    • similarity search over chunks excluding “patched” chunks (avoid known-bad)
  • The chat endpoint returns sources including patch IDs and chunk IDs, enabling UI patterns like:
    • “show me the evidence”
    • “correct this chunk” → create patch for that chunk
    • “rollback correction” → deactivate patch

Development / production hardening (recommended next steps)

This repo is an MVP backend. For production:

  • Replace X-Test-Email header auth with real authentication (JWT/OAuth2) and persistent user management.
  • Add org creation/invite flows and enforce org membership on signup.
  • Add migrations (Alembic is listed but not currently used).
  • Add background ingestion (queue) for large PDFs.
  • Add rate limiting, request logging, and secrets management.
  • Add tests around RBAC and retrieval “patch overrides chunk” invariants.
