noturbob/doctype.io


📚 Doctype.io

AI-Powered Document Intelligence Platform

FastAPI React TypeScript Google AI

Transform your documents into conversations. Upload PDFs and get instant, accurate answers powered by advanced RAG technology.

Features • Quick Start • Demo • API Docs • Contributing


✨ Features

🎯 Core Capabilities

  • πŸ“„ Smart PDF Processing - Upload and parse documents instantly
  • πŸ€– AI-Powered Q&A - Natural language queries with context-aware responses
  • 🧠 RAG Architecture - Retrieval-Augmented Generation for accurate answers
  • πŸ’Ύ Vector Storage - Efficient document embeddings with Upstash
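To make the RAG idea above concrete, here is a minimal, dependency-free sketch of the retrieve-then-prompt step. It is not the project's implementation: the README says the real pipeline uses LangChain, Google AI embeddings, and Upstash Vector, whereas this sketch substitutes a toy bag-of-words "embedding" and an in-memory list. The names `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy bag-of-words vector. Assumption: stands in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: List[str]) -> str:
    """Assemble the text that would be sent to the chat model."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"
```

In a real deployment the in-memory list would be replaced by a vector database query and `build_prompt`'s output would be passed to the chat model.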

🔧 Technical Features

  • 🔐 Secure Authentication - Powered by Clerk
  • ⚡ Real-time Processing - Fast document ingestion and retrieval
  • 📊 Interactive API Docs - Built-in Swagger UI
  • 🎨 Modern UI - Smooth animations with Framer Motion

🏗️ Architecture

graph LR
    A[User] --> B[React Frontend]
    B --> C[FastAPI Backend]
    C --> D[LangChain RAG]
    D --> E[Google Gemini]
    D --> F[Upstash Vector DB]
    C --> G[Clerk Auth]
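On the ingest path in the diagram (FastAPI Backend to LangChain RAG to Upstash Vector DB), documents are usually split into overlapping chunks before they are embedded. The README does not say how Doctype.io chunks text, so the following is only a plain-Python sketch of the general technique; `chunk_text` and its default `size` and `overlap` values are assumptions, and the project may well rely on one of LangChain's text splitters instead.

```python
from typing import List


def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into fixed-size character windows that overlap.

    Overlap keeps sentences that straddle a boundary visible in both chunks.
    The defaults are illustrative, not values taken from the project.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")

    step = size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        # Stop once the current window already reaches the end of the text.
        if start + size >= len(text):
            break
        start += step
    return chunks
```

Each returned chunk would then be embedded and stored in the vector database alongside metadata such as the source file name.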

🛠️ Tech Stack

Backend Technologies

| Technology | Purpose |
| --- | --- |
| FastAPI | High-performance API framework |
| LangChain | RAG orchestration & chains |
| Google AI | Embeddings & chat completions |
| Upstash | Serverless vector database |
| PyPDF | PDF parsing & extraction |

Frontend Technologies

| Technology | Purpose |
| --- | --- |
| React | UI framework |
| TypeScript | Type-safe development |
| Tailwind | Utility-first CSS |
| Framer Motion | Animation library |
| Clerk | Authentication & user management |

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • Node.js 16+
  • npm or yarn

⚙️ Backend Setup

# Navigate to backend directory
cd backend

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure environment variables
cp .env.example .env
# Edit .env with your API keys (see Environment Variables section)

# Start the server
uvicorn app.main:app --reload

🌐 Backend runs on: http://127.0.0.1:8000

🎨 Frontend Setup

# Navigate to frontend directory
cd frontend

# Install dependencies
npm install

# Configure environment variables
cp .env.example .env
# Edit .env with your Clerk key

# Start development server
npm start

🌐 Frontend runs on: http://localhost:3000


🔑 Environment Variables

Backend Configuration (.env)
# Google AI
GOOGLE_API_KEY=your_google_api_key_here

# Upstash Vector Database
UPSTASH_VECTOR_REST_URL=your_upstash_url_here
UPSTASH_VECTOR_REST_TOKEN=your_upstash_token_here

# Clerk Authentication
CLERK_SECRET_KEY=your_clerk_secret_key_here

# CORS
FRONTEND_URL=http://localhost:3000
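
A backend typically validates these variables at startup so that a missing key fails fast instead of surfacing as an error on the first request. The sketch below shows one way to do that with the standard library only. `load_settings` and `REQUIRED_VARS` are hypothetical names, not taken from the project, and the code reads the process environment; loading the `.env` file itself would need a loader such as python-dotenv.

```python
import os
from typing import Dict, Mapping

# Variable names copied from the backend .env listing above.
REQUIRED_VARS = (
    "GOOGLE_API_KEY",
    "UPSTASH_VECTOR_REST_URL",
    "UPSTASH_VECTOR_REST_TOKEN",
    "CLERK_SECRET_KEY",
)


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect required settings, raising if any are missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    settings = {name: env[name] for name in REQUIRED_VARS}
    # Assumption: fall back to the local dev frontend when FRONTEND_URL is unset.
    settings["FRONTEND_URL"] = env.get("FRONTEND_URL", "http://localhost:3000")
    return settings
```

Passing a plain dictionary as `env` makes the function easy to exercise in tests without touching the real environment.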

🔗 Get Your API Keys:

Frontend Configuration (.env)
# API Configuration
REACT_APP_API_URL=http://127.0.0.1:8000

# Clerk Authentication
REACT_APP_CLERK_PUBLISHABLE_KEY=your_clerk_publishable_key_here

🎯 Usage

  1. πŸ” Sign In - Authenticate using Clerk
  2. πŸ“€ Upload PDF - Drop your document or click to upload
  3. πŸ’¬ Ask Questions - Type your questions in natural language
  4. ✨ Get Answers - Receive AI-powered responses with context

📡 API Documentation

Interactive API documentation is automatically generated and available at:

🔗 Swagger UI: http://127.0.0.1:8000/docs

Main Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | / | Health check & API status |
| POST | /ingest | Upload and process PDF documents |
| POST | /chat | Query documents with natural language |

Example Request

curl -X POST "http://127.0.0.1:8000/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is the main topic of this document?",
    "session_id": "user123"
  }'
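
The same request can be issued from Python. This is a sketch using only the standard library; `build_chat_request` and `ask` are hypothetical helper names. The README does not document the response schema or whether `/chat` requires an authorization header, so the helper returns the parsed JSON as-is and sends only the header shown in the curl example.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:8000"  # local backend address from Quick Start


def build_chat_request(question: str, session_id: str) -> urllib.request.Request:
    """Build the POST /chat request with the same JSON body as the curl example."""
    body = json.dumps({"question": question, "session_id": session_id}).encode("utf-8")
    return urllib.request.Request(
        f"{API_URL}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, session_id: str) -> dict:
    """Send the request and return the decoded JSON response unchanged."""
    request = build_chat_request(question, session_id)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Requires the backend to be running locally.
    print(ask("What is the main topic of this document?", "user123"))
```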

⚠️ Rate Limits

Google’s free tier includes the following limits:

Limit Type Value
Daily Requests 1,500
Requests per Minute 15

The system includes built-in rate limiting and automatic retry logic to handle these limits gracefully.
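
The README does not show how the built-in rate limiting and retry logic work, so the following is only an illustrative sketch of one common approach: a sliding-window limiter sized to the 15 requests-per-minute figure above, combined with exponential backoff. `MinuteRateLimiter` and `call_with_retry` are hypothetical names, and the retry count and delays are assumptions.

```python
import time
from collections import deque
from typing import Callable, Deque, TypeVar

T = TypeVar("T")


class MinuteRateLimiter:
    """Sliding-window limiter: at most `limit` calls per `window` seconds."""

    def __init__(self, limit: int = 15, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self.sleep = sleep
        self.calls: Deque[float] = deque()  # timestamps of recent calls

    def acquire(self) -> None:
        """Block until a call is allowed, then record it."""
        while True:
            now = self.clock()
            # Drop timestamps that have left the window.
            while self.calls and now - self.calls[0] >= self.window:
                self.calls.popleft()
            if len(self.calls) < self.limit:
                self.calls.append(now)
                return
            # Wait until the oldest recorded call expires.
            self.sleep(self.window - (now - self.calls[0]))


def call_with_retry(fn: Callable[[], T], limiter: MinuteRateLimiter,
                    retries: int = 3, base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn under the limiter, retrying failures with exponential backoff."""
    for attempt in range(retries + 1):
        limiter.acquire()
        try:
            return fn()
        except Exception:
            # A real client would catch only rate-limit or transient errors.
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("retries must be >= 0")
```

The clock and sleep functions are injectable so the limiter can be tested without real waiting. A per-day counter for the 1,500 daily requests could follow the same pattern with a longer window.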


🗺️ Roadmap

  • Support for multiple document formats (DOCX, TXT, etc.)
  • Multi-document querying
  • Export conversation history
  • Custom AI model selection
  • Advanced search filters
  • Document summarization
  • Mobile app

🀝 Contributing

Contributions are what make the open-source community amazing! Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

Distributed under the MIT License. See LICENSE for more information.


🙏 Acknowledgments

Special thanks to these amazing technologies:


⭐ Star this repo if you find it helpful!

Made with ❤️ by noturbob

Report Bug • Request Feature

About

Doctype.io: A production-ready RAG engine that turns static PDFs into intelligent conversations. Built with FastAPI, Redis, LangChain, and Google Gemini.
