suyashsachdeva/Document_analysis

Document_analysis 📝

A Python-based toolkit for document layout analysis and classification, leveraging deep learning (CNNs/Transformers) to detect and segment document elements such as text, figures, tables, headers, and footnotes.

🚀 Features

  • Detects and segments layout components: titles, text blocks, images, tables, etc.
  • Trains on large-scale datasets like PubLayNet.
  • End-to-end pipeline: Data preparation ➜ Model training ➜ Inference ➜ Evaluation.
  • Supports several detector architectures: Faster R-CNN, Cascade R-CNN, and Transformer-based detectors.
  • Evaluation metrics: mean average precision (mAP) and intersection over union (IoU) per layout component.
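As a point of reference for the IoU metric listed above, here is a minimal sketch of how IoU is usually computed for two axis-aligned boxes. The `(x_min, y_min, x_max, y_max)` box layout and the `box_iou` name are assumptions for illustration; the repository's own evaluation code may differ.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero so disjoint boxes yield zero intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Degenerate (zero-area) boxes are treated as having no overlap.
    return inter / union if union > 0 else 0.0
```

A predicted box typically counts as a correct detection when its IoU with a ground-truth box of the same class exceeds a threshold such as 0.5; mAP then averages precision over classes (and often over several such thresholds).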

🗂️ Table of Contents

  1. Installation
  2. Dataset
  3. Usage
  4. Examples
  5. Project Structure
  6. Configuration
  7. Troubleshooting
  8. License
  9. Contact

Installation

  1. Clone the repo:

    git clone https://github.com/suyashsachdeva/Document_analysis.git
    cd Document_analysis

  2. (Recommended) Create a virtual environment:

    python3 -m venv venv
    source venv/bin/activate

  3. Install dependencies:

    pip install -r requirements.txt

🚀 Usage

1. Data Preparation

Prepare and preprocess data:

python scripts/prepare_data.py \
  --input_dir data/publaynet/images \
  --anno_dir data/publaynet/annotations \
  --output_dir processed_data
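PubLayNet ships its annotations as COCO-style JSON, where each `bbox` is `[x, y, width, height]`. What `prepare_data.py` does internally is not documented here, so the following is only a sketch of one plausible first step: grouping annotations by image. The `group_annotations` helper and the sample file name are hypothetical.

```python
from collections import defaultdict


def group_annotations(coco: dict) -> dict:
    """Map each image file name to its list of (category_name, bbox) pairs."""
    cat_names = {c["id"]: c["name"] for c in coco["categories"]}
    file_names = {im["id"]: im["file_name"] for im in coco["images"]}

    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        image_name = file_names[ann["image_id"]]
        grouped[image_name].append((cat_names[ann["category_id"]], ann["bbox"]))
    return dict(grouped)


# Tiny in-memory stand-in for a COCO annotation file (made-up values).
sample = {
    "images": [{"id": 1, "file_name": "page_0001.jpg"}],
    "categories": [{"id": 1, "name": "text"}, {"id": 4, "name": "table"}],
    "annotations": [
        {"image_id": 1, "category_id": 1, "bbox": [50, 60, 400, 120]},
        {"image_id": 1, "category_id": 4, "bbox": [50, 200, 400, 300]},
    ],
}

per_image = group_annotations(sample)
```

With a real annotation file, `coco` would come from `json.load` on a file inside the directory passed as `--anno_dir`.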

2. Train the Model

Train a layout detection model:

python scripts/train.py \
  --data_dir processed_data \
  --model_dir models/layout_detector \
  --epochs 30 \
  --batch_size 8 \
  --lr 1e-4

3. Run Inference

Detect layouts in new PDF or image files:

python scripts/inference.py \
  --model_dir models/layout_detector \
  --input_file samples/sample_page.jpg \
  --output_file results/prediction.json
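The schema of `results/prediction.json` is not documented in this README. Assuming, purely for illustration, that it holds a list of records with `label`, `score`, and `bbox` fields (all hypothetical names), low-confidence detections could be filtered out like this:

```python
import json


def keep_confident(detections, min_score=0.5):
    """Return only detections whose score is at least min_score."""
    return [d for d in detections if d["score"] >= min_score]


# Hypothetical file content; the real output format may differ.
raw = json.dumps([
    {"label": "title", "score": 0.97, "bbox": [40, 30, 560, 80]},
    {"label": "table", "score": 0.31, "bbox": [40, 300, 560, 620]},
])

detections = json.loads(raw)
confident = keep_confident(detections, min_score=0.5)
```

In practice `detections` would be read with `json.load` from the path given to `--output_file`, after checking the actual field names.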

📸 Examples

Visualizations of detected layouts overlaid on sample documents are in the results/ directory.


🗂️ Project Structure

Document_analysis/
├── data/
│   └── publaynet/
├── processed_data/
├── models/
│   └── layout_detector/
├── scripts/
│   ├── prepare_data.py
│   ├── train.py
│   └── inference.py
├── requirements.txt
└── README.md

⚙️ Configuration

Customize parameters via CLI flags or config files:

  • --epochs, --batch_size, --lr
  • Paths: --data_dir, --model_dir, --input_file, --output_file
  • Backbone model params (ResNet, Transformer)
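To illustrate how the flags above fit together, here is a minimal `argparse` sketch. It is not the repository's actual parser: the defaults are simply the values used in the training example in this README, and `build_parser` is a hypothetical helper.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser mirroring the flags listed in this README."""
    parser = argparse.ArgumentParser(description="Train a layout detector (sketch)")
    # Paths
    parser.add_argument("--data_dir", default="processed_data")
    parser.add_argument("--model_dir", default="models/layout_detector")
    # Optimization settings (defaults assumed from the usage example)
    parser.add_argument("--epochs", type=int, default=30)
    parser.add_argument("--batch_size", type=int, default=8)
    parser.add_argument("--lr", type=float, default=1e-4)
    return parser


# Passing an explicit list avoids reading sys.argv, which keeps the demo repeatable.
args = build_parser().parse_args(["--epochs", "10", "--lr", "5e-5"])
```

Flags that are omitted fall back to their defaults, so here `args.batch_size` stays at 8 while `args.epochs` and `args.lr` take the overridden values.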

🛠️ Troubleshooting

  • CUDA errors: Ensure CUDA toolkit and GPU drivers are installed correctly.
  • Slow performance: Reduce batch size or lower backbone resolution.
  • Low accuracy: Check dataset labels, augmentation pipeline, or model depth.
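For the CUDA item above, a quick diagnostic can narrow down where the problem lies. This sketch assumes the project is built on PyTorch, which this README does not state; if another framework is used, only the `nvidia-smi` check applies. The `gpu_diagnostics` name is hypothetical.

```python
import shutil


def gpu_diagnostics() -> dict:
    """Collect basic facts about the GPU setup without failing if torch is absent."""
    # nvidia-smi is installed with the NVIDIA driver; its absence suggests a driver issue.
    report = {"nvidia_smi_on_path": shutil.which("nvidia-smi") is not None}
    try:
        import torch  # Assumption: the project uses PyTorch.
    except ImportError:
        report["torch_installed"] = False
        report["cuda_available"] = False
    else:
        report["torch_installed"] = True
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in gpu_diagnostics().items():
        print(f"{key}: {value}")
```

If `nvidia_smi_on_path` is true but `cuda_available` is false, the installed framework build is probably CPU-only or mismatched with the driver's CUDA version.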

📬 Contact

For questions or bug reports, open an issue on the repository or contact Suyash Sachdeva.
