A Python-based toolkit for document layout analysis and classification, leveraging deep learning (CNNs/Transformers) to detect and segment document elements such as text, figures, tables, headers, and footnotes.
- Detects and segments layout components: titles, text blocks, images, tables, etc.
- Trains on large-scale datasets like PubLayNet.
- End-to-end pipeline: Data preparation ➜ Model training ➜ Inference ➜ Evaluation.
- Supports state-of-the-art architectures: Faster R-CNN, Cascade R-CNN, or Transformer-based detectors.
- Metrics: mAP and IoU for layout components (see the IoU sketch below).
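For reference, the IoU between a predicted box and a ground-truth box can be computed as below. This is a generic, self-contained sketch of the metric itself, not the toolkit's evaluation code.

```python
# Minimal IoU sketch for two axis-aligned boxes in (x1, y1, x2, y2) format.
def iou(box_a, box_b):
    # Corners of the intersection rectangle.
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```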
1. Clone the repo:

```bash
git clone https://github.com/suyashsachdeva/Document_analysis.git
cd Document_analysis
```
2. (Recommended) Create a virtual environment:
```bash
python3 -m venv venv
source venv/bin/activate
```

3. Install dependencies:

```bash
pip install -r requirements.txt
```
Prepare and preprocess data:

```bash
python scripts/prepare_data.py \
  --input_dir data/publaynet/images \
  --anno_dir data/publaynet/annotations \
  --output_dir processed_data
```
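The exact output of `prepare_data.py` is defined by the script itself, but PubLayNet ships COCO-style JSON annotations, so a quick sanity check of the raw data before preprocessing might look like this (the annotation filename is an assumption; point it at your actual download):

```python
import json
from collections import Counter

# Illustrative only: peek at PubLayNet's COCO-style annotations.
# "train.json" is an assumed filename; adjust to your download.
with open("data/publaynet/annotations/train.json") as f:
    coco = json.load(f)

id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
counts = Counter(id_to_name[a["category_id"]] for a in coco["annotations"])

print(f"{len(coco['images'])} images, {len(coco['annotations'])} boxes")
for name, n in counts.most_common():
    print(f"  {name}: {n}")
```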
Train a layout detection model:

```bash
python scripts/train.py \
  --data_dir processed_data \
  --model_dir models/layout_detector \
  --epochs 30 \
  --batch_size 8 \
  --lr 1e-4
```
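As a rough sketch of what a supported detector looks like (illustrative only, not necessarily how `train.py` builds its model), a torchvision Faster R-CNN can be adapted to PubLayNet's five layout classes (text, title, list, table, figure) by swapping its box predictor:

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Illustrative sketch: Faster R-CNN for PubLayNet's 5 layout classes
# plus background. train.py may construct its model differently.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=6)

# Learning rate matches the --lr flag above.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
```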
Detect layouts on new PDF or image files:

```bash
python scripts/inference.py \
  --model_dir models/layout_detector \
  --input_file samples/sample_page.jpg \
  --output_file results/prediction.json
```

Visualizations of the detected layouts overlaid on sample documents can be found in `results/`.
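To render overlays yourself, a minimal sketch is below. The JSON schema shown (a list of `{"bbox": [x1, y1, x2, y2], "label": ..., "score": ...}` records) is an assumption; check `inference.py` for the actual output format.

```python
import json
from PIL import Image, ImageDraw

# Illustrative only: draw predicted boxes on the input page.
# The prediction schema below is an assumption, not the confirmed format.
image = Image.open("samples/sample_page.jpg").convert("RGB")
draw = ImageDraw.Draw(image)

with open("results/prediction.json") as f:
    detections = json.load(f)

for det in detections:
    if det["score"] < 0.5:  # drop low-confidence boxes
        continue
    x1, y1, x2, y2 = det["bbox"]
    draw.rectangle((x1, y1, x2, y2), outline="red", width=3)
    draw.text((x1, max(0, y1 - 12)), det["label"], fill="red")

image.save("results/sample_page_overlay.jpg")
```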
Project structure:

```
Document_analysis/
├── data/
│   └── publaynet/
├── processed_data/
├── models/
│   └── layout_detector/
├── scripts/
│   ├── prepare_data.py
│   ├── train.py
│   └── inference.py
├── requirements.txt
└── README.md
```
Customize parameters via CLI flags or config files:

- Training: `--epochs`, `--batch_size`, `--lr`
- Paths: `--data_dir`, `--model_dir`, `--input_file`, `--output_file`
- Backbone model parameters (ResNet or Transformer variants)
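For illustration, the flag surface above maps onto a standard `argparse` definition like the following; the actual defaults and any config-file handling live in the repo's scripts:

```python
import argparse

# Illustrative sketch of the CLI flags listed above; defaults are assumptions.
parser = argparse.ArgumentParser(description="Train a layout detector")
parser.add_argument("--data_dir", default="processed_data")
parser.add_argument("--model_dir", default="models/layout_detector")
parser.add_argument("--epochs", type=int, default=30)
parser.add_argument("--batch_size", type=int, default=8)
parser.add_argument("--lr", type=float, default=1e-4)
args = parser.parse_args()
```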
- CUDA errors: Ensure CUDA toolkit and GPU drivers are installed correctly.
- Slow performance: Reduce the batch size or use a lower input resolution.
- Low accuracy: Check dataset labels, augmentation pipeline, or model depth.
For questions or problems, open a GitHub issue or contact Suyash Sachdeva.