OCR Examples

This directory contains examples for OCR (Optical Character Recognition) models.

Text detection

# DB (ppocr det)
cargo run -F cuda-full -F vlm --example ocr -- db --device cuda:0 --processor-device cuda:0

# FAST / LinkNet
cargo run -F cuda-full -F vlm --example ocr -- fast --device cuda:0 --processor-device cuda:0
cargo run -F cuda-full -F vlm --example ocr -- linknet --device cuda:0 --processor-device cuda:0
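The three detection commands above differ only in the model name, so they can be generated from one template. The following is a minimal dry-run sketch, not part of the examples themselves: `DEVICE` and `ocr_cmd` are hypothetical names introduced here, and the script only prints the commands, using exactly the flags shown above.

```shell
#!/bin/sh
# Dry-run sketch: print the text detection commands for a single device.
# DEVICE and ocr_cmd are hypothetical names used only in this example.
DEVICE="${DEVICE:-cuda:0}"

# Build the cargo invocation for one model, reusing the flags shown above.
# $1 = model subcommand (db, fast, linknet), $2 = device string.
ocr_cmd() {
    printf 'cargo run -F cuda-full -F vlm --example ocr -- %s --device %s --processor-device %s\n' "$1" "$2" "$2"
}

# Print one command per detection model.
for model in db fast linknet; do
    ocr_cmd "$model" "$DEVICE"
done
```

Once the printed commands look right, piping the script's output into `sh` runs them in order.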

Text recognition

# SVTR
cargo run -F cuda-full -F vlm --example ocr -- svtr --device cuda:0 --processor-device cuda:0 --source ./examples/ocr/images-rec

# TrOCR (module-specific device/dtype)
cargo run -r -F cuda-full -F vlm --example ocr -- trocr --visual-dtype fp16 --visual-device cuda:0 --textual-decoder-dtype fp16 --textual-decoder-device cuda:0 --processor-device cuda:0 --scale s --kind printed --source ./examples/ocr/images-rec

Document layout detection

# DocLayout-YOLO
cargo run -F cuda-full -F vlm --example ocr -- doclayout-yolo --device cuda:0 --processor-device cuda:0 --source images/academic.jpg

# PicoDet-Layout
cargo run -F cuda-full -F vlm --example ocr -- picodet-layout --device cuda:0 --processor-device cuda:0 --source images/academic.jpg

# PP-DocLayout v1/v2
cargo run -F cuda-full -F vlm --example ocr -- pp-doclayout --device cuda:0 --processor-device cuda:0 --source images/academic.jpg --ver 1 --dtype fp32

# PP-DocLayout v3
cargo run -F cuda-full -F vlm --example ocr -- pp-doclayout --device cuda:0 --processor-device cuda:0 --source images/vl1.58.png --ver 3 --dtype fp32

Table structure recognition

# SLANet
cargo run -F cuda-full -F vlm --example ocr -- slanet --device cuda:0 --processor-device cuda:0 --source ./examples/ocr/images-det/table.png

Results