# Auto Labeler

Automatically label computer vision datasets with zero or near-zero manual cost. A Python library that generates high-quality pseudo-labels for image classification, object detection, instance segmentation, OCR, visual question answering, and feature matching using state-of-the-art models (CLIP, OWL-ViT, SAM, LoFTR, VLMs, TrOCR, and more).
| Author | KuchikiRenji |
|---|---|
| Email | KuchikiRenji@outlook.com |
| GitHub | github.com/KuchikiRenji |
| Discord | kuchiki_renji |
## Table of Contents

- What is Auto Labeler?
- Features
- Supported Tasks & Models
- Installation
- Quick Start
- Usage by Task
- Roadmap
- License & Contributing
## What is Auto Labeler?

Auto Labeler is a simple, modular Python framework for automatic dataset labeling and pseudo-label generation in computer vision. It wraps popular frameworks (Hugging Face, OpenCLIP, Kornia, etc.) and exposes a uniform interface so you can:
- Label images for classification (image-to-image or text-to-image retrieval)
- Generate object detection and instance segmentation labels (zero-shot or prompt-based)
- Run visual question answering and OCR on images
- Do feature/keypoint matching for retrieval and correspondence
Minimal configuration and a single `label.py` entry point per task keep manual effort low while leveraging SOTA architectures.
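Whatever backbone produces the per-class scores, pseudo-label generation for classification reduces to taking the top-scoring class and keeping it only when the model is confident enough. A minimal, illustrative sketch of that idea (this helper is hypothetical, not part of the library):

```python
from typing import Optional, Sequence

def pseudo_label(scores: Sequence[float], class_names: Sequence[str],
                 threshold: float = 0.5) -> Optional[str]:
    """Return the highest-scoring class name, or None when the model
    is not confident enough to emit a pseudo-label."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return class_names[best] if scores[best] >= threshold else None
```

Raising the threshold trades label coverage for label precision, which is the usual knob in pseudo-labeling pipelines.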
## Features

- High abstraction — One interface over Hugging Face, OpenCLIP, and other SOTA sources; less boilerplate for researchers and teams.
- Modular design — Separate modules per vision task; easy to add or swap algorithms.
- Minimal touchpoints — Set the config (model, weights, hyperparameters) and run `label.py`; no deep integration work.
- Multiple vision tasks — Classification, detection, segmentation, VQA, OCR, and feature matching in one repo.
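One common way to realize this kind of modular, per-task design is a small registry that maps task names to labeling functions, so algorithms can be added or swapped without touching callers. The sketch below is purely illustrative; none of these names appear in the repo:

```python
from typing import Callable, Dict, List

# Hypothetical task registry; the real repo instead exposes a
# label.py entry point inside each task's folder.
LABELERS: Dict[str, Callable[[List[str]], List[dict]]] = {}

def register(task: str):
    """Decorator that registers a labeling function under a task name."""
    def wrap(fn):
        LABELERS[task] = fn
        return fn
    return wrap

@register("image_classification")
def classify(images: List[str]) -> List[dict]:
    # Stand-in for a CLIP-based zero-shot classifier.
    return [{"image": p, "label": None} for p in images]

def label(task: str, images: List[str]) -> List[dict]:
    """Dispatch a batch of image paths to the registered labeler."""
    return LABELERS[task](images)
```

Swapping the algorithm for a task then means registering a different function under the same name.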
## Supported Tasks & Models

| Task | Models / Architectures |
|---|---|
| Image Classification | CLIP (OpenCLIP) |
| Object Detection | OWLv2 (OWL-ViT v2) |
| Instance Segmentation | Segment Anything (SAM) |
| Visual Question Answering | LLaVA-NeXT, SmolVLM, PaliGemma2, Qwen2-VL, BLIP |
| Feature Matching | LoFTR (Kornia) |
| OCR | TrOCR, docTR (Mindee) |
## Installation

1. Clone the repository

   ```bash
   git clone https://github.com/KuchikiRenji/auto_labeler.git
   cd auto_labeler
   ```

2. Create a virtual environment (Python 3.8+ recommended)

   ```bash
   python -m venv .venv
   source .venv/bin/activate   # Linux/macOS
   # .venv\Scripts\activate    # Windows
   ```

3. Install dependencies

   ```bash
   pip install -r requirements.txt
   ```
## Quick Start

Each task has a `label.py` in its folder. General pattern:

```bash
cd <task_folder>   # e.g. image_classification, object_detection, ocr
python label.py --unlabelled-dump <path_to_images> --result-path <output_path> [other options]
```

See Usage by Task for exact commands and options.
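The classification task's `--class2prompts` flag points at a JSON file pairing class names with text prompts for the CLIP backbone. The exact schema is not documented in this README, so the following is a plausible, hypothetical example only:

```json
{
  "cat": ["a photo of a cat", "a close-up photo of a cat"],
  "dog": ["a photo of a dog"]
}
```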
## Usage by Task

### Image Classification

```bash
cd image_classification
python label.py \
    --unlabelled-dump 'path/to/unlabelled/images' \
    --class2prompts 'path/to/class_prompts.json' \
    --result-path 'path/to/save/labels'
```

### Object Detection

```bash
cd object_detection
python label.py \
    --unlabelled-dump 'path/to/images' \
    --class-texts-path 'path/to/class_objects.json' \
    --prompt-images 'path/to/prompt_images' \
    --result-path 'path/to/detection_results.json' \
    --viz False \
    --viz-path 'path/to/bbox_viz'
```

### Instance Segmentation

```bash
cd instance_segmentation
python label.py \
    --unlabelled-dump 'path/to/images' \
    --class-texts-path 'path/to/class_objects.json' \
    --result-path 'path/to/segmentation_results.pkl' \
    --viz False \
    --viz-path 'path/to/mask_viz'
```

### Visual Question Answering

```bash
cd visual_question_answering
python label.py \
    --unlabelled-dump 'path/to/images' \
    --result-path 'path/to/vqa_results.json'
```

### Feature Matching

```bash
cd feature_matching
python label.py \
    --unlabelled-dump 'path/to/images' \
    --reference-images 'path/to/reference/index_images' \
    --result-path 'path/to/matching_results'
```

### OCR

```bash
cd ocr
python label.py \
    --unlabelled-dump 'path/to/document_images' \
    --result-path 'path/to/ocr_results.json'
```

## Roadmap

- Config-driven prompting for VLMs
- Visualization support for LoFTR
- Support for SuperGlue, SIFT, SURF, and other classical feature-matching methods
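The classical matchers on the roadmap (SIFT, SURF) typically pair local descriptors with a mutual-nearest-neighbour test that keeps only correspondences agreeing in both directions. A dependency-free toy sketch of that test, illustrative only and not the library's implementation:

```python
from typing import List, Sequence, Tuple

def mutual_nearest_matches(desc_a: List[Sequence[float]],
                           desc_b: List[Sequence[float]]) -> List[Tuple[int, int]]:
    """Match two lists of descriptor vectors, keeping a pair (i, j) only
    when i is j's nearest neighbour and j is i's (squared Euclidean)."""
    def nearest(q, pool):
        return min(range(len(pool)),
                   key=lambda j: sum((q[k] - pool[j][k]) ** 2 for k in range(len(q))))
    matches = []
    for i, d in enumerate(desc_a):
        j = nearest(d, desc_b)
        if nearest(desc_b[j], desc_a) == i:  # keep only mutual pairs
            matches.append((i, j))
    return matches
```

Real pipelines replace the brute-force loop with a k-d tree or GPU matcher, but the mutual check itself is the same.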
## License & Contributing

For issues, feature requests, or contributions, open an issue or PR on GitHub or reach out via the contact details above.