MegaOCR is a portable OCR application for Windows powered by Tesseract OCR.
It can read PDF / PNG / JPG and export extracted text to .txt / .docx / .pdf.
👉 Download Latest Portable EXE — no installation required, just run and use.
- Works out of the box on Windows — no Python, no Tesseract installation required.
- Supports 120+ languages (common models already bundled in the Release).
- Exports to multiple formats: .txt, .docx, .pdf.
- Clean and simple interface built with Tkinter (using ttkbootstrap).
- OCR accuracy depends on input quality (high-DPI recommended).
- Complex page layouts may reduce accuracy.
- Current PDF export has limited RTL shaping (Persian/Arabic). Use .docx or .txt for best results.
git clone https://github.com/Megahertz418/MegaOCR.git
cd MegaOCR
pip install -r requirements.txt
python Mega_OCR.py
Make sure the Tesseract traineddata files for your target languages are available.
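A quick way to confirm that the models for a language setting are present is sketched below. It relies on two Tesseract conventions: languages are combined with `+` (for example `eng+fas`), and each model is stored as `<code>.traineddata`. The helper name `missing_traineddata` and the `tessdata` folder location are assumptions for illustration, not part of MegaOCR.

```python
from __future__ import annotations

from pathlib import Path


def missing_traineddata(lang_spec: str, tessdata_dir: Path) -> list[str]:
    """Return the language codes in lang_spec that have no .traineddata file."""
    # Tesseract joins several languages with "+", e.g. "eng+fas".
    codes = [code.strip() for code in lang_spec.split("+") if code.strip()]
    return [
        code
        for code in codes
        if not (tessdata_dir / f"{code}.traineddata").is_file()
    ]


if __name__ == "__main__":
    # Assumed location: a "tessdata" folder in the current directory.
    missing = missing_traineddata("eng+fas", Path("tessdata"))
    if missing:
        print("Missing models:", ", ".join(missing))
    else:
        print("All requested models are available.")
```

Running this from the repository root before launching the app reports which models still need to be downloaded.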
pyinstaller --onefile --noconsole ^
--add-data "tesseract.exe;." ^
--add-data "tessdata;tessdata" ^
--add-data "fonts;fonts" ^
--add-data "*.dll;." ^
--add-data "mega_ocr.ico;." ^
--icon=mega_ocr.ico Mega_OCR.py
The executable will appear at dist/Mega_OCR.exe.
Tip: Use the helper script for clean builds:
scripts\build.ps1 -VendorDir .\vendor
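Files passed with `--add-data` are unpacked to a temporary folder when a `--onefile` executable starts, and PyInstaller exposes that folder as `sys._MEIPASS`. The sketch below shows one common way to resolve bundled resources in both a development run and a packaged build. The helper name `resource_path` and its `dev_base` parameter are hypothetical; this is not necessarily how Mega_OCR.py locates its files.

```python
import sys
from pathlib import Path


def resource_path(relative, dev_base=None):
    """Resolve a bundled resource in a PyInstaller build or a source checkout."""
    # Set by the PyInstaller bootloader; absent when running from source.
    bundle_dir = getattr(sys, "_MEIPASS", None)
    if bundle_dir is not None:
        base = Path(bundle_dir)
    else:
        # Assumption: in development the app is started from the repo root.
        base = Path(dev_base) if dev_base is not None else Path.cwd()
    return base / relative


if __name__ == "__main__":
    # The relative names mirror the --add-data destinations used above.
    print(resource_path("tesseract.exe"))
    print(resource_path("tessdata"))
    print(resource_path("fonts"))
```

Because the destinations in the build command are `.`, `tessdata`, and `fonts`, the same relative names work in both modes.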
Each Release ships a Vendor Bundle ZIP, which includes:
- tesseract.exe + required *.dll
- curated tessdata/ models
- fonts/
- mega_ocr.ico and mega_ocr.png
- all third-party licenses
- MANIFEST.json (components + SHA256 hashes)
- SHA256SUMS.txt
📥 Available on the Releases page.
Note: End-users don’t need this ZIP. It is only for developers who want to reproduce the official Release build.
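Developers who download the Vendor Bundle may want to check it against SHA256SUMS.txt before building. The following is a minimal sketch, assuming the file uses the conventional `sha256sum` layout (`<hex digest>  <relative path>` per line) and sits at the root of the extracted bundle; the actual layout in a Release may differ. The function names are illustrative.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large binaries are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sums(sums_file: Path, root: Path) -> list[str]:
    """Return the listed paths that are missing or whose hash does not match."""
    failures = []
    # "utf-8-sig" tolerates a byte-order mark, which some Windows tools write.
    for line in sums_file.read_text(encoding="utf-8-sig").splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Assumed format: "<digest>  <path>" or "<digest> *<path>".
        expected, _, name = line.partition(" ")
        name = name.strip().lstrip("*")
        target = root / name
        if not target.is_file() or sha256_of(target) != expected.lower():
            failures.append(name)
    return failures


if __name__ == "__main__":
    bundle = Path("vendor")  # assumed extraction folder
    sums = bundle / "SHA256SUMS.txt"
    if sums.is_file():
        bad = verify_sums(sums, bundle)
        print("All files match." if not bad else f"Mismatched or missing: {bad}")
    else:
        print(f"{sums} not found")
```

An empty result means every listed file exists and matches its recorded digest.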
MegaOCR/
│   .gitignore
│   CHANGELOG.md
│   CONTRIBUTING.md
│   LICENSE
│   Mega_OCR.py
│   Mega_OCR.spec
│   README.md
│   requirements.txt
│   SECURITY.md
│   THIRD_PARTY_NOTICES.md
│
├── .github/                 # GitHub-specific configs (PR/Issue templates)
│   │   pull_request_template.md
│   └── ISSUE_TEMPLATE/
│           bug_report.md
│           feature_request.md
│
├── docs/                    # Documentation & media (UI preview, etc.)
│       User Interface.gif
│
└── scripts/                 # Build & manifest generation helpers
        build.ps1
        generate-manifest.ps1
Note: The vendor/, dist/, and build/ directories are not committed to the repo. They are provided as part of the downloadable Release assets (Vendor Bundle ZIP & EXE).
- Portable EXE (Mega_OCR.exe): recommended for most users.
- Vendor Bundle ZIP: for developers who want reproducible builds.
Both are available on the Releases page.
- Empty OCR output: Try higher-resolution images or check the language settings (e.g., eng+fas).
- Persian/Arabic shaping in PDF: Use .docx or .txt instead.
- Antivirus false positive: PyInstaller executables are sometimes flagged. Verify integrity with the SHA256 checksums in the Vendor Bundle.
Still stuck? Please open an Issue.
- macOS/Linux support
- Better complex-layout handling
- Optional CLI mode
- Improved RTL shaping in PDF exports
We welcome contributions! See CONTRIBUTING.md.
If you discover a security issue, please follow the process in SECURITY.md.
See CHANGELOG.md for version history.
See THIRD_PARTY_NOTICES.md for a list of included components. Detailed license texts are provided in the Vendor Bundle ZIP.
- MegaOCR code: MIT
- Tesseract & models: Apache-2.0 (see THIRD_PARTY_NOTICES.md)
- Fonts: OFL and CC BY 4.0 (see Vendor Bundle)
