Tools_DeepSeekOCR

This project implements a screenshot utility based on DeepSeek-OCR, enabling model deployment on the Windows platform for Optical Character Recognition (OCR) directly from screen captures.

[ 中文 | English ]

System Prerequisites

Before proceeding with the installation, please ensure the following dependencies are installed:

1. Python >= 3.9
   1.1 Python website: https://www.python.org/downloads/release/python-3140/
   Scroll to the bottom of the page to find the table labeled "Files," which contains installation files for various versions. For Windows 64-bit, use the Windows installer (64-bit).
2. CUDA (install the CUDA driver corresponding to your graphics card)
   2.1 CUDA website: https://developer.nvidia.com/cuda-downloads
   Download and install the CUDA Toolkit installer based on your device's operating system version.
3. (Optional) Git
   3.1 Git website: https://git-scm.com/

Note: In testing, the tool uses approximately 7 GB of VRAM (video memory) while running.
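The prerequisites above can be checked with a short script before installing anything. What follows is a minimal sketch, not part of the project: the function names are hypothetical, and it assumes the NVIDIA driver has put `nvidia-smi` on the PATH. It reports total VRAM per GPU, so a card that passes may still fall short if other applications are already using video memory.

```python
import shutil
import subprocess
import sys

MIN_PYTHON = (3, 9)          # minimum version listed in the prerequisites
APPROX_VRAM_MIB = 7 * 1024   # the roughly 7 GB figure from the note above


def python_ok(version=None):
    """Return True if the given (or running) Python version is at least 3.9."""
    version = sys.version_info[:2] if version is None else tuple(version[:2])
    return version >= MIN_PYTHON


def parse_vram_mib(nvidia_smi_output):
    """Parse one integer MiB value per line, skipping lines that are not numbers."""
    values = []
    for line in nvidia_smi_output.splitlines():
        line = line.strip()
        if line.isdigit():
            values.append(int(line))
    return values


def query_vram_mib():
    """Ask nvidia-smi for total memory per GPU; return [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_vram_mib(result.stdout)


def report():
    """Print a short summary of the prerequisite checks."""
    print(f"Python >= 3.9: {python_ok()}")
    print(f"git on PATH (optional): {shutil.which('git') is not None}")
    vram = query_vram_mib()
    if not vram:
        print("nvidia-smi not found or failed; GPU memory could not be checked")
    for index, mib in enumerate(vram):
        print(f"GPU {index}: {mib} MiB total, at least 7 GB: {mib >= APPROX_VRAM_MIB}")
```

Calling `report()` from a Python prompt prints one line per check.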

Installation

1. Download the project code: git clone https://github.com/reuAC/Tools_DeepSeekOCR
2. Navigate to the project directory: cd Tools_DeepSeekOCR
3. Create a virtual environment for the project: python -m venv venv
4. Activate the virtual environment: venv\Scripts\activate.bat
5. Download the DeepSeek-OCR model files into the model directory within the project folder.
   5.1 Using ModelScope: modelscope download --model deepseek-ai/DeepSeek-OCR --local_dir ./model
   (The modelscope command comes from the modelscope package; if it is not available, install it first with pip install modelscope.)
6. Install the environment dependencies:

# Dependencies required by DeepSeek-OCR
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu118
pip install transformers==4.46.3 tokenizers==0.20.3 einops easydict addict

# Additional dependencies for this project
pip install mss pynput screeninfo

After installation is complete, run the application using: python main.py
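If `python main.py` fails at startup, it can help to confirm that the installation steps actually took effect. The sketch below is an assumption-laden helper, not part of the project: `missing_modules` and `model_dir_ready` are hypothetical names, the module list simply mirrors the pip commands above, and the check only confirms that the `model` directory is non-empty, not that the download is complete.

```python
import importlib.util
from pathlib import Path

# Import names matching the packages installed by the pip commands above.
REQUIRED_MODULES = [
    "torch", "torchvision", "torchaudio",
    "transformers", "tokenizers", "einops", "easydict", "addict",
    "mss", "pynput", "screeninfo",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def model_dir_ready(path="model"):
    """Return True if the model directory exists and contains at least one entry."""
    directory = Path(path)
    return directory.is_dir() and any(directory.iterdir())


def report():
    """Print which dependencies are missing and whether the model folder looks populated."""
    missing = missing_modules()
    print("Missing modules:", ", ".join(missing) if missing else "none")
    print("Model directory populated:", model_dir_ready())
```

Run `report()` from the project directory with the virtual environment activated, so that the relative `model` path and the installed packages are the ones the application will see.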

Key Usage Points

Upon launching the main program, the model will automatically be loaded onto the GPU. The menu will be displayed once loading is complete.
The default screenshot hotkey is Ctrl + Shift + X.
Screenshots and configuration settings will be temporarily saved in the project directory.
During screenshot recognition, the model's output is streamed as it is generated. The complete, final output is shown once recognition finishes.
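To illustrate how a global hotkey can trigger a screen capture with the libraries installed earlier, here is a minimal sketch. It is not the code in main.py, whose implementation is not shown here. The function names are hypothetical, and the sketch simplifies by capturing the whole first monitor to a fixed file name rather than a selected region. It relies on pynput's `GlobalHotKeys` and mss's `shot` method.

```python
def to_pynput_hotkey(label):
    """Convert a label such as 'Ctrl + Shift + X' to pynput's '<ctrl>+<shift>+x' form."""
    modifiers = {"ctrl", "shift", "alt", "cmd"}
    parts = [part.strip().lower() for part in label.split("+") if part.strip()]
    return "+".join(f"<{part}>" if part in modifiers else part for part in parts)


def capture_first_monitor(output_path="screenshot.png"):
    """Save a PNG of the first monitor and return the path of the written file."""
    import mss  # third-party, installed in the dependency step

    with mss.mss() as sct:
        # shot() grabs monitor 1 by default and writes it to output_path.
        return sct.shot(output=output_path)


def listen(label="Ctrl + Shift + X", on_capture=print):
    """Block until interrupted, calling on_capture(path) each time the hotkey fires."""
    from pynput import keyboard  # third-party, installed in the dependency step

    def handle():
        on_capture(capture_first_monitor())

    with keyboard.GlobalHotKeys({to_pynput_hotkey(label): handle}) as hotkeys:
        hotkeys.join()
```

Calling `listen()` prints the screenshot path on every press of Ctrl + Shift + X. In a real tool, `on_capture` would hand the image to the OCR model instead of printing the path.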
