ihatecsv/deepseek-ocr-client

DeepSeek-OCR Client

A real-time Electron-based desktop GUI for DeepSeek-OCR

Unaffiliated with DeepSeek

Features

  • Drag-and-drop image upload
  • Real-time OCR processing
  • Click regions to copy
  • Export results as ZIP with markdown images
  • GPU acceleration (CUDA)

Requirements

  • Windows 10/11 (other operating systems are experimental)
  • Node.js 18+
  • Python 3.12+
  • NVIDIA GPU with CUDA
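
The requirements above can be checked before the first run. The following is a minimal sketch, not part of the project: the function names are hypothetical, the Python check applies to whichever interpreter runs the script, and the presence of `nvidia-smi` on the PATH is only a rough proxy for a usable CUDA GPU.

```python
import re
import shutil
import subprocess
import sys

MIN_NODE = (18, 0)    # from the Requirements list: Node.js 18+
MIN_PYTHON = (3, 12)  # from the Requirements list: Python 3.12+


def parse_version(text):
    """Extract (major, minor) from a version string such as 'v18.17.0'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def node_version():
    """Return the installed Node.js (major, minor), or None if it cannot be determined."""
    node = shutil.which("node")
    if node is None:
        return None
    try:
        result = subprocess.run(
            [node, "--version"], capture_output=True, text=True, check=False
        )
        return parse_version(result.stdout)
    except (OSError, ValueError):
        return None


def check_requirements():
    """Return a dict mapping each requirement to True (met) or False (not met)."""
    node = node_version()
    return {
        "Node.js 18+": node is not None and node >= MIN_NODE,
        # Checks the interpreter running this script, which may differ
        # from the one the start script ends up using.
        "Python 3.12+": sys.version_info[:2] >= MIN_PYTHON,
        # nvidia-smi ships with the NVIDIA driver; finding it does not
        # guarantee that CUDA works, only that a driver is installed.
        "NVIDIA driver (nvidia-smi)": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{'OK     ' if ok else 'MISSING'}  {name}")
```

Running the script prints one line per requirement. A `MISSING` entry for the NVIDIA driver is worth resolving first, since the app relies on CUDA.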

Quick Start (Windows)

  1. Extract the ZIP file
  2. Run start-client.bat
    • First run will automatically install dependencies.
    • Subsequent runs will start quicker.
  3. Load Model - Click the "Load Model" button in the app. This will download or load the model.
    • If this is the first run, this might take some time.
  4. Drop an image or click the drop zone to select one.
  5. Run OCR - Click "Run OCR" to process.

Note: if you have issues processing images but the model loads properly, close and re-open the app, then try again with the default resolution for "base" and "size". This is a known issue; if you can help fix it, I would appreciate it!

Linux/macOS

Note: Linux and macOS have not been tested yet. Use start-client.sh instead of start-client.bat.

PRs welcome! If you test on Linux/macOS and encounter issues, please open a pull request with fixes.
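
For anyone scripting the launch across platforms, the choice between the two start scripts can be sketched as below. This is an illustrative helper, not part of the project: the function names are hypothetical, and it assumes `start-client.sh` runs under `bash` and that both scripts sit in the project root.

```python
import subprocess
import sys
from pathlib import Path


def launcher_name(platform=sys.platform):
    """Pick the start script named in this README for a given sys.platform value."""
    return "start-client.bat" if platform.startswith("win") else "start-client.sh"


def launch_command(project_dir, platform=sys.platform):
    """Build the command line that runs the start script for the given platform."""
    script = Path(project_dir).resolve() / launcher_name(platform)
    if platform.startswith("win"):
        return ["cmd", "/c", str(script)]
    # Assumption: the shell script is bash-compatible. Invoking it through
    # bash also avoids depending on the executable bit being set.
    return ["bash", str(script)]


def launch(project_dir="."):
    """Run the start script from the project directory and return its exit code."""
    command = launch_command(project_dir)
    return subprocess.run(command, cwd=project_dir, check=False).returncode
```

Calling `launch("path/to/deepseek-ocr-client")` starts the client with the script that matches the current platform; the path shown is a placeholder for wherever the ZIP was extracted.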

Future goals (PRs welcome!)

  • Code cleanup (the project was put together quickly)
  • TypeScript
  • Updater from GitHub releases
  • PDF support
  • Batch processing
  • CPU support?
  • Web version (so you can run the server on a different machine)
  • Better progress bar algorithm
  • ???

License

MIT
