Enable GPU for ONNX runtime on GNU/Linux. #20522

Merged
TurboGit merged 1 commit into master from po/onnx-gpu
Mar 15, 2026

@TurboGit
Member

  • First we need the ONNX runtime with support for GPU.
  • And ensure that all ONNX runtime libraries are copied.

Closes #20517

@TurboGit TurboGit added this to the 5.6 milestone Mar 14, 2026
@TurboGit TurboGit added the feature: enhancement (current features to improve), priority: low (core features work as expected, only secondary/optional features don't), and release notes: pending labels Mar 14, 2026
@TurboGit
Member Author

@andriiryzhkov : Can you review/comment?

On my side I have been able to activate the GPU for ONNX, but it runs at the same speed as the CPU and fails anyway for the large SegNext model (my GPU has only 4 GB of memory).

Using ONNX with CUDA also requires many other OS libraries:

  • libcublas12
  • libcublaslt12
  • libcurand10
  • libcufft11
  • libcudart12
  • nvidia-cudnn
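
A quick way to see which of these libraries the dynamic loader can actually find is to try opening each one. The sketch below is only a diagnostic, not part of this PR; the sonames are assumptions inferred from the package names above (plus cuDNN 9) and may differ between distributions.

```python
import ctypes

# Sonames inferred from the package names listed above (assumption:
# adjust the version suffixes to match what your distribution ships).
CUDA_SONAMES = [
    "libcublas.so.12",
    "libcublasLt.so.12",
    "libcurand.so.10",
    "libcufft.so.11",
    "libcudart.so.12",
    "libcudnn.so.9",
]


def missing_libraries(sonames):
    """Return the sonames that the dynamic loader cannot open."""
    missing = []
    for name in sonames:
        try:
            ctypes.CDLL(name)  # raises OSError if the library cannot be loaded
        except OSError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_libraries(CUDA_SONAMES):
        print(f"missing: {name}")
```

An empty result only means the loader found the files; it does not prove that the versions match what the ONNX Runtime build expects.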

@KarlMagnusLarsson

Works for me: #20517 (comment)

@andriiryzhkov
Contributor

@TurboGit: your change makes sense. I see that the major Linux distros do not have packages for ONNX Runtime with GPU acceleration. The reasonable fallback in this situation is to use the pre-built package from GitHub, as you suggested, or to rely on users to build ONNX Runtime with GPU acceleration themselves.

Unfortunately, I don't have an NVIDIA GPU to test it myself, so @KarlMagnusLarsson and @TurboGit, I would need your help here.

@andriiryzhkov
Contributor

We would need to document the runtime dependencies. The CUDA EP requires the user to install:

  • NVIDIA driver
  • CUDA 12 runtime libraries (libcudart, libcublas, libcufft, libcurand) — not the full CUDA Toolkit
  • cuDNN 9 (libcudnn.so.9) — major version must match exactly what ORT was built against

On Ubuntu: cuda-libraries-12-x libcudnn9-cuda-12. On Fedora: cuda-libs libcudnn. Without cuDNN 9 specifically, the CUDA EP silently falls back to CPU with no obvious error.
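
Because the fallback is silent, one way to notice it is to compare the providers requested with the ones the session ends up using. The sketch below uses the ONNX Runtime Python binding purely as a diagnostic aid; darktable itself does not use this code, and the helper names and the model path argument are hypothetical.

```python
CUDA_EP = "CUDAExecutionProvider"


def cuda_silently_dropped(requested, active):
    """True when CUDA was requested but is not among the active providers."""
    return CUDA_EP in requested and CUDA_EP not in active


def check_session(model_path):
    """Open a session on model_path and report whether CUDA was dropped.

    model_path is a placeholder for any ONNX model file you have at hand.
    """
    # Imported lazily so cuda_silently_dropped() works without onnxruntime.
    import onnxruntime as ort

    requested = [CUDA_EP, "CPUExecutionProvider"]
    session = ort.InferenceSession(model_path, providers=requested)
    # get_providers() lists the providers the session actually registered.
    return cuda_silently_dropped(requested, session.get_providers())
```

If `check_session` returns `True`, the CUDA EP could not be used and inference is running on the CPU, which is the situation described above when cuDNN 9 is missing.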

Some licensing notes (not blockers, but worth being aware of):

The ORT libraries we ship (libonnxruntime*.so) are MIT-licensed — no issue there. The concern is the cuDNN runtime dependency.

The cuDNN SLA prohibits use "in any manner that would cause it to become subject to an open source software license." This is a theoretical tension with GPLv3, since under a strict reading, dynamic linking creates a combined work. However:

  1. We don't ship cuDNN — the user installs it themselves. GPL governs distribution, not what users run on their own machines.
  2. digiKam 8.7 (GPL-2+, released June 2025) does exactly the same thing — cuDNN-dependent GPU inference via OpenCV DNN — and shipped it without controversy.
  3. This is the same grey area darktable already lives in with proprietary OpenCL drivers.

The practical community consensus supports this approach. Worth adding a note in the docs stating that GPU acceleration requires user-installed proprietary NVIDIA libraries not covered by darktable's GPL grant.

@TurboGit
Member Author

Sure, cuDNN is something to discuss. Good to see digiKam already distributing it.

This is the same grey area darktable already lives in with proprietary OpenCL drivers.

In this case, though, it is up to the user to install it. I would lean toward the same for cuDNN for now: let people who want ONNX with CUDA install it themselves.

@TurboGit TurboGit merged commit 807431f into master Mar 15, 2026
6 checks passed
@TurboGit TurboGit deleted the po/onnx-gpu branch March 15, 2026 14:57
@TurboGit
Member Author

@andriiryzhkov : See #20532.


Development

Successfully merging this pull request may close these issues.

Git: master: AI object mask tool : NVIDIA CUDA not available, will fall back to CPU

3 participants