GPU users: onnxruntime (CPU) overwrites onnxruntime-gpu binaries when both are installed by pip/uv #608
Description
Summary
When fastembed is used in a GPU environment alongside onnxruntime-gpu, the CUDAExecutionProvider silently disappears at runtime because pip/uv end up installing both onnxruntime (CPU) and onnxruntime-gpu simultaneously. Since both packages install into the same site-packages/onnxruntime/ directory, whichever is installed last wins — and in practice the CPU build's onnxruntime_pybind11_state.so overwrites the GPU build's, stripping CUDAExecutionProvider from the available providers list.
Root cause
fastembed declares a hard dependency on onnxruntime > 1.20.0 (by package name). Through onnxruntime-gpu 1.19.x, the GPU wheel declared Provides-Dist: onnxruntime in its metadata, which told pip/uv that onnxruntime-gpu satisfies any onnxruntime requirement. This metadata is absent from onnxruntime-gpu >= 1.20.0 (confirmed in 1.24.2). As a result:
- uv resolves `onnxruntime > 1.20.0` and installs `onnxruntime==1.24.2` (CPU build)
- The user also has `onnxruntime-gpu==1.24.2` in their project requirements
- Both install to `site-packages/onnxruntime/`; the CPU `onnxruntime_pybind11_state.so` overwrites the GPU one
- `ort.get_available_providers()` returns `['CPUExecutionProvider']` instead of `['CUDAExecutionProvider', 'CPUExecutionProvider']`
The failure mode is silent — no import error, no warning, just no GPU acceleration.
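Because the failure is silent, a preflight check at startup can surface it early. The helper below is a hypothetical sketch, not part of fastembed; it only inspects installed distribution metadata, so it runs even without onnxruntime present:

```python
import sys
from importlib.metadata import distributions

def conflicting_ort_installs(names):
    """True if both the CPU and GPU onnxruntime wheels are installed."""
    installed = {n.lower() for n in names}
    return {"onnxruntime", "onnxruntime-gpu"} <= installed

installed_names = [
    d.metadata["Name"] for d in distributions() if d.metadata["Name"]
]
if conflicting_ort_installs(installed_names):
    print(
        "WARNING: onnxruntime and onnxruntime-gpu are both installed; "
        "whichever was installed last owns the shared binaries.",
        file=sys.stderr,
    )
```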
Minimal repro (Docker)
```dockerfile
FROM ghcr.io/astral-sh/uv:python3.13-bookworm-slim

# Project that depends on fastembed + onnxruntime-gpu
# (--system installs into the image interpreter; no venv in this minimal repro)
RUN uv pip install --system fastembed onnxruntime-gpu
RUN python -c "import onnxruntime as ort; print(ort.get_available_providers())"
# Output: ['CPUExecutionProvider'] <-- CUDAExecutionProvider is gone
```

The `libonnxruntime_providers_cuda.so` is present and all of its `.so` dependencies resolve correctly, but it links against `Provider_GetHost` from `libonnxruntime_providers_shared.so`, a symbol exported only by the GPU pybind11 build, not the CPU one. So the `dlopen` of the CUDA provider fails silently.
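The overwrite can also be confirmed from wheel metadata: both wheels list the same binary in their RECORD, so two distributions claim ownership of one file. A small diagnostic sketch (the helper name is mine):

```python
from importlib.metadata import distributions

def wheels_shipping(target, dists):
    """Names/versions of distributions whose file list contains `target`."""
    hits = []
    for dist in dists:
        name = dist.metadata["Name"]
        if name and any(target in str(f) for f in (dist.files or [])):
            hits.append(f"{name} {dist.version}")
    return hits

# In a broken environment this prints BOTH wheels for the same .so,
# even though only one build's binary actually survives on disk.
print(wheels_shipping("onnxruntime_pybind11_state", distributions()))
```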
Workaround
Reinstall onnxruntime-gpu after uv sync to restore the GPU binaries:
```dockerfile
RUN uv sync --frozen --no-dev
RUN uv pip install --python .venv/bin/python --reinstall "onnxruntime-gpu[cuda,cudnn]==1.24.2"
```

This is fragile (order-dependent, easy to get wrong) and can't be expressed cleanly in `pyproject.toml`.
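After the reinstall, a quick sanity check confirms whether the CUDA provider survived. This is a sketch that degrades gracefully when onnxruntime is not installed:

```python
def has_cuda_provider(providers):
    """True if onnxruntime reports the CUDA execution provider."""
    return "CUDAExecutionProvider" in providers

try:
    import onnxruntime as ort
    providers = ort.get_available_providers()
    print(providers)
    if not has_cuda_provider(providers):
        print("CPU-only build detected; re-run the onnxruntime-gpu reinstall")
except ImportError:
    pass  # onnxruntime not installed in this environment
```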
Suggested fixes
Option A — fastembed side (preferred): Add a gpu extra that replaces the onnxruntime dep with onnxruntime-gpu:
```toml
[project]
dependencies = [
    # remove the direct onnxruntime pin, or make it conditional
]

[project.optional-dependencies]
gpu = ["onnxruntime-gpu"]
```

Guard the import with try/except so either package works. This lets GPU users run `pip install "fastembed[gpu]"` and get a coherent environment.
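A minimal sketch of the guarded import: both wheels expose the same `onnxruntime` import name, so one try/except covers either; the error text below is illustrative, not fastembed's actual message:

```python
def load_onnxruntime():
    """Import onnxruntime from whichever wheel is installed (CPU or GPU)."""
    try:
        import onnxruntime as ort
    except ImportError as exc:
        raise ImportError(
            "fastembed needs onnxruntime or onnxruntime-gpu; "
            'GPU users can run: pip install "fastembed[gpu]"'
        ) from exc
    return ort
```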
Option B — onnxruntime side: Restore Provides-Dist: onnxruntime in onnxruntime-gpu's wheel metadata so package managers treat them as interchangeable. This was present in onnxruntime-gpu <= 1.19.x. A related issue is tracked at microsoft/onnxruntime#22107.
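Whether a given installed wheel declares this metadata can be checked with `importlib.metadata`. A sketch (returns None when the package is not installed):

```python
from importlib.metadata import metadata, PackageNotFoundError

def provides_dist(dist_name):
    """Return the wheel's Provides-Dist entries, or None if not installed."""
    try:
        md = metadata(dist_name)
    except PackageNotFoundError:
        return None
    return md.get_all("Provides-Dist") or []

# On onnxruntime-gpu <= 1.19.x this includes 'onnxruntime';
# on >= 1.20.0 it comes back empty.
print(provides_dist("onnxruntime-gpu"))
```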
Environment
- `fastembed==0.7.4`, `onnxruntime==1.24.2`, `onnxruntime-gpu==1.24.2`
- Python 3.13, uv 0.6.x
- Docker image: `ghcr.io/astral-sh/uv:python3.13-bookworm-slim`
- Host: NVIDIA RTX 4090 + RTX 5090, driver 570.x, nvidia-container-toolkit 1.17.6