See the relevant part of the code; it appears to special-case each specific arch: https://github.com/huggingface/text-embeddings-inference/blob/main/Dockerfile-cuda#L50-L63