Skip to content

ai.djl.engine.EngineException: Could not run 'aten::empty_strided' with arguments from the 'CUDA' backend #3806

@akumaigorodski

Description

@akumaigorodski

I'm having this exception while preparing features for a training, the system is Ubuntu 24.04, here's the relevant info:

// build.sbt:

libraryDependencies += "ai.djl" % "api" % "0.34.0"

libraryDependencies += "ai.djl.pytorch" % "pytorch-engine" % "0.34.0"

libraryDependencies += "ai.djl.pytorch" % "pytorch-jni" % "2.7.1-0.34.0"

libraryDependencies += ("ai.djl.pytorch" % "pytorch-native-cu124" % "2.5.1").classifier("linux-x86_64")
// Where this happens in code:
println(s"targetShape=$targetShape")
featureNDArrays :+= manager.create(splitFeatures.toArray).reshape(targetShape).toType(DataType.FLOAT32, false)
valueNDArrays :+= manager.create(splitValues.toArray).toType(DataType.FLOAT32, false)
...
targetShape=(-1, 48, 10)
ai.djl.engine.EngineException: Could not run 'aten::empty_strided' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::empty_strided' is only available for these backends: [CPU, Meta, QuantizedCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastMTIA, AutocastXPU, AutocastMPS, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

CPU: registered at /pytorch/build/aten/src/ATen/RegisterCPU_2.cpp:2484 [kernel]
Meta: registered at /pytorch/build/aten/src/ATen/RegisterMeta_0.cpp:5517 [kernel]
QuantizedCPU: registered at /pytorch/build/aten/src/ATen/RegisterQuantizedCPU_0.cpp:485 [kernel]
BackendSelect: registered at /pytorch/build/aten/src/ATen/RegisterBackendSelect.cpp:792 [kernel]
Python: registered at /pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:194 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at /pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:479 [backend fallback]
Functionalize: registered at /pytorch/aten/src/ATen/FunctionalizeFallbackKernel.cpp:349 [backend fallback]
Named: registered at /pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Conjugate: fallthrough registered at /pytorch/aten/src/ATen/ConjugateFallback.cpp:21 [kernel]
Negative: fallthrough registered at /pytorch/aten/src/ATen/native/NegateFallback.cpp:22 [kernel]
ZeroTensor: fallthrough registered at /pytorch/aten/src/ATen/ZeroTensorFallback.cpp:90 [kernel]
ADInplaceOrView: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:100 [backend fallback]
AutogradOther: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradCPU: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradCUDA: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradHIP: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradXLA: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradMPS: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradIPU: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradXPU: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradHPU: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradVE: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradLazy: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradMTIA: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradPrivateUse1: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradPrivateUse2: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradPrivateUse3: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradMeta: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradNestedTensor: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
Tracer: registered at /pytorch/torch/csrc/autograd/generated/TraceType_2.cpp:17801 [kernel]
AutocastCPU: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:322 [backend fallback]
AutocastMTIA: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:466 [backend fallback]
AutocastXPU: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:504 [backend fallback]
AutocastMPS: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:209 [backend fallback]
AutocastCUDA: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:165 [backend fallback]
FuncTorchBatched: registered at /pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:731 [backend fallback]
BatchedNestedTensor: registered at /pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:758 [backend fallback]
FuncTorchVmapMode: fallthrough registered at /pytorch/aten/src/ATen/functorch/VmapModeRegistrations.cpp:27 [backend fallback]
Batched: registered at /pytorch/aten/src/ATen/LegacyBatchingRegistrations.cpp:1075 [backend fallback]
VmapMode: fallthrough registered at /pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at /pytorch/aten/src/ATen/functorch/TensorWrapper.cpp:208 [backend fallback]
PythonTLSSnapshot: registered at /pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:202 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at /pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:475 [backend fallback]
PreDispatch: registered at /pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:206 [backend fallback]
PythonDispatcher: registered at /pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:198 [backend fallback]

	at ai.djl.pytorch.jni.PyTorchLibrary.torchFromBlob(Native Method)
	at ai.djl.pytorch.jni.JniUtils.createNdFromByteBuffer(JniUtils.java:162)
	at ai.djl.pytorch.engine.PtNDManager.create(PtNDManager.java:77)
	at ai.djl.pytorch.engine.PtNDManager.create(PtNDManager.java:31)
	at ai.djl.ndarray.NDManager.create(NDManager.java:510)
	at ai.djl.ndarray.NDManager.create(NDManager.java:449)
	at fovea.utils.Common$.$anonfun$featuresToNDArray$1(Common.scala:40)
	at fovea.utils.Common$.$anonfun$featuresToNDArray$1$adapted(Common.scala:37)
$ nvidia-smi
Fri Sep 26 19:53:24 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.65.06              Driver Version: 580.65.06      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5090        Off |   00000000:05:00.0  On |                  N/A |
|  0%   43C    P8             37W /  450W |     512MiB /  32607MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            2665      G   /usr/lib/xorg/Xorg                      153MiB |
|    0   N/A  N/A            2893      G   /usr/bin/gnome-shell                     41MiB |
|    0   N/A  N/A           10518      G   /usr/bin/nautilus                        19MiB |
|    0   N/A  N/A           11589      G   .../6836/usr/lib/firefox/firefox        213MiB |
|    0   N/A  N/A           14389      G   ...6798/usr/bin/telegram-desktop          6MiB |
+-----------------------------------------------------------------------------------------+
$ dpkg -l | grep cudnn
ii  cudnn9-cuda-13                                  9.13.0.50-1                              amd64        NVIDIA cuDNN for CUDA 13
ii  cudnn9-cuda-13-0                                9.13.0.50-1                              amd64        NVIDIA cuDNN for CUDA 13.0
ii  libcudnn9-cuda-13                               9.13.0.50-1                              amd64        cuDNN runtime libraries for CUDA 13.0
ii  libcudnn9-dev-cuda-13                           9.13.0.50-1                              amd64        cuDNN development libraries for CUDA 13.0
ii  libcudnn9-headers-cuda-13                       9.13.0.50-1                              amd64        cuDNN header files for CUDA 13.0
ii  libcudnn9-static-cuda-13                        9.13.0.50-1                              amd64        cuDNN static libraries for CUDA 13.0

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions