GPU inference error in ai.djl.examples.inference.clip.ImageTextComparison #3810

@geekwenjie

Description

Running the example ai.djl.examples.inference.clip.ImageTextComparison on a GPU fails with an error. Is this because the tokenizers do not support GPU? The error log from the GPU run is shown below:
```
Exception in thread "main" ai.djl.translate.TranslateException: ai.djl.engine.EngineException: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/transformers/models/clip/modeling_clip.py", line 35, in forward
_2 = (vision_model).forward1(pixel_values, )
_3, _4, = _2
_5 = (text_model).forward1(input_ids, attention_mask, )
~~~~~~~~~~~~~~~~~~~~ <--- HERE
_6, _7, = _5
_8 = (visual_projection).forward1(_3, )
File "code/torch/transformers/models/clip/modeling_clip.py", line 119, in forward1
inverted_mask = torch.rsub(torch.to(_41, 6), 1.)
attention_mask1 = torch.masked_fill(inverted_mask, torch.to(inverted_mask, 11), -3.4028234663852886e+38)
_42 = (encoder).forward1(_33, causal_attention_mask, attention_mask1, )
~~~~~~~~~~~~~~~~~ <--- HERE
_43 = (final_layer_norm).forward1(_42, )
_44 = ops.prim.NumToTensor(torch.size(_43, 0))
File "code/torch/transformers/models/clip/modeling_clip.py", line 234, in forward1
layers21 = self.layers
_0 = getattr(layers21, "0")
_68 = (_0).forward1(argument_1, causal_attention_mask, attention_mask, )
~~~~~~~~~~~~ <--- HERE
_69 = (_1).forward1(_68, causal_attention_mask, attention_mask, )
_70 = (_2).forward1(_69, causal_attention_mask, attention_mask, )
File "code/torch/transformers/models/clip/modeling_clip.py", line 277, in forward1
layer_norm1 = self.layer_norm1
_82 = (layer_norm1).forward1(argument_1, )
_83 = (self_attn).forward1(_82, causal_attention_mask, attention_mask, )
~~~~~~~~~~~~~~~~~~~ <--- HERE
input = torch.add(argument_1, _83)
_84 = (mlp).forward1((layer_norm2).forward1(input, ), )
File "code/torch/transformers/models/clip/modeling_clip.py", line 398, in forward1
attn_weights = torch.bmm(query_states, torch.transpose(key_states1, 1, 2))
_143 = torch.view(attn_weights, [_119, 8, _128, _142])
attn_weights4 = torch.add(_143, causal_attention_mask)
~~~~~~~~~ <--- HERE
_144 = [int(torch.mul(bsz, CONSTANTS.c2)), _127, _141]
attn_weights5 = torch.view(attn_weights4, _144)

Traceback of TorchScript, original code (most recent call last):
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py(293): forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1118): _slow_forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1130): _call_impl
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py(379): forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1118): _slow_forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1130): _call_impl
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py(650): forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1118): _slow_forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1130): _call_impl
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py(721): forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1118): _slow_forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1130): _call_impl
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py(1120): forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1118): _slow_forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1130): _call_impl
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/jit/_trace.py(967): trace_module
/Users/qingla/PycharmProjects/pt/testclip.py(20):
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

at ai.djl.inference.Predictor.batchPredict(Predictor.java:200)
at ai.djl.inference.Predictor.predict(Predictor.java:133)
at smartai.examples.vision.djl.ClipModel.compareTextAndImage(ClipModel.java:68)
at smartai.examples.vision.djl.ImageTextComparison.compareTextAndImage(ImageTextComparison.java:46)
at smartai.examples.vision.djl.ImageTextComparison.main(ImageTextComparison.java:36)

Caused by: ai.djl.engine.EngineException: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/transformers/models/clip/modeling_clip.py", line 35, in forward
_2 = (vision_model).forward1(pixel_values, )
_3, _4, = _2
_5 = (text_model).forward1(input_ids, attention_mask, )
~~~~~~~~~~~~~~~~~~~~ <--- HERE
_6, _7, = _5
_8 = (visual_projection).forward1(_3, )
File "code/torch/transformers/models/clip/modeling_clip.py", line 119, in forward1
inverted_mask = torch.rsub(torch.to(_41, 6), 1.)
attention_mask1 = torch.masked_fill(inverted_mask, torch.to(inverted_mask, 11), -3.4028234663852886e+38)
_42 = (encoder).forward1(_33, causal_attention_mask, attention_mask1, )
~~~~~~~~~~~~~~~~~ <--- HERE
_43 = (final_layer_norm).forward1(_42, )
_44 = ops.prim.NumToTensor(torch.size(_43, 0))
File "code/torch/transformers/models/clip/modeling_clip.py", line 234, in forward1
layers21 = self.layers
_0 = getattr(layers21, "0")
_68 = (_0).forward1(argument_1, causal_attention_mask, attention_mask, )
~~~~~~~~~~~~ <--- HERE
_69 = (_1).forward1(_68, causal_attention_mask, attention_mask, )
_70 = (_2).forward1(_69, causal_attention_mask, attention_mask, )
File "code/torch/transformers/models/clip/modeling_clip.py", line 277, in forward1
layer_norm1 = self.layer_norm1
_82 = (layer_norm1).forward1(argument_1, )
_83 = (self_attn).forward1(_82, causal_attention_mask, attention_mask, )
~~~~~~~~~~~~~~~~~~~ <--- HERE
input = torch.add(argument_1, _83)
_84 = (mlp).forward1((layer_norm2).forward1(input, ), )
File "code/torch/transformers/models/clip/modeling_clip.py", line 398, in forward1
attn_weights = torch.bmm(query_states, torch.transpose(key_states1, 1, 2))
_143 = torch.view(attn_weights, [_119, 8, _128, _142])
attn_weights4 = torch.add(_143, causal_attention_mask)
~~~~~~~~~ <--- HERE
_144 = [int(torch.mul(bsz, CONSTANTS.c2)), _127, _141]
attn_weights5 = torch.view(attn_weights4, _144)

Traceback of TorchScript, original code (most recent call last):
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py(293): forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1118): _slow_forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1130): _call_impl
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py(379): forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1118): _slow_forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1130): _call_impl
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py(650): forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1118): _slow_forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1130): _call_impl
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py(721): forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1118): _slow_forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1130): _call_impl
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py(1120): forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1118): _slow_forward
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/nn/modules/module.py(1130): _call_impl
/Users/qingla/PycharmProjects/pt/venv/lib/python3.9/site-packages/torch/jit/_trace.py(967): trace_module
/Users/qingla/PycharmProjects/pt/testclip.py(20):
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

at ai.djl.pytorch.jni.PyTorchLibrary.moduleRunMethod(Native Method)
at ai.djl.pytorch.jni.IValueUtils.forward(IValueUtils.java:57)
at ai.djl.pytorch.engine.PtSymbolBlock.forwardInternal(PtSymbolBlock.java:146)
at ai.djl.nn.AbstractBaseBlock.forward(AbstractBaseBlock.java:79)
at ai.djl.nn.Block.forward(Block.java:129)
at ai.djl.inference.Predictor.predictInternal(Predictor.java:150)
at ai.djl.inference.Predictor.batchPredict(Predictor.java:175)
... 4 more

```
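The final RuntimeError ("Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!") points at the traced TorchScript graph rather than the tokenizer: a tokenizer only produces plain token-id arrays, while the `causal_attention_mask` that the failing `torch.add` consumes is created inside the traced text model. When a model is traced on CPU, the device arguments of tensor-creating ops can be baked into the graph as CPU constants, so the saved artifact then only runs correctly on CPU. The traceback's own frames (`torch/jit/_trace.py(967): trace_module` called from `testclip.py(20)` under a `/Users/...` path) suggest the artifact was traced on a CPU-only machine, which would be consistent with this. A common workaround is to re-trace the model on the GPU. Below is a minimal, hypothetical sketch of such a re-trace, assuming a CUDA build of PyTorch and the `openai/clip-vit-base-patch32` checkpoint; the prompt text, image path, and output file name are placeholders, not part of the original example.

```python
# Hypothetical re-trace sketch: trace CLIP on cuda:0 so that tensors created
# inside forward() (e.g. the causal attention mask) land on the GPU.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

device = torch.device("cuda:0")

# torchscript=True makes the model return tuples, which torch.jit.trace needs.
model = CLIPModel.from_pretrained(
    "openai/clip-vit-base-patch32", torchscript=True
).to(device).eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

inputs = processor(
    text=["two dogs in the snow"],     # placeholder text
    images=Image.open("example.jpg"),  # placeholder image
    return_tensors="pt",
    padding=True,
)
inputs = {k: v.to(device) for k, v in inputs.items()}

# CLIPModel.forward takes (input_ids, pixel_values, attention_mask, ...),
# matching the serialized forward() shown in the traceback above.
traced = torch.jit.trace(
    model,
    (inputs["input_ids"], inputs["pixel_values"], inputs["attention_mask"]),
)
traced.save("clip_gpu.pt")  # load this artifact in DJL instead of the CPU trace
```

If this is the cause, the CPU-traced model bundled with the example would keep working on CPU (e.g. by forcing `Device.cpu()` in DJL), and a GPU-traced artifact like the one sketched above would be needed for GPU inference.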
