
Conversation

@justincdavis

Summary

This PR adds the CV-CUDA backend kernel for the Normalize transform.

How to use

```python
import cvcuda
import torchvision.transforms.v2.functional as F

cvc_tensor = cvcuda.Tensor((1, 224, 224, 3), cvcuda.Type.F32, cvcuda.TensorLayout.NHWC)
# Dispatches to F.normalize_cvcuda
normalized_tensor = F.normalize(cvc_tensor, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
```

Run unit tests

```shell
pytest test/test_transforms_v2.py::TestNormalizeCVCUDA
```

```
...
60 passed in 0.59s
```

@pytorch-bot

pytorch-bot bot commented Nov 19, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9279

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 974ffca with merge base 1e53952:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla

meta-cla bot commented Nov 19, 2025

Hi @justincdavis!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

Member

@AntoineSimoulin AntoineSimoulin left a comment

Hey @justincdavis, thanks for submitting the PR, this is looking good :) I suggested some minor changes. I think we mainly need to make sure the tests pass when cvcuda is not installed!

```python
    (F.normalize_video, tv_tensors.Video),
    pytest.param(
        F._misc._normalize_cvcuda,
        _import_cvcuda().Tensor,
```
Member

@justincdavis it seems that _import_cvcuda().Tensor is still raising an error if cvcuda is not installed. Maybe we can just use cvcuda.Tensor here and see if this works better?

Author

Thank you for pointing this out! I replaced the actual cvcuda.Tensor type with the string "cvcuda.Tensor"; inside the function we then resolve the string back to the cvcuda.Tensor type. LMK if this looks like a reasonable solution!
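The lazy resolution described here could look roughly like the sketch below (function name and structure are hypothetical, not the actual PR code): the parametrization stores the string "cvcuda.Tensor", so collecting the tests never imports cvcuda, and the real type is only resolved when a CV-CUDA test actually runs.

```python
import importlib

# Hypothetical sketch of the string-based resolution described above;
# not the actual PR code. Storing the string in the parametrization
# means test collection never triggers an import of cvcuda.
def resolve_input_type(make_input_type):
    if make_input_type == "cvcuda.Tensor":
        # Only reached when a CV-CUDA test actually runs, so the
        # import error surfaces inside that test alone.
        cvcuda = importlib.import_module("cvcuda")
        return cvcuda.Tensor
    return make_input_type
```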

@justincdavis
Author

justincdavis commented Nov 24, 2025

Following up on my comment in the _normalize_cvcuda function itself: CV-CUDA requires that the mean and scale tensors be on-device when we call cvcuda.normalize, so a host->device memcpy must occur twice for each normalize call with the CV-CUDA backend. We could reduce the impact of this with a helper function that creates the tuple[cvcuda.Tensor, cvcuda.Tensor] from the mean/std parameters. From what I see in the codebase, this would be a new kind of feature in torchvision for a functional transform.

```python
# CV-CUDA requires float32 tensors for the mean/std parameters.
# At small batch sizes, building them is costly relative to the normalize operation.
# If CV-CUDA is known to be the backend, this could be optimized:
#   For the Normalize class: by creating the tensors at class initialization time.
#   For the functional API: by caching the tensors in a helper with
#     functools.lru_cache (would it even be worth it?).
# Since CV-CUDA is 1) not the default backend and 2) only strictly faster at
# large batch sizes, ignore this for now.
```
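The functools.lru_cache idea floated in the comment could be sketched like this (all names are hypothetical; a plain tuple stands in for the tuple[cvcuda.Tensor, cvcuda.Tensor] so the caching behavior is visible without CV-CUDA installed):

```python
import functools

# Hypothetical sketch of the lru_cache helper discussed above; in the real
# backend the cached value would be the on-device pair of cvcuda.Tensors
# built from mean/std. A plain tuple stands in here for illustration.
@functools.lru_cache(maxsize=16)
def _cached_param_tensors(mean, std):
    # mean/std must arrive as tuples: lru_cache requires hashable keys,
    # and the lists users pass to F.normalize are not hashable.
    return (list(mean), list(std))

def get_param_tensors(mean, std):
    # Normalize list inputs to tuples before hitting the cache, so
    # repeated calls with the same mean/std skip the rebuild entirely.
    return _cached_param_tensors(tuple(mean), tuple(std))
```

Whether the cache pays off depends on how often the same mean/std pair recurs; as the comment notes, it may not be worth it given that CV-CUDA is not the default backend.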

@AntoineSimoulin
Member

Hey @justincdavis, this is looking good to me. I don't think the failing test is related to this PR; it seems like a false-positive alert to me! Can you sign our Contributor License Agreement (see the meta-cla bot comment in the discussion)?

@meta-cla meta-cla bot added the cla signed label Dec 2, 2025
@justincdavis force-pushed the feat/normalize_cvcuda branch from 778ad32 to d3ef0bd on December 4, 2025 19:03
@justincdavis force-pushed the feat/normalize_cvcuda branch from 6b7dd65 to 0f8910e on December 4, 2025 19:07
Contributor

@zy1git zy1git left a comment

I left some comments. Some of them refer to PRs merged in the past three weeks.

Member

@NicolasHug NicolasHug left a comment

Thanks a lot for the PR @justincdavis , I left a review for @zy1git to address.

Comment on lines 5663 to 5666

```python
if is_cvcuda:
    assert_close(actual, expected, rtol=0, atol=1e-6)
else:
    assert_equal(actual, expected)
```
Member

I'm surprised atol=1e-6 is needed, I thought it was the default for float32. Let's try without it and see if the CI is happy?

Contributor

I removed the rtol=0, atol=1e-6 and the test passed. However, the defaults for assert_close are rtol=1.3e-6, atol=1e-5, so if the original rtol=0, atol=1e-6 passes the test, the default version will certainly pass as well.
atol=1e-6 is the stricter threshold. I will change it to the defaults in a new commit; please let me know if you would rather keep the original values.
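For reference, assert_close-style comparisons pass elementwise when |actual - expected| <= atol + rtol * |expected|, which is why the default setting is strictly looser than rtol=0, atol=1e-6. A plain-Python illustration of the rule (not the torch implementation):

```python
# Elementwise rule used by assert_close-style checks:
# |actual - expected| <= atol + rtol * |expected|
def close(actual, expected, rtol, atol):
    return abs(actual - expected) <= atol + rtol * abs(expected)

# A float32-rounding-sized error (~5e-7) near 1.0 passes both settings:
strict = close(1.0000005, 1.0, rtol=0.0, atol=1e-6)       # PR's original setting
default = close(1.0000005, 1.0, rtol=1.3e-6, atol=1e-5)   # assert_close float32 defaults
```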

```python
if is_cvcuda:
    image = F.cvcuda_to_tensor(image)[0].cpu()

expected = self._reference_normalize_image(image, mean=mean, std=std)
```
Member

Note for self: double-checking that this is doing the right conversion. image is not a tensor, and we're using _reference_normalize_image as the ref; we'll be comparing our cvcuda tensor to this reference tensor. Seems OK.

Comment on lines 5651 to 5652

```python
if is_cvcuda and dtype != torch.float32:
    pytest.skip("CVCUDA only supports float32 for normalize")
```
Member

This could be an xfail instead of a skip. See https://docs.pytest.org/en/stable/how-to/skipping.html

Contributor

I changed pytest.skip to pytest.xfail.

Member

Note for self: need to check that all the relevant tests have been properly parametrized.

```python
# torchvision only supports uint and float; right now CV-CUDA doesn't expose
# float16, so only check float32. In the future, add float16 once it is
# exposed in CV-CUDA.
if not (image.dtype == cvcuda.Type.F32):
    raise ValueError(f"Input tensor should be a float tensor. Got {image.dtype}.")
```
Member

We should have a test that asserts non-float32 leads to an error and checks the error message.
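Such a test could be sketched as below. The dtype check here is a stand-in so the pattern runs without CV-CUDA; the real test would use pytest.raises(ValueError, match=...) against the backend with a uint8 cvcuda.Tensor (all names hypothetical):

```python
# Stand-in for the dtype check quoted above, so the error-message
# assertion pattern can run without CV-CUDA installed.
def _check_dtype(dtype_name):
    if dtype_name != "F32":
        raise ValueError(f"Input tensor should be a float tensor. Got {dtype_name}.")

# Assert both that the error is raised and what its message says.
try:
    _check_dtype("U8")
except ValueError as exc:
    message = str(exc)
else:
    raise AssertionError("expected ValueError for non-float32 input")
```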


Contributor

@zy1git zy1git left a comment

I addressed the comments by pushing a new commit. Feel free to take a look.

