Implementation of contrast() seems wrong #109
I have created #108 for demonstration purposes. In short: the `mean` computed here
big_vision/big_vision/pp/autoaugment.py, lines 209 to 213 in 01edb81:

```python
# Compute the grayscale histogram, then compute the mean pixel value,
# and create a constant image size of that value. Use that as the
# blending degenerate target of the original image.
hist = tf.histogram_fixed_width(degenerate, [0, 255], nbins=256)
mean = tf.reduce_sum(tf.cast(hist, tf.float32)) / 256.0
```
is supposed to be the mean pixel value, but as written it merely sums the histogram counts (i.e. the total pixel count, height * width) and divides by 256. For the standard `decode_jpeg_and_inception_crop(224)`, I have verified that `mean` is always 224 * 224 / 256 = 196, regardless of image content. I have also created the following calibration grid to double-check the transform's behavior, with RGB values (192, 64, 64) for the reddish squares and (64, 192, 192) for the bluish squares:

[image: calibration grid of reddish and bluish squares]
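A grid like this can be reconstructed along the following lines (a minimal sketch; the 224x224 size, the checkerboard layout, and the 32-pixel squares are my assumptions, chosen so the spurious mean works out to 224 * 224 / 256 = 196 as above):

```python
import numpy as np
import tensorflow as tf

# Hypothetical reconstruction of the calibration grid: a 224x224 checkerboard
# of 32-pixel reddish and bluish squares.
reddish = np.array([192, 64, 64], dtype=np.uint8)
bluish = np.array([64, 192, 192], dtype=np.uint8)
ys, xs = np.meshgrid(np.arange(224), np.arange(224), indexing="ij")
checker = ((ys // 32 + xs // 32) % 2).astype(bool)
tf_color_tile = tf.constant(np.where(checker[..., None], bluish, reddish))
```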
As it is, `contrast(tf_color_tile, 1.9)` returns the following:

with RGB values (188, 0, 0) and (0, 188, 188).
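These values are consistent with the blending step, assuming the usual AutoAugment blend result = clip(mean + factor * (pixel - mean), 0, 255) and the spurious mean of 196 (both are inferences from the snippet above, not verified against the repo):

```python
import numpy as np

factor = 1.9
spurious_mean = 224 * 224 / 256  # = 196.0, regardless of image content
blend = lambda v: np.clip(spurious_mean + factor * (v - spurious_mean), 0, 255)
print(blend(192), blend(64))  # 188.4 and 0.0 -> (188, 0, 0) after uint8 cast
```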

with RGB values (249, 6, 6) and (6, 249, 249), which is more in line with other implementations: the grid's true grayscale mean is ~128, and 128 + 1.9 * (192 - 128) = 249.6 while 128 + 1.9 * (64 - 128) = 6.4. For example, the approximate torchvision equivalent
```python
from torchvision.transforms.v2 import functional as F

F.adjust_contrast(torch_color_tile, contrast_factor=1.9)
```

returns RGB values (250, 6, 6) and (6, 250, 250).
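For reference, a minimal sketch of a corrected mean computation, using `degenerate` from the snippet above (this is only my sketch; the actual change proposed in #108 may differ):

```python
import tensorflow as tf

# Weight each histogram bin by its pixel value and normalize by the total
# pixel count, instead of summing the counts and dividing by 256.
hist = tf.cast(tf.histogram_fixed_width(degenerate, [0, 255], nbins=256), tf.float32)
mean = tf.reduce_sum(hist * tf.range(256, dtype=tf.float32)) / tf.reduce_sum(hist)
# Equivalently: mean = tf.reduce_mean(tf.cast(degenerate, tf.float32))
```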