Hi! Thanks for releasing the code!
I trained the model with the features you provided and achieved similar performance. Now I'm trying to train a model using data I processed myself. I downloaded the ImageNet-VID dataset and extracted optical flow using the OpenCV TVL1 algorithm as implemented in MMAction. The optical flow is calculated at the original size of the video frames, without resizing. Then I crop the region proposals according to the tubes you provided and resize them to 224*224. Finally, following this, I rescale the values of the RGB and flow images to between -1 and 1 (by x/127.5-1) and feed them to the official I3D model.
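To make the last step concrete, here is a minimal sketch of the rescaling, assuming the RGB and flow images are stored as 8-bit values in [0, 255]. The function name `rescale_to_unit_range` is hypothetical and only illustrates the formula x/127.5-1; it is not taken from the released code.

```python
def rescale_to_unit_range(pixels):
    """Map 8-bit values in [0, 255] to floats in [-1, 1] via x / 127.5 - 1.

    Assumption: `pixels` is a flat sequence of numbers in [0, 255].
    """
    return [p / 127.5 - 1.0 for p in pixels]


# The two ends of the 8-bit range map to the two ends of [-1, 1].
print(rescale_to_unit_range([0, 255]))
```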
However, the features I get differ somewhat from the ones I downloaded. The cosine similarity is about 0.95 for the i3d-RGB feature and about 0.8 for the i3d-Flow feature. The similarity is not too low, but there is still a gap, especially for the flow feature. Is there anything wrong with the process described above?
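For clarity, the comparison is the standard cosine similarity between my feature vector and the downloaded one. A minimal sketch, assuming both features are flattened to 1-D sequences of equal length (the function name is hypothetical):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Because the measure ignores overall magnitude, a constant scale factor on the features would not lower it; the gap must come from a change in direction.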
Here are some questions:
- After resizing the flow image to 224*224, do its values need to be scaled?
- How do you crop and normalize the images? I use PIL to read the images and bilinear interpolation to resize them.
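To make the first question concrete: if the flow stores displacements in pixels of the original crop, then resizing the crop changes the pixel grid, and one possible correction is to scale each component by the resize ratio. The sketch below shows that idea only. Whether the released features were computed this way is exactly what I am unsure about, and the function name and arguments are hypothetical.

```python
def scale_flow_vector(u, v, src_w, src_h, dst_w=224, dst_h=224):
    """Rescale a pixel-unit flow vector (u, v) after resizing a crop.

    Assumption: u and v are horizontal and vertical displacements
    measured in pixels of the src_w x src_h crop.
    """
    return u * (dst_w / src_w), v * (dst_h / src_h)


# A 448x448 crop shrunk to 224x224 halves every displacement.
print(scale_flow_vector(4.0, 2.0, 448, 448))
```

If the flow was already quantized to 8-bit images before cropping, this correction would have to be applied before quantization, which my current pipeline does not do.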
Thanks a lot!