Question about details of data preprocessing.

Hi!  Thanks for releasing the code!

I trained the model with the features you provided and achieved similar performance. Now I'm trying to train a model on data I preprocessed myself. I downloaded the ImageNet-VID dataset and extracted optical flow using the OpenCV TVL1 algorithm as implemented by [MMAction](https://github.com/open-mmlab/mmaction/blob/master/DATASET.md). The optical flow is computed on the video frames at their original size, without resizing. Then I crop the region proposals according to the [tubes you provided](https://drive.google.com/file/d/1SHwXtlb7V8PH4_60-0-VZYL-7kXEG_Wj/view?usp=sharing) and resize them to 224*224. Finally, following [this](https://github.com/deepmind/kinetics-i3d), I rescale the RGB and flow image values to between -1 and 1 (by x/127.5-1) and feed them to the official I3D model.
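To make the crop, resize, and rescale steps concrete, here is a minimal sketch of what I mean. The function name `crop_resize_rescale` and the `(x1, y1, x2, y2)` box layout are assumptions for illustration, not the format of the provided tubes, and the random frame just stands in for a decoded video frame.

```python
import numpy as np
from PIL import Image


def crop_resize_rescale(frame, box, size=224):
    """Crop a box from a frame, resize it bilinearly, and map values to [-1, 1].

    frame: HxWx3 uint8 array (RGB frame, or a flow image stored as uint8).
    box:   (x1, y1, x2, y2) in pixels; this layout is an assumption.
    """
    x1, y1, x2, y2 = box
    crop = Image.fromarray(frame).crop((x1, y1, x2, y2))
    crop = crop.resize((size, size), Image.BILINEAR)
    arr = np.asarray(crop, dtype=np.float32)
    # Same rescaling as in the text: x / 127.5 - 1 maps [0, 255] to [-1, 1].
    return arr / 127.5 - 1.0


# Stand-in for a decoded frame; real frames would come from the video.
frame = np.random.randint(0, 256, size=(360, 480, 3), dtype=np.uint8)
patch = crop_resize_rescale(frame, (100, 50, 300, 250))
```

The order here is crop, then resize, then rescale; if the released features used a different order or interpolation, that alone could explain part of the gap.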

However, the features I get differ somewhat from the ones I downloaded. The cosine similarity is about 0.95 for the I3D RGB features and about 0.8 for the I3D flow features. The similarity is not too low, but there is still a gap, especially for the flow features. Is there anything wrong with the process described above?
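For reference, the comparison is a plain cosine similarity between two feature vectors, roughly as sketched below. The helper name `cosine_similarity` and the `eps` guard are my own choices for illustration.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity between two feature vectors, flattened to 1-D."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    # eps avoids division by zero when one vector is all zeros.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


same = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Note that this measure ignores overall scale, so a pure magnitude difference between the two sets of features would not show up in it.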

Here are some questions.
1. After resizing the flow images to 224*224, do their values need to be scaled?
2. How did you crop and normalize the images? I use PIL to read images and bilinear interpolation to resize them.
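To make question 1 concrete: if the flow were kept as raw pixel displacements (an assumption; my pipeline stores flow as images), resizing a crop changes the pixel grid, so one option would be to scale the horizontal and vertical components by the resize factors. A sketch of that option, with the hypothetical helper `rescale_flow_values`:

```python
import numpy as np


def rescale_flow_values(flow, orig_size, new_size):
    """Scale flow displacements to match a spatial resize.

    flow:      HxWx2 array of (u, v) displacements in pixels, already
               resized spatially to new_size.
    orig_size: (width, height) of the crop before resizing.
    new_size:  (width, height) after resizing.
    """
    orig_w, orig_h = orig_size
    new_w, new_h = new_size
    scaled = flow.astype(np.float32).copy()
    scaled[..., 0] *= new_w / orig_w  # horizontal component u
    scaled[..., 1] *= new_h / orig_h  # vertical component v
    return scaled


flow = np.ones((224, 224, 2), dtype=np.float32)
scaled = rescale_flow_values(flow, orig_size=(448, 112), new_size=(224, 224))
```

I don't know whether the released features applied this kind of scaling or left the values unchanged, which is exactly what I'd like to confirm.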

Thanks a lot!
