Question about details of data preprocessing. #3

@rogerbao

Hi! Thanks for releasing the code!

I have trained the model with the features you provided and achieved similar performance. Now I'm trying to train a model using data I preprocessed myself. I downloaded the ImageNet-VID dataset and extracted optical flow using the OpenCV TVL1 algorithm as implemented in MMAction. The optical flow is computed on the original-size video frames, without resizing. Then I crop the region proposals according to the tubes you provided and resize them to 224*224. Finally, following this, I rescale the values of the RGB and flow images to between -1 and 1 (by x/127.5-1) and feed them to the official i3d model.
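For reference, the value rescaling described above can be sketched as follows. This is only an illustration of the arithmetic, not the repository's actual preprocessing. The helper names (`rescale_pixel`, `flow_to_uint8`) and the truncation bound of 20 are assumptions: the sketch assumes the flow was stored as 8-bit images by clipping displacements to `[-bound, bound]` before quantization, in which case `x/127.5-1` recovers roughly `flow / bound`.

```python
def rescale_pixel(x):
    """Map an 8-bit value in [0, 255] to [-1, 1] via x / 127.5 - 1."""
    return x / 127.5 - 1.0


def flow_to_uint8(f, bound=20.0):
    """Quantize one flow displacement (in pixels) to an 8-bit value.

    Assumption: displacements are clipped to [-bound, bound] and mapped
    linearly onto [0, 255]. The bound of 20 is a hypothetical value.
    """
    clipped = max(-bound, min(bound, f))
    return int(round((clipped + bound) * 255.0 / (2.0 * bound)))


def rescale_image(rows):
    """Apply rescale_pixel to every value of a 2D list of 8-bit values."""
    return [[rescale_pixel(v) for v in row] for row in rows]


if __name__ == "__main__":
    # A displacement of 10 px with bound 20 ends up close to 0.5 after rescaling.
    print(rescale_pixel(flow_to_uint8(10.0)))
```

If the two pipelines use different bounds, or if one of them feeds raw float flow instead of quantized images, the inputs to the flow stream would differ by a scale factor even though the RGB inputs match.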

However, the features I get are somewhat different from the ones I downloaded. The cosine similarity is about 0.95 for the i3d-RGB features and about 0.8 for the i3d-Flow features. The similarity is not too low, but there is still a gap, especially for the flow features. Is there anything wrong with the process I described above?
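The comparison above uses the standard cosine similarity between two feature vectors. A minimal, dependency-free sketch is shown below; the function name and the plain-list inputs are assumptions for illustration, and in practice the features would more likely be arrays loaded from disk.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical toy vectors standing in for extracted and downloaded features.
    print(cosine_similarity([1.0, 2.0, 3.0], [1.0, 2.0, 2.5]))
```

Note that cosine similarity ignores a global scale factor, so a value of 0.8 points to a difference in the direction of the feature vector and not only in its magnitude.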

Here are some questions:

  1. After resizing the flow image to 224*224, do its values need to be scaled?
  2. How should the images be cropped and normalized? I use PIL to read images and bilinear interpolation to resize them.
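To make question 1 concrete: flow values are displacements in pixels, so when a crop of size `crop_w` x `crop_h` is resized to 224*224, the same motion expressed in the resized image's pixel units changes by the resize ratio along each axis. The sketch below shows that scaling. Whether the released features applied it is exactly what is being asked, so this is an assumption and not a description of the authors' pipeline; `scale_flow_for_resize` is a hypothetical helper.

```python
def scale_flow_for_resize(u, v, crop_w, crop_h, out_size=224):
    """Rescale one flow vector (u, v) after resizing a crop to out_size x out_size.

    u is the horizontal and v the vertical displacement, both in pixels of
    the original crop. The result is in pixels of the resized image.
    """
    if crop_w <= 0 or crop_h <= 0:
        raise ValueError("crop dimensions must be positive")
    return u * out_size / crop_w, v * out_size / crop_h


if __name__ == "__main__":
    # A 112 x 448 crop: horizontal motion doubles, vertical motion halves.
    print(scale_flow_for_resize(2.0, 3.0, crop_w=112, crop_h=448))
```

For small or strongly non-square proposals this factor can be far from 1, which would affect the flow stream much more than the RGB stream.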

Thanks a lot!
