Make loading features from storage robust to order. #9
The current `load_features` implementation relies on features from each video (same video_id) being in a contiguous block. This matches how `store_features` organizes feature files. Update `load_features` to accept descriptors in any order by sorting by video_id (then by start timestamp) before constructing `VideoFeature` structures. Also change `store_features` to sort by video_id before storing features.
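The reordering described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the `VideoFeature` dataclass here is a simplified stand-in, and the flat `(video_id, start_timestamp, descriptor)` record layout and the `group_records` helper are assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import groupby
from operator import itemgetter
from typing import List, Sequence, Tuple

# Assumed flat record layout: (video_id, start timestamp, descriptor values).
Record = Tuple[int, float, Sequence[float]]


@dataclass
class VideoFeature:
    """Simplified stand-in for the repository's VideoFeature structure."""
    video_id: int
    timestamps: List[float]
    descriptors: List[Sequence[float]]


def group_records(records: Sequence[Record]) -> List[VideoFeature]:
    """Group records by video, accepting them in any input order."""
    # Sort by video_id first, then by start timestamp, so that each video's
    # records form one contiguous, time-ordered block.
    ordered = sorted(records, key=itemgetter(0, 1))
    grouped = []
    # groupby only merges adjacent equal keys, which is why the sort above
    # is required for records that arrive interleaved.
    for video_id, rows in groupby(ordered, key=itemgetter(0)):
        rows = list(rows)
        grouped.append(
            VideoFeature(
                video_id=video_id,
                timestamps=[r[1] for r in rows],
                descriptors=[r[2] for r in rows],
            )
        )
    return grouped


# Interleaved input: records for videos 1 and 2 are not contiguous.
records = [
    (2, 5.0, [0.0, 1.0]),
    (1, 3.0, [2.0, 3.0]),
    (2, 1.0, [4.0, 5.0]),
    (1, 0.0, [6.0, 7.0]),
]
video_features = group_records(records)
```

Without the sort, a grouping step that assumes contiguous blocks would produce two separate entries for each video in this input.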

    restored = load_features(f.name)

    features.sort(key=lambda x: x.video_id)
    restored.sort(key=lambda x: x.video_id)

Should we be testing that `restored` is already properly sorted when loading with `load_features`? I'm not sure we should sort it here.
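One way to act on this suggestion would be to sort only the expected `features` in the test and assert that the loaded result is already ordered. The helper below is a hypothetical sketch: `is_sorted_by_video_id` and the `Feature` namedtuple are illustrative names, not part of the repository, and it assumes `load_features` is meant to return features in ascending `video_id` order.

```python
from collections import namedtuple

# Hypothetical stand-in for a loaded feature; only video_id matters here.
Feature = namedtuple("Feature", ["video_id"])


def is_sorted_by_video_id(features) -> bool:
    """Return True if video_id never decreases from one feature to the next."""
    ids = [f.video_id for f in features]
    return all(a <= b for a, b in zip(ids, ids[1:]))
```

In the test, `assert is_sorted_by_video_id(restored)` could then replace the `restored.sort(...)` call, so that a regression in the loader's ordering would fail the test instead of being masked by the sort.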
For the sake of completeness, we also tracked down what we believe caused the memory error. (Referenced code: line 60 in 5d8af86; `vsc2022/vsc/descriptor_eval_lib.py`, line 39 in 3afe07a.) The resulting calculated number of query candidates to generate for a given input query descriptor is then more than an order of magnitude larger than we intend. When we exhaustively search for and return this number of candidates in our exponential iterator, we return increasingly large copies of matrices until we run out of memory.
Hi @edpizzi! Thank you for your pull request. We require contributors to sign our Contributor License Agreement, and yours needs attention. You currently have a record in our system, but the CLA is no longer valid and will need to be resubmitted.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged accordingly.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!