🐛 Bugs / Unexpected behaviors
Hi! I’m running evaluation with the pretrained SlowFast R50 (8x8) model on the Kinetics-400 dataset and am observing Top-1 and Top-5 accuracy roughly 10 percentage points below the reported benchmarks. I’m wondering if I missed a detail in the setup or evaluation, and would greatly appreciate any guidance!
Instructions To Reproduce the Issue:
- I used the model and transforms from the official PyTorchVideo example to run inference on more than 30,000 Kinetics-400 videos on GPU (device = "cuda"). I wrote a script, multi_input.py, that loads filenames from a CSV, runs inference with predict_video() (a condensed sketch of it follows the code below), and writes the predictions to a CSV file.
Here’s a simplified version of that code:
import os
import csv

from model import predict_video


def run_inference(video_dir, input_csv, output_csv):
    # 1) Read all the filenames
    with open(input_csv, newline="") as f:
        reader = csv.reader(f)
        filenames = [row[0] for row in reader]

    # 2) Run prediction and write results
    with open(output_csv, "w", newline="") as out_f:
        writer = csv.writer(out_f)
        writer.writerow(["filename", "Output"])
        for fname in filenames:
            full_path = os.path.join(video_dir, fname)
            try:
                preds = predict_video(full_path)
                writer.writerow([fname, ", ".join(preds)])
            except Exception as e:
                writer.writerow([fname, f"ERROR: {e}"])
- After running:
python multi_input.py # Inference on the ~38,000 Kinetics-400 test videos
python analysis.py # Custom script to compute top-1 and top-5 accuracy
- I observed the following:
Analyzing outputs.csv...
Total files processed: 38685
Top-1 accuracy: 65.47%
Top-5 accuracy: 84.93%
Missing files: 0
Mistakes: 5830
Mistakes saved to mistakes.csv
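For context, analysis.py computes the numbers above roughly as follows. This is a minimal sketch: labels.csv (a filename,label ground-truth file) is a hypothetical stand-in for my actual annotation source, and a video counts as a "mistake" when the true label is outside the top 5:

import csv


def evaluate(pred_csv="outputs.csv", label_csv="labels.csv"):
    # Ground truth: filename -> label. labels.csv is a hypothetical stand-in.
    with open(label_csv, newline="") as f:
        truth = {row[0]: row[1] for row in csv.reader(f)}

    total = top1 = top5 = 0
    mistakes = []
    with open(pred_csv, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the "filename,Output" header
        for fname, output in reader:
            preds = [p.strip() for p in output.split(",")]  # best first
            label = truth[fname]
            total += 1
            top1 += preds[0] == label
            if label in preds[:5]:
                top5 += 1
            else:
                mistakes.append((fname, label, output))  # top-5 miss

    print(f"Total files processed: {total}")
    print(f"Top-1 accuracy: {100 * top1 / total:.2f}%")
    print(f"Top-5 accuracy: {100 * top5 / total:.2f}%")
    with open("mistakes.csv", "w", newline="") as f:
        csv.writer(f).writerows(mistakes)
    print(f"Mistakes: {len(mistakes)}")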
Could the discrepancy be due to subtle preprocessing differences or an evaluation-protocol mismatch? One thing I suspect: the tutorial code evaluates a single center-cropped clip per video, whereas I believe the reported model-zoo numbers use multi-view testing (10 temporal clips × 3 spatial crops); a sketch of what I could switch to follows. Any tips would be very helpful. Thank you!
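A minimal sketch of the multi-clip half of that protocol, reusing model, transform, clip_duration, device, and kinetics_id_to_classname from the predict_video() sketch above. The 3-crop spatial part is omitted for brevity, and the 10-clip count and uniform spacing are my assumptions about the exact protocol:

import torch
from pytorchvideo.data.encoded_video import EncodedVideo


def predict_video_multiclip(video_path, num_clips=10, top_k=5):
    # Average softmax scores over num_clips clips whose start times are
    # spaced uniformly across the video, then take the top-k labels.
    video = EncodedVideo.from_path(video_path)
    duration = float(video.duration)
    scores = None
    for i in range(num_clips):
        start = i * max(duration - clip_duration, 0) / max(num_clips - 1, 1)
        clip = video.get_clip(start_sec=start, end_sec=start + clip_duration)
        inputs = [t.to(device)[None, ...] for t in transform(clip)["video"]]
        with torch.no_grad():
            probs = torch.softmax(model(inputs), dim=1)
        scores = probs if scores is None else scores + probs
    top = scores.topk(k=top_k).indices[0]
    return [kinetics_id_to_classname[int(i)] for i in top]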