
Lower Top 1 and Top 5 Accuracy Score Than Benchmark for SlowFast R50 Model #272

@NicolasRiley

Description


🐛 Bugs / Unexpected behaviors

Hi! I’m running evaluation with the pretrained SlowFast R50 (8x8) model on the Kinetics-400 dataset and observing roughly 10% lower Top-1 and Top-5 accuracy than the reported benchmarks. I’m wondering if I missed a detail in setup or evaluation, and would greatly appreciate any guidance!

Instructions To Reproduce the Issue:

  1. I used the model and transforms from the official PyTorchVideo example to run inference on more than 30,000 Kinetics-400 videos. I ran on GPU (device = "cuda") and wrote a script, `multi_input.py`, that loads filenames from a CSV, runs inference with `predict_video()`, and writes the predictions to a CSV file.

Here’s a simplified version of that code:

import os
import csv
from model import predict_video

def run_inference(video_dir, input_csv, output_csv):
    # 1) Read all the filenames
    with open(input_csv, newline="") as f:
        reader = csv.reader(f)
        filenames = [row[0] for row in reader]

    # 2) Run prediction and write results
    with open(output_csv, "w", newline="") as out_f:
        writer = csv.writer(out_f)
        writer.writerow(["filename", "Output"])

        for fname in filenames:
            full_path = os.path.join(video_dir, fname)
            try:
                preds = predict_video(full_path)
                writer.writerow([fname, ", ".join(preds)])
            except Exception as e:
                writer.writerow([fname, f"ERROR: {e}"])
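For context, `predict_video()` follows the official example, which packs each clip into the model's two input pathways. A minimal numpy sketch of that frame selection (`pack_pathway` here is a hypothetical stand-in for the example's PackPathway transform, with alpha = 4 as in the example):

```python
import numpy as np

def pack_pathway(frames: np.ndarray, alpha: int = 4):
    """Split a clip of shape (C, T, H, W) into SlowFast's two pathways.

    The fast pathway keeps all T frames; the slow pathway keeps
    T // alpha evenly spaced frames, mirroring PackPathway.
    """
    fast = frames
    t = frames.shape[1]
    # Evenly spaced temporal indices, T // alpha of them
    idx = np.linspace(0, t - 1, t // alpha).astype(int)
    slow = frames[:, idx]
    return [slow, fast]

clip = np.zeros((3, 32, 256, 256), dtype=np.float32)  # dummy clip: C=3, T=32
slow, fast = pack_pathway(clip)
print(slow.shape, fast.shape)  # (3, 8, 256, 256) (3, 32, 256, 256)
```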
  2. After running:

python multi_input.py # Inference on 30,000 Kinetics-400 test videos
python analysis.py # Custom script to compute top-1 and top-5 accuracy

  3. I observed the following:
Analyzing outputs.csv...

Total files processed: 38685
Top-1 accuracy: 65.47%
Top-5 accuracy: 84.93%
Missing files: 0
Mistakes: 5830

Mistakes saved to mistakes.csv
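For reference, `analysis.py` computes accuracy roughly like this (a simplified sketch; the CSV parsing is omitted and `topk_accuracy` is a hypothetical helper):

```python
def topk_accuracy(predictions, labels, k=5):
    """Fraction of samples whose true label appears in the top-k predictions.

    predictions: one ranked list of labels per sample (best first);
    labels: the ground-truth label per sample.
    """
    hits = sum(1 for preds, true in zip(predictions, labels) if true in preds[:k])
    return hits / len(labels)

# Toy example with made-up labels
preds = [["surfing", "swimming"], ["running", "jogging"], ["cooking", "baking"]]
truth = ["surfing", "jogging", "frying"]
print(topk_accuracy(preds, truth, k=1))  # 0.3333333333333333 (one top-1 hit of three)
print(topk_accuracy(preds, truth, k=2))  # 0.6666666666666666
```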

Could the discrepancy be due to subtle preprocessing differences or evaluation protocol mismatches? Any tips would be very helpful. Thank you!
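One protocol difference I suspect: benchmark numbers for SlowFast are typically reported with multi-view testing (e.g. 10 temporal clips × 3 spatial crops = 30 views per video, with per-view scores averaged before top-k), while the example's `predict_video()` scores a single clip. A minimal numpy sketch of that view averaging (the scores here are random placeholders, not model outputs):

```python
import numpy as np

def average_views(view_scores: np.ndarray) -> np.ndarray:
    """Average per-view class scores of shape (n_views, n_classes) into one prediction.

    Multi-view evaluation scores each view separately and averages
    before taking the argmax / top-k.
    """
    return view_scores.mean(axis=0)

rng = np.random.default_rng(0)
scores = rng.random((30, 400))        # 30 views x 400 Kinetics classes (dummy)
avg = average_views(scores)
top5 = np.argsort(avg)[::-1][:5]      # indices of the 5 highest averaged scores
print(avg.shape, top5.shape)          # (400,) (5,)
```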
