Dear authors,
thank you for your excellent work on long-context vision LMMs. I tried running evaluation with lmms-eval, set max_num_frames=64, and got a CUDA OOM error. The command is as follows:
python3 -m accelerate.commands.launch --num_processes=8 -m lmms_eval --model idefics2 --tasks longvideobench_val_i --batch_size 1 --log_samples --log_samples_suffix idefics2_lvb_i --output_path ./logs/
It seems that inference runs in data parallel across 8 GPUs. In the paper, you write that you use 2 GPUs for inference. Which command and script did you use?
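For reference, my guess at a 2-GPU invocation would be the one below, where --num_processes=2 limits accelerate to two data-parallel workers and CUDA_VISIBLE_DEVICES restricts the visible devices. The model and task flags are just the ones from my own command above, not necessarily yours, and I'm not sure plain data parallelism is what you mean, since each process would still hold a full model copy:

```shell
# Guessed 2-GPU run; flags copied from my command above, not from the paper.
# CUDA_VISIBLE_DEVICES pins the run to GPUs 0 and 1;
# --num_processes=2 spawns one data-parallel worker per GPU.
CUDA_VISIBLE_DEVICES=0,1 python3 -m accelerate.commands.launch --num_processes=2 \
    -m lmms_eval --model idefics2 --tasks longvideobench_val_i \
    --batch_size 1 --log_samples --log_samples_suffix idefics2_lvb_i \
    --output_path ./logs/
```

Or do you instead shard one model across the 2 GPUs (e.g. via device_map) rather than running data parallel? That would explain how 64 frames fit in memory.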
Thank you!