Source Code for ECCV 2024 Paper "Multimodal Label Relevance Ranking via Reinforcement Learning"

ChazzyGordon/LR2PPO

Multimodal Label Relevance Ranking via Reinforcement Learning (ECCV2024)

This is the official PyTorch implementation of LR2PPO. The ECCV 2024 paper is available on arXiv.
Introduction video: YouTube

Getting Started

Data Preparation

For LRMovieNet Benchmark

For MSLR-Web10K → MQ2008 Transfer Task

Initialization Weights

Download required weights for both benchmarks:

  • Weights: roberta_base_en_model and vit_base_patch16_224_model
  • Source: Google Drive or the models' official repositories
  • Save in: ./pretrained_models/
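
Before training, it can help to confirm that both weights are in place. The following is a minimal sketch, not part of the repository: it assumes each weight appears as an entry (file or directory) with exactly the name listed above, directly under ./pretrained_models/. The helper name missing_weights is hypothetical.

```python
from pathlib import Path

# Names taken from the list above. That each one appears as an entry with
# exactly this name directly under the weights directory is an assumption.
REQUIRED_WEIGHTS = ("roberta_base_en_model", "vit_base_patch16_224_model")


def missing_weights(root="./pretrained_models"):
    """Return the required weight names that are not found under `root`."""
    root_path = Path(root)
    return [name for name in REQUIRED_WEIGHTS if not (root_path / name).exists()]


if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing under ./pretrained_models/:", ", ".join(missing))
    else:
        print("All required weights found.")
```

If the downloaded weights use different names or a nested layout, adjust REQUIRED_WEIGHTS accordingly.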

Prerequisites

pip3 install -r requirements.txt

Hardware Requirement: 4 GPUs
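
A quick way to check the machine against this requirement is to ask PyTorch how many CUDA devices it can see. This sketch treats 4 as a minimum, which is an assumption (the scripts may expect exactly 4). The function names are hypothetical and not part of the repository.

```python
REQUIRED_GPUS = 4  # from the hardware requirement above


def visible_gpu_count():
    """Number of CUDA devices PyTorch can see (0 if CUDA is unavailable)."""
    import torch  # imported lazily so gpu_shortfall works without PyTorch

    return torch.cuda.device_count()


def gpu_shortfall(available, required=REQUIRED_GPUS):
    """How many more GPUs are needed to reach `required` (0 if enough)."""
    return max(0, required - available)


if __name__ == "__main__":
    short = gpu_shortfall(visible_gpu_count())
    if short == 0:
        print("GPU requirement met.")
    else:
        print(f"{short} more GPU(s) needed.")
```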

Usage Instructions

For LRMovieNet Benchmark

# Stage 1: Base Model
sh pointwise.sh <your_stage1>

# Stage 2: Reward Model
sh reward_pair_dataloader.sh <your_stage2>

# Stage 3: LR2PPO
sh ppo.sh <your_stage3>

# Evaluation
sh ppo_eval.sh <your_eval>

For MSLR-Web10K → MQ2008 Transfer Task

# Stage 1: Base Model
sh pointwise_trad.sh <your_stage1>

# Stage 2: Reward Model
sh reward_trad.sh <your_stage2>

# Stage 3: LR2PPO
sh ppo_trad.sh <your_stage3>

# Evaluation
sh ppo_eval_trad.sh <your_eval>
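
The per-stage commands for either benchmark can also be chained from one small driver. The sketch below is illustrative only and rests on several assumptions: the stages are meant to run in the order listed, each script takes a single argument (the placeholder shown in angle brackets above, whose meaning is not specified here and is passed through unchanged), and the benchmark keys lrmovienet and mslr_to_mq2008 are hypothetical labels.

```python
import subprocess

# Script names are copied from the usage instructions above. The argument
# each script takes (<your_stage1>, ...) is not specified there, so it is
# passed through unchanged as a caller-supplied string.
PIPELINES = {
    "lrmovienet": [
        "pointwise.sh",
        "reward_pair_dataloader.sh",
        "ppo.sh",
        "ppo_eval.sh",
    ],
    "mslr_to_mq2008": [
        "pointwise_trad.sh",
        "reward_trad.sh",
        "ppo_trad.sh",
        "ppo_eval_trad.sh",
    ],
}


def build_commands(benchmark, args):
    """Pair each stage script with its argument, in execution order."""
    scripts = PIPELINES[benchmark]
    if len(args) != len(scripts):
        raise ValueError(f"expected {len(scripts)} arguments, got {len(args)}")
    return [["sh", script, arg] for script, arg in zip(scripts, args)]


def run_pipeline(benchmark, args, dry_run=True):
    """Print each stage command and, unless dry_run, execute it in order."""
    for cmd in build_commands(benchmark, args):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError, stopping at the first failure
            subprocess.run(cmd, check=True)
```

Calling run_pipeline("lrmovienet", [arg1, arg2, arg3, arg_eval]) with the default dry_run=True only prints the four commands, so the sequence can be inspected before anything is launched.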

Model Checkpoints

LRMovieNet Benchmark

MSLR-Web10K → MQ2008 Transfer

License

See LICENSE for details.

Acknowledgments

Code components borrowed from:

We are grateful for these excellent works and repositories.

Citation

If you found our work helpful in your research, please consider citing it.

@inproceedings{guo2024multimodal,
  title={Multimodal Label Relevance Ranking via Reinforcement Learning},
  author={Guo, Taian and Zhang, Taolin and Wu, Haoqian and Li, Hanjun and Qiao, Ruizhi and Sun, Xing},
  booktitle={European Conference on Computer Vision},
  pages={391--408},
  year={2024},
  organization={Springer}
}
