Skip to content

datake/SinkhornDistRL

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Implementation of 'Distributional Reinforcement Learning with Regularized Wasserstein Loss' (NeurIPS 2024)

This repository contains the Pytorch implementation of our Sinkhorn Distributional RL paper "Distributional Reinforcement Learning with Regularized Wasserstein Loss". We include all distributional RL algorithms considered in our paper, including DQN, C51, QR-DQN, MMDDRL, and SinkhornDRL(ours).

Architecture Overview

Run the Code

Step 1: Environment Setup: OpenAI Baselines, requirement.txt, and Atari Games

Install OpenAI Baselines, and then install dependency in requirement.txt.

git clone https://github.com/openai/baselines.git
cd baselines
pip install -e .

Step 2: Run Distributional RL algorithms on 55 Atari Games

We take the Breakout environment as an example. Run the following code respectively and then plot the learning curves. (Note: --multi 0 by default)

python main.py --game breakout --method DQN --iter 10000000 --seed 1 --gpu 1
python main.py --game breakout --method C51 --iter 10000000 --seed 1 --gpu 1
python main.py --game breakout --method QRDQN --iter 10000000 --seed 1 --gpu 1
python main.py --game breakout --method MMD --iter 10000000 --seed 1 --gpu 1
python main.py --game breakout --method Sinkhorn --iter 10000000 --niter_sink 10 --epsilon 10 --samples 200 --seed 1 --gpu 1

Step 3: Run Distributional RL algorithms on five Atari Games in the Multi-dimensional Setting

python main.py --game Asteroids --method MMD --iter 10000000 --multi 1 --seed 1 --gpu 0
python main.py --game Asteroids --method Sinkhorn --iter 10000000 --multi 1 --seed 1 --gpu 1

We consider five Atari games with multi-dimensional rewards: AirRaid, Asteroids, Gopher, MsPacman, UpNDown, and Pong. For the reward decomposition method, please refer to the file reward-compose.

Acknowledgement

This implementation is adapted from ShangtongZhang's Modularized Implementation of Deep RL Algorithms. Our implementation in the multi-dimensional setting is adapted from the original implementation of the paper Distributional Reinforcement Learning for Multi-Dimensional Reward Functions (NeurIPS 2021).

Contact

Please contact ksun6@ualberta.ca if you have any questions.

Reference

Please cite our paper if you use our implementation in your research:

@inproceedings{sun2024distributional,
  title={Distributional Reinforcement Learning with Regularized Wasserstein Loss},
  author={Sun, Ke and Zhao, Yingnan and Liu, Wulong and Jiang, Bei and Kong, Linglong},
  booktitle={Advances in Neural Information Processing Systems},
  year={2024}
}

About

Implementation of 'Distributional Reinforcement Learning with Regularized Wasserstein Loss' (NeurIPS 2024)

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages