A tabular benchmark for hyperparameter optimization in reinforcement learning across Atari, MuJoCo, and classic control tasks.
All benchmark data is hosted on Hugging Face Datasets:
👉 https://huggingface.co/datasets/gresashala/HPO-RL-Bench-data
The dataset repo contains the `data_hpo_rl_bench/` folder used by this codebase.
```bash
# Install Git LFS once (if not already)
git lfs install

# Clone the dataset repo
git clone https://huggingface.co/datasets/gresashala/HPO-RL-Bench-data

# (Optional) Pull/checkout LFS objects explicitly if you ever see pointer files
cd HPO-RL-Bench-data
git lfs pull
git lfs checkout
cd ..
```

Now keep `HPO-RL-Bench-data/` next to this code repository and pass `data_path` to the handler (see below).
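A quick way to confirm the clone landed where the code expects it (a minimal sketch; the relative path assumes the two repos are siblings):

```python
from pathlib import Path

# Expect the dataset repo to sit next to this code repo.
data_root = Path("../HPO-RL-Bench-data/data_hpo_rl_bench")
assert data_root.is_dir(), f"benchmark data not found at {data_root.resolve()}"
print("search spaces found:", sorted(p.name for p in data_root.iterdir() if p.is_dir()))
```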
Directory layout (excerpt):
```
HPO-RL-Bench/                    # this code repo
HPO-RL-Bench-data/               # cloned from HF
└─ data_hpo_rl_bench/
   ├─ PPO/
   │  ├─ Pong-v0_0/              # seed shards by range
   │  ├─ Pong-v0_1/
   │  └─ Pong-v0_2/
   ├─ DQN/
   │  └─ ...
   └─ SAC/
      ├─ Hopper-v2_1/
      │  ├─ shard-0000/          # optional inner sharding to keep <10k files/dir
      │  └─ shard-0001/
      └─ ...
```
Each environment directory suffix encodes a seed range:

- `ENVIRONMENT_0/` → seeds 0–2
- `ENVIRONMENT_1/` → seeds 3–5
- `ENVIRONMENT_2/` → seeds 6–9

Some heavy directories (e.g., SAC Hopper/Humanoid) also contain `shard-0000/`, `shard-0001/`, … subfolders.
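For scripting, the shard suffix for a given seed follows directly from these ranges. Below is a minimal sketch (the helper name `seed_shard_dir` is ours, not part of the codebase):

```python
def seed_shard_dir(environment: str, seed: int) -> str:
    """Map a seed to its shard directory per the ranges above
    (0-2 -> _0, 3-5 -> _1, 6-9 -> _2)."""
    if 0 <= seed <= 2:
        suffix = 0
    elif 3 <= seed <= 5:
        suffix = 1
    elif 6 <= seed <= 9:
        suffix = 2
    else:
        raise ValueError(f"seed {seed} is outside the benchmark's 0-9 range")
    return f"{environment}_{suffix}"

print(seed_shard_dir("Hopper-v2", 7))  # Hopper-v2_2
```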
If you clone without LFS support and see small text files beginning with `version https://git-lfs.github.com/spec/v1`, those are LFS pointer files rather than the actual data; fetch the real contents with:

```bash
git lfs pull
git lfs checkout
```
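To check that no pointer files remain, a quick scan like this works (a sketch, not part of the benchmark code):

```python
from pathlib import Path

# Git LFS pointers start with this magic line instead of real JSON content.
POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"

for f in Path("HPO-RL-Bench-data").rglob("*.json"):
    with open(f, "rb") as fh:
        if fh.read(len(POINTER_PREFIX)) == POINTER_PREFIX:
            print("still an LFS pointer:", f)
```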
If you only need a specific file:
```python
from huggingface_hub import hf_hub_download

p = hf_hub_download(
    repo_id="gresashala/HPO-RL-Bench-data",
    repo_type="dataset",
    filename="data_hpo_rl_bench/PPO/Pong-v0_0/Pong-v0_PPO_random_lr_-2_gamma_1.0_clip_0.2_seed4_eval.json",
)
print(p)  # local path to the downloaded file
```
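If you want a whole shard rather than a single file, `snapshot_download` with `allow_patterns` (also in `huggingface_hub`) is an option; the pattern below is illustrative:

```python
from huggingface_hub import snapshot_download

# Fetch only the PPO/Pong-v0_0 shard of the dataset repo.
local_dir = snapshot_download(
    repo_id="gresashala/HPO-RL-Bench-data",
    repo_type="dataset",
    allow_patterns=["data_hpo_rl_bench/PPO/Pong-v0_0/*"],
)
print(local_dir)  # root of the partial local snapshot
```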
To set up the code environment:

```bash
conda create -n hpo_rl_bench python=3.9 -y
conda activate hpo_rl_bench
conda install swig -y
pip install -r requirements.txt
```

Windows note (for pyrfr): Microsoft Visual C++ 14.0+ is required. Install it via Microsoft C++ Build Tools.
If HPO-RL-Bench-data/ sits next to this code repo, point data_path to the folder that contains data_hpo_rl_bench/ (e.g., "../HPO-RL-Bench-data").
```python
from benchmark_handler import BenchmarkHandler

# Example: static PPO on Atari Enduro, seed 0
benchmark = BenchmarkHandler(
    data_path="../HPO-RL-Bench-data",  # parent of `data_hpo_rl_bench/` (adjust if needed)
    environment="Enduro-v0",
    seed=0,
    search_space="PPO",
    set="static",
)

# Query a static configuration
configuration_to_query = {"lr": -6, "gamma": 0.8, "clip": 0.2}
queried = benchmark.get_metrics(configuration_to_query, budget=50)

# Query a dynamic configuration (multi-lr/gamma)
benchmark.set = "dynamic"
configuration_to_query = {"lr": [-3, -4], "gamma": [0.98, 0.99], "clip": [0.2, 0.2]}
queried_dyn = benchmark.get_metrics(configuration_to_query, budget=50)
```

See benchmark-usages-examples.ipynb for more examples, including extended sets and Bayesian optimization loops.
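For a taste of how the handler plugs into an HPO loop, here is a random-search sketch; the candidate grids and the scoring of `get_metrics` results are assumptions, so adapt them to the grids and result schema the benchmark actually exposes:

```python
import random

from benchmark_handler import BenchmarkHandler

benchmark = BenchmarkHandler(
    data_path="../HPO-RL-Bench-data",
    environment="Enduro-v0",
    seed=0,
    search_space="PPO",
    set="static",
)

# Candidate values (illustrative; use the grids the benchmark actually covers).
grid = {"lr": [-6, -5, -4, -3, -2],
        "gamma": [0.8, 0.9, 0.95, 0.98, 0.99],
        "clip": [0.1, 0.2, 0.3]}

def score_of(result):
    # ASSUMPTION: `result` exposes a sequence of evaluation returns;
    # adapt the key/indexing to the schema get_metrics actually returns.
    returns = result["eval_avg_returns"] if isinstance(result, dict) else result
    tail = returns[-5:]
    return sum(tail) / len(tail)

best_config, best_score = None, float("-inf")
for _ in range(20):
    config = {k: random.choice(v) for k, v in grid.items()}
    score = score_of(benchmark.get_metrics(config, budget=50))
    if score > best_score:
        best_config, best_score = config, score

print("best:", best_config, best_score)
```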
Scripts to reproduce the paper's figures:

- Figure 4: `plot_static_ppo.py`
- Figure 5a: `plot_dynamic.py`
- Figure 5b: `plot_extended.py`
- Figure 6a: `cd_diagram.py`
- Figure 6b: in `cd_diagram.py`, set `ALGORITHM = "A2C"` (line ~384) and run it
- Figures 2 & 3: `benchmark_EDA.py`
If you use HPO-RL-Bench in your work, please cite:
```bibtex
@inproceedings{shala2024hporlbench,
  title     = {{HPO-RL-Bench}: A Zero-Cost Benchmark for Hyperparameter Optimization in Reinforcement Learning},
  author    = {Gresa Shala and Sebastian Pineda Arango and Andr{\'e} Biedenkapp and Frank Hutter and Josif Grabocka},
  booktitle = {Proceedings of the AutoML Conference 2024 (ABCD Track)},
  year      = {2024},
  url       = {https://openreview.net/forum?id=MlB61zPAeR}
}
```