This directory contains scripts to run reinforcement learning experiments in parallel on SLURM.
1. Create virtual environment and install packages:

   ```bash
   cd rl_final_proj
   ./setup_venv.sh
   ```

   This creates a `venv/` directory with all required packages using uv.
2. Before submitting all jobs, test with one experiment. Activate the virtual environment and run:

   ```bash
   cd rl_final_proj
   source venv/bin/activate
   python train_rl.py --task "Hopper-v5" --algorithm "SAC" --seed 0
   ```

   This will:

   - Train one agent (Hopper-v5 with SAC, seed 0)
   - Save results to `rl_experiments/runs/` and `rl_experiments/models/`
   - Append to `rl_experiments/final_eval_returns.csv`
   For a quick test with fewer timesteps, you can temporarily modify `config.yaml`:

   ```yaml
   timesteps_per_task:
     "Hopper-v5": 10000  # Reduced from 1000000 for a quick test
   ```
3. Configure experiments (optional): Edit `config.yaml` to customize tasks, algorithms, seeds, and other settings. You can also adjust `slurm.max_concurrent_jobs` to control parallelism.
4. Generate the job list:

   ```bash
   python generate_joblist.py
   ```

   This creates `joblist.txt` with all combinations of tasks, algorithms, and seeds from `config.yaml`.
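Conceptually, the job list is just the Cartesian product of tasks, algorithms, and seeds. A minimal sketch of that idea (the function name and line format here are illustrative, not the repo's actual implementation):

```python
from itertools import product

def make_joblist(tasks, algorithms, seeds):
    # One line per (task, algorithm, seed) combination -- the same
    # "one job per combination" layout joblist.txt uses.
    return [f"{task} {algo} {seed}"
            for task, algo, seed in product(tasks, algorithms, seeds)]
```

With the default 5 tasks, 4 algorithms, and 5 seeds, this yields 100 lines.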
5. Submit the SLURM job array (EASY WAY):

   ```bash
   chmod +x prepare_submit.sh
   ./prepare_submit.sh
   ```

   This automatically calculates the array size and submits with the correct parallelism settings.

   OR manually:

   ```bash
   # Check number of jobs
   wc -l joblist.txt
   # Edit submit_rl.sbatch to update --array=1-N%M, where N = total jobs and M = concurrent jobs
   sbatch submit_rl.sbatch
   ```
YES, this runs in parallel! The SLURM job array (--array=1-N%M) means:
- N = total number of jobs (one per task/algorithm/seed combination)
- M = maximum concurrent jobs (default: 20)
- Jobs run independently and in parallel
For example, with 100 jobs and %20:
- 20 jobs start immediately
- As jobs finish, new ones start automatically
- All 100 jobs will complete (just 20 at a time)
Adjust max_concurrent_jobs in config.yaml to change parallelism.
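The array directive itself is straightforward to construct; a tiny illustrative helper (not part of the repo scripts) that builds it from N jobs and M concurrent:

```python
def array_spec(n_jobs, max_concurrent=20):
    # Build the SLURM --array directive: indices 1..n_jobs, with at most
    # max_concurrent jobs running at once. Illustrative helper only.
    return f"--array=1-{n_jobs}%{max_concurrent}"

print(array_spec(100, 20))  # --array=1-100%20
```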
All experiment settings are in config.yaml:
- Tasks: List of environments to run
- Algorithms: List of RL algorithms
- Seeds: List of random seeds
- Timesteps: Per-task training timesteps
- Directories: Output paths
- Algorithm settings: Algorithm-specific parameters
Default configuration:
- Tasks: Hopper-v5, Walker2d-v5, HalfCheetah-v5, Ant-v5, Humanoid-v5
- Algorithms: SAC, TD3, DDPG, PPO
- Seeds: 0-4 (5 seeds)
- Total jobs: 5 tasks × 4 algorithms × 5 seeds = 100 jobs
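Putting the defaults above together, `config.yaml` plausibly looks something like the following sketch (key names other than `timesteps_per_task` and `slurm.max_concurrent_jobs` are assumptions; check the actual file):

```yaml
tasks: ["Hopper-v5", "Walker2d-v5", "HalfCheetah-v5", "Ant-v5", "Humanoid-v5"]
algorithms: ["SAC", "TD3", "DDPG", "PPO"]
seeds: [0, 1, 2, 3, 4]
timesteps_per_task:
  "Hopper-v5": 1000000
slurm:
  max_concurrent_jobs: 20
```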
All results are saved in rl_experiments/:
- `runs/`: Individual run JSON files (`{task}_{algo}_seed{seed}.json`)
- `models/`: Trained model files (`{task}_{algo}_seed{seed}.zip`)
- `final_eval_returns.csv`: Final evaluation results
- `learning_curves.csv`: Learning curves recorded during training
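Once jobs finish, `final_eval_returns.csv` can be aggregated across seeds; a sketch assuming columns named `task`, `algorithm`, and `final_return` (the actual header may differ):

```python
import pandas as pd

def summarize_returns(df):
    # Mean and std of the final evaluation return per (task, algorithm)
    # pair, aggregated across seeds. Column names are assumptions about
    # the CSV layout, not taken from the repo.
    return df.groupby(["task", "algorithm"])["final_return"].agg(["mean", "std"])
```

For example: `summarize_returns(pd.read_csv("rl_experiments/final_eval_returns.csv"))`.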
- Time limit: 48 hours per job
- Memory: 8GB per job
- Concurrent jobs: 20 (adjust `%20` in submit_rl.sbatch if needed)
- Logs: `logs/rl_{JOB_ID}_{ARRAY_ID}.out` and `.err`
- Jobs will skip if results already exist (cached runs)
- Each job runs one (task, algorithm, seed) combination
- Results are automatically appended to CSV files
- Models are saved after training completes
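The caching check is presumably just a file-existence test on the per-run output; an illustrative sketch (function name and exact logic are assumptions) matching the `runs/` filename pattern above:

```python
from pathlib import Path

def already_done(task, algo, seed, runs_dir="rl_experiments/runs"):
    # A run counts as cached if its JSON result file already exists,
    # following the {task}_{algo}_seed{seed}.json naming shown above.
    return (Path(runs_dir) / f"{task}_{algo}_seed{seed}.json").exists()
```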