
WSRL on robot

This repository contains the code for running WSRL on the Franka peg insertion task. For the official (non-robot) WSRL code, please refer to the official WSRL repo. This project is built on top of HIL-SERL; see their README below for more details.

Installation

See Installation and follow the steps to set up your environment.

File Structure

The SERL, CalQL, and WSRL training files are in examples/. examples/experiments/peg_insertion/config.py contains our reset poses and training hyperparameters. See their code structure for more details.
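
As a rough illustration of what lives there, a config of this kind typically bundles the reset pose and core training hyperparameters in one place. The field names and values below are placeholders, not the actual contents of config.py:

    # Illustrative sketch only; see examples/experiments/peg_insertion/config.py
    # for the actual field names and values. Everything below is a placeholder.
    import numpy as np

    RESET_POSE = np.array([0.5, 0.0, 0.2, np.pi, 0.0, 0.0])  # x, y, z, roll, pitch, yaw
    TRAINING_CONFIG = {
        "batch_size": 256,   # gradient-update batch size
        "discount": 0.99,    # RL discount factor
        "random_steps": 0,   # env steps with a random policy before training
    }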

Running WSRL on Franka Peg Insertion

  1. Setup

    cd examples
    

    Train the reward classifier and collect 20 expert demos. For the reward classifier, we define anything from a half-insert to a full insert as a success, and collect many near-inserts as failures for robustness.

    python train_reward_classifier.py --exp_name peg_insertion
    
    python record_demos.py --exp_name peg_insertion --successes_needed 20
    

    Follow the franka_walkthrough RAM insertion steps for more info. Note that our exp_name is peg_insertion and the corresponding files are in examples/experiments/peg_insertion.
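
    Conceptually, the trained classifier turns a camera observation into a sparse 0/1 reward. A minimal sketch of that idea, with hypothetical names (this is not the repo's actual API):

    # Illustrative sketch of how a trained classifier becomes a sparse reward.
    # `classifier` maps an image observation to a success probability; the
    # threshold and all names here are assumptions for illustration.
    def compute_reward(classifier, obs, threshold=0.5):
        # Success is anything from a half-insert to a full insert, per the
        # labeling scheme used when training the classifier.
        success_prob = classifier(obs["image"])
        return float(success_prob > threshold)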

  2. Collecting Offline Data

    Use SERL or HIL-SERL to collect a dataset of robot transitions. We find that giving ~10 interventions near the start of training works best for a dataset of ~20k transitions.

    Make sure to include your expert demo path in experiments/peg_insertion/run_actor.sh first.

    sh experiments/peg_insertion/run_actor.sh --checkpoint_path [logs/offline_data_path] --description [hil_serl_data_collection] --use_resnet_mlp
    
    sh experiments/peg_insertion/run_learner.sh --checkpoint_path [logs/offline_data_path] --description [hil_serl_data_collection] --use_resnet_mlp
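
    For reference, each transition recorded into the offline buffer is conceptually a dictionary like the one below. This is a sketch; the field names are assumptions and may differ from the repo's actual replay-buffer keys:

    # Sketch of one transition as stored in the offline buffer; the keys are
    # illustrative, not the repo's actual replay-buffer format.
    import numpy as np

    def make_transition(obs, action, next_obs, reward, done):
        return {
            "observations": obs,              # camera images + proprioception
            "actions": action,                # e.g. 6-DoF delta-pose command
            "next_observations": next_obs,
            "rewards": np.float32(reward),    # 0/1 from the reward classifier
            "masks": np.float32(1.0 - done),  # 0 at episode termination
        }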
    
  3. Training Offline CalQL

    Run this script to train offline CalQL and save checkpoints to calql_checkpoint_path, using your offline dataset from offline_data_path. We found that training for ~200k steps converges and achieves a 13/20 success rate on our evals.

    bash experiments/peg_insertion/run_calql_pretrain.sh --calql_checkpoint_path [logs/calql_checkpoint_path] --data_path [logs/offline_data_path/buffer] --use_resnet_mlp
    

    Use the --eval_n_trajs flag to evaluate your CalQL checkpoint.

    bash experiments/peg_insertion/run_calql_pretrain.sh --calql_checkpoint_path [logs/calql_checkpoint_path] --eval_n_trajs 20 --use_resnet_mlp
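
    At a high level, CalQL is CQL with a calibration term: the conservative penalty on out-of-distribution actions is clipped from below at the Monte-Carlo return of the data, so the pretrained Q-values stay well-scaled for online fine-tuning. A simplified sketch of that regularizer, with illustrative names (the actual implementation differs in detail, e.g. it uses a logsumexp over sampled actions):

    # Simplified sketch of the CalQL conservative regularizer; names and the
    # alpha weight are placeholders, not the repo's exact code.
    import jax.numpy as jnp

    def calql_regularizer(q_ood, q_data, mc_return, alpha=5.0):
        # q_ood:      critic values at sampled (out-of-distribution) actions
        # q_data:     critic values at dataset actions
        # mc_return:  Monte-Carlo return-to-go of the behavior policy
        # Calibration: never push OOD Q-values below the data's return.
        q_ood_calibrated = jnp.maximum(q_ood, mc_return)
        # CQL pushes down on (calibrated) OOD actions, up on dataset actions.
        return alpha * (q_ood_calibrated.mean() - q_data.mean())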
    
  4. Running WSRL online

    Initialize WSRL with your pretrained CalQL checkpoint, and run the training script.

    sh experiments/peg_insertion/run_wsrl_actor.sh --save_path [logs/wsrl_save_path] --description [wsrl_run] --use_resnet_mlp
    
    sh experiments/peg_insertion/run_wsrl_learner.sh --save_path [logs/wsrl_save_path] --description [wsrl_run] --use_resnet_mlp
    

    Evaluate your WSRL checkpoint:

    sh experiments/peg_insertion/run_wsrl_eval.sh --use_resnet_mlp
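
    Conceptually, WSRL initializes the agent from the CalQL checkpoint, discards the offline dataset, and starts online updates only after a short warmup in which the pretrained policy fills a fresh replay buffer. A minimal sketch of that loop; the agent/env/buffer interfaces are assumptions for illustration, not the repo's actual API:

    # Illustrative WSRL warm-start loop; agent/env/buffer APIs are made up.
    def wsrl_finetune(agent, env, buffer, total_steps, warmup_steps, batch_size=256):
        obs, info = env.reset()
        for step in range(total_steps):
            action = agent.sample_action(obs)
            next_obs, reward, terminated, truncated, info = env.step(action)
            buffer.add(obs, action, reward, next_obs, terminated)
            obs = next_obs
            if terminated or truncated:
                obs, info = env.reset()
            # Warmup: the pretrained policy fills the fresh buffer before any
            # gradient step, since no offline data is retained online.
            if step >= warmup_steps:
                agent = agent.update(buffer.sample(batch_size))
        return agent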
    

HIL-SERL: Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning

License: Apache 2.0

Webpage: https://hil-serl.github.io/

HIL-SERL provides a set of libraries, env wrappers, and examples to train RL policies using a combination of demonstrations and human corrections to perform robotic manipulation tasks with near-perfect success rates. The following sections describe how to use HIL-SERL. We will illustrate the usage with examples.

🎬: HIL-SERL video

Installation

  1. Setup Conda Environment: create an environment with

    conda create -n hilserl python=3.10
  2. Install Jax as follows:

    • For CPU (not recommended):

      pip install --upgrade "jax[cpu]"
    • For GPU:

      pip install --upgrade "jax[cuda12_pip]==0.4.35" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
    • For TPU:

      pip install --upgrade "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
    • See the Jax Github page for more details on installing Jax.

  3. Install the serl_launcher

    cd serl_launcher
    pip install -e .
    pip install -r requirements.txt
  4. Install serl_robot_infra: follow the README in serl_robot_infra for installation and basic robot operation instructions. This includes instructions for installing the impedance-based serl_franka_controllers. After installation, you should be able to run the robot server and interact with the franka_env gym environment (hardware).

Overview and Code Structure

HIL-SERL provides a set of common libraries for users to train RL policies for robotic manipulation tasks. The main structure of running the RL experiments involves having an actor node and a learner node, both of which interact with the robot gym environment. Both nodes run asynchronously, with data being sent from the actor to the learner node via the network using agentlace. The learner will periodically synchronize the policy with the actor. This design provides flexibility for parallel training and inference.
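
A rough sketch of this actor/learner split is below. The networking that agentlace handles in the real code is reduced to two hypothetical queue-like endpoints, and all names are illustrative:

    # Illustrative actor/learner split; agentlace handles the real transport.
    def actor_loop(env, policy, send_transition, recv_params):
        obs, info = env.reset()
        while True:
            action = policy.sample_action(obs)
            next_obs, reward, terminated, truncated, info = env.step(action)
            send_transition((obs, action, reward, next_obs, terminated))
            obs = next_obs if not (terminated or truncated) else env.reset()[0]
            # Periodically pull the latest learner weights into the actor.
            new_params = recv_params()
            if new_params is not None:
                policy = policy.replace(params=new_params)

    def learner_loop(agent, buffer, recv_transition, send_params, batch_size=256):
        while True:
            buffer.add(*recv_transition())   # data streamed from the actor
            agent = agent.update(buffer.sample(batch_size))
            send_params(agent.params)        # broadcast updated weights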

Table for code structure

    Code Directory                   Description
    examples                         Scripts for policy training, demonstration data collection, reward classifier training
    serl_launcher                    Main code for HIL-SERL
    serl_launcher.agents             Agent policies (e.g. SAC, BC)
    serl_launcher.wrappers           Gym env wrappers
    serl_launcher.data               Replay buffer and data store
    serl_launcher.vision             Vision-related models and utils
    serl_robot_infra                 Robot infra for running with real robots
    serl_robot_infra.robot_servers   Flask server for sending commands to the robot via ROS
    serl_robot_infra.franka_env      Gym env for the Franka robot

Run with Franka Arm

We provide a step-by-step guide to run RL policies with HIL-SERL on a Franka robot.

Check out the Run with Franka Arm guide.

Citation

If you use this code for your research, please cite our paper:

@misc{luo2024hilserl,
      title={Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning},
      author={Jianlan Luo and Charles Xu and Jeffrey Wu and Sergey Levine},
      year={2024},
      eprint={2410.21845},
      archivePrefix={arXiv},
      primaryClass={cs.RO}
}
