Imitation learning is supervised learning in which the training data consists of expert demonstrations. The expert can be a human or another agent. The input data is referred to as the "state" and the output data as the "action." In discrete action spaces the problem resembles classification; in continuous action spaces it is regression.
The learned mapping from states to actions is called the policy.
Behavioral Cloning (BC) is offline imitation learning: it uses only the collected demonstrations and does not require a simulator during learning.
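As a rough illustration of what BC looks like in code, the sketch below fits a small policy network to state-action pairs with a supervised loss. It is not taken from the notebooks in this repo: the framework (PyTorch), the network size, and the synthetic data are assumptions for illustration only. Continuous actions use a regression loss; discrete actions would use a classification loss instead.

```python
# Minimal behavioral-cloning sketch (illustrative; the notebooks in this repo
# may use different architectures, losses, and hyperparameters).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder demonstration data: states (N, state_dim), actions (N, action_dim).
# Dimensions loosely follow Pendulum-v1 (3-dim state, 1-dim continuous action).
states = torch.randn(1000, 3)
actions = torch.randn(1000, 1)

# The policy: a simple MLP mapping states to actions.
policy = nn.Sequential(
    nn.Linear(3, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)

# Continuous actions -> regression (MSELoss); discrete actions -> classification (CrossEntropyLoss).
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loader = DataLoader(TensorDataset(states, actions), batch_size=64, shuffle=True)

for epoch in range(10):
    for s, a in loader:
        loss = loss_fn(policy(s), a)  # imitate the expert action for each state
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```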
- This tutorial is for educational purposes, so the code is not optimized for production but is easy to understand.
- Each policy is trained in a single Jupyter notebook.
- Each directory contains a readme file.
| Video | Task | State Space | Action Space | Expert | Colab | 
|---|---|---|---|---|---|
|  | MountainCar-v0 | Continuous(2) | Discrete(3) | Human | Open In Colab | 
|  | Pendulum-v1 | Continuous(3) | Continuous(1) | RL | Open In Colab | 
|  | CarRacing-v2 | Image(96x96x3) | Continuous(3) | Human | Open In Colab | 
|  | Ant-v3 | Continuous(111) | Continuous(8) | RL | todo | 
|  | Lift | Continuous(multi-modal) | Continuous(7) | Human | Open In Colab | 
- Use the "Open In Colab" links above to run the code in Colab.
- Please see the readme file in each directory for installation and data collection instructions.
- We use HDF5 files for the robomimic tasks and the real robot (see the readme.md in the robomimic directory for the data format).
- For the rest of the environments, demonstrations are stored as *.pkl files; see the loading sketch after this list.
- Please see the respective folders (e.g. robomimic_tasks) for data collection instructions.
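The sketch below shows one way the stored demonstrations could be inspected. The file paths are placeholders, and the exact contents (dict keys, dataset layout) depend on each task folder, so follow the readme in the respective directory for the real format.

```python
# Sketch of loading stored demonstrations; paths and structure are placeholders.
import pickle
import h5py

# *.pkl demonstrations (gym tasks): assumed to contain state/action arrays.
with open("demos/pendulum_demos.pkl", "rb") as f:  # hypothetical path
    demos = pickle.load(f)
    print(type(demos))

# HDF5 demonstrations (robomimic / real robot): see the robomimic readme.md
# for the actual structure; this just prints whatever groups/datasets exist.
with h5py.File("demos/lift.hdf5", "r") as f:  # hypothetical path
    f.visit(print)
```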