Hi, and thank you for developing such an impressive tool for training agents.
I am currently preparing a motion dataset to train a humanoid agent to perform a construction-related task. In our pilot tests, we used motion data from the CMU Mocap database to train the agent on two tasks: (1) walking to a target location, and (2) reaching a hand toward a low position nearby. However, we found that training performance is highly dependent on the quality of the dataset, and our current data does not seem sufficient for the agent to learn the tasks properly.
In this pilot test, our dataset includes idle poses, walking, turning, and bending motions for picking up a box (all from CMU data). We clipped the recordings to keep only the main frames of each motion (a rough sketch of this step is below). We then ran the getup pretraining to help the agent learn the individual skills, and it appears to learn all of them. However, after task training, the results are not as expected: for instance, the transition from walking to turning looks unnatural, and so does the transition from walking to bending.
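For reference, our clipping and resampling step is roughly equivalent to the sketch below. This is a simplified illustration rather than our exact code: it assumes each clip has already been parsed into a `(T, dof)` NumPy array of joint angles, and the linear interpolation is only a crude stand-in for proper rotation interpolation.

```python
import numpy as np

FPS = 30  # target frame rate for training (CMU clips are often 120 fps)

def resample(frames, src_fps, dst_fps=FPS):
    """Linearly resample a (T, dof) motion array to the target frame rate.

    Note: per-channel linear interpolation is only an approximation for
    rotational channels (e.g. Euler angles); it is used here for simplicity.
    """
    t_src = np.arange(len(frames)) / src_fps
    t_dst = np.arange(0.0, t_src[-1], 1.0 / dst_fps)
    out = np.empty((len(t_dst), frames.shape[1]))
    for j in range(frames.shape[1]):
        out[:, j] = np.interp(t_dst, t_src, frames[:, j])
    return out

def trim(frames, start, end, margin=15):
    """Keep the 'main' frames [start:end), plus a small margin on each side
    so transitions into and out of the clip still have some context."""
    lo = max(0, start - margin)
    hi = min(len(frames), end + margin)
    return frames[lo:hi]
```

We are unsure whether aggressive trimming like this removes too much of the transition context between skills, which might explain the awkward walk-to-turn and walk-to-bend transitions we see.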
We suspect that these issues are caused by limitations in our motion dataset preparation. Could you please share any suggestions or best practices for preparing motion datasets for imitation learning in humanoid agents?
Thank you in advance!