I focus on embodied manipulation. I am currently on a gap year and will join the MSE in Robotics program at the University of Pennsylvania next year. I am at Xi'an Jiaotong-Liverpool University (XJTLU), advised by Prof. Yong Yue and Prof. Yaran Chen. Previously, I interned at Westlake Robotics and at Nanjing University, advised by Prof. Shangke Lyu.
My goal is to build an embodied system that generalizes robustly and completes long-horizon, complex manipulation tasks in open environments.
I aim to model how embodied agents understand and represent the world, and to develop systems that can interact with the environment in a manner consistent with such internal representations.
I study why Vision-Language-Action (VLA)-based manipulation systems struggle on long-horizon, complex, open-environment tasks: they often lack stable representations of task-relevant state, temporal dependencies, and transition conditions, limiting robustness, generalization, and success rates.
My work aims to equip VLA models with stronger state modeling and decision-support mechanisms.
- world modeling
- state representation & usage
- physical reasoning