Generalizable Hierarchical Skill Learning via Object-Centric Representation
Haibo Zhao,
Yu Qi,
Boce Hu,
Yizhe Zhu,
Ziyan Chen,
Xupeng Zhu,
Owen Howell,
Haojie Huang,
Robin Walters,
Dian Wang*,
Robert Platt*
RA-L, under review
project page /
paper
We introduce a hierarchical skill learning framework that uses a VLM/MLLM agent as the high-level planner, a diffusion-based policy as the low-level controller, and an object-centric representation
to generalize across spatial arrangements, object instances, and novel tasks from minimal demonstrations.
|
BEAR: Benchmarking and Enhancing Multimodal Language Models for Atomic Embodied Capabilities
Yu Qi*,
Haibo Zhao*,
Ziyu Guo*,
Siyuan Ma,
Ziyan Chen,
Yaokun Han,
Renrui Zhang,
Zitiantao Lin,
Shiji Xin,
Yijian Huang,
Kai Cheng,
Peiheng Wang,
Jiazheng Liu,
Jiayi Zhang,
Yizhe Zhu,
Wenqing Wang,
Yiran Qin,
Xupeng Zhu,
Haojie Huang,
Lawson L.S. Wong
ICLR, under review
project page /
paper
We introduce BEAR, a large-scale benchmark and evaluation framework for assessing multimodal large language models
on step-wise embodied capabilities. BEAR spans perception, reasoning, and action understanding across 4,469 interleaved
image–video–text samples, providing a comprehensive analysis of atomic embodied skills.
|
Hierarchical Equivariant Policy via Frame Transfer
Haibo Zhao*,
Dian Wang*,
Yizhe Zhu,
Xupeng Zhu,
Owen Howell,
Linfeng Zhao,
Yaoyao Qian,
Robin Walters,
Robert Platt
ICML, 2025
project page /
paper
By introducing a frame transfer interface, we impose a soft constraint from the high-level open-loop
policy onto the low-level closed-loop policy, combining the strengths of both approaches.
|
Equivariant Diffusion Policy
Dian Wang,
Stephen Hart,
David Surovik,
Tarik Kelestemur,
Haojie Huang,
Haibo Zhao,
Mark Yeatman,
Jiuguang Wang,
Robin Walters,
Robert Platt
CoRL, 2024 (Best Paper Finalist)
project page /
paper
|