Hierarchical Equivariant Policy via Frame Transfer

Northeastern University; Robotics and AI Institute
ICML 2025

*Indicates Equal Contribution

Abstract

Recent advances in hierarchical policy learning highlight the advantages of decomposing systems into high-level and low-level agents, enabling efficient long-horizon reasoning and precise fine-grained control. However, the interface between these hierarchy levels remains underexplored, and existing hierarchical methods often ignore domain symmetry, so they need extensive demonstrations to achieve robust performance. To address these issues, we propose Hierarchical Equivariant Policy (HEP), a novel hierarchical policy framework. HEP introduces a frame transfer interface for hierarchical policy learning, which uses the high-level agent's output as a coordinate frame for the low-level agent, providing a strong inductive bias while retaining flexibility. Additionally, we integrate domain symmetries into both levels and theoretically demonstrate the system's overall equivariance. HEP achieves state-of-the-art performance in complex robotic manipulation tasks, demonstrating significant improvements in both simulation and real-world settings.

Method Overview

HEP Overview

We introduce Hierarchical Equivariant Policy (HEP), a novel framework for efficient and generalizable robotic manipulation. HEP is built on two core ideas: a hierarchical policy structure, and a new Frame Transfer interface that passes the high-level prediction to the low-level policy as a coordinate frame.

Hierarchical Policy Structure

  • High-level policy: Responsible for global, long-horizon planning by predicting a “keypose” (i.e., a target 3D translation) that serves as a subgoal.
  • Low-level policy: Generates fine-grained motion trajectories in a local coordinate frame anchored at the keypose.

This separation allows the high-level policy to focus on strategy while the low-level policy handles precise control, which reduces the complexity each level has to deal with.
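
The two-level structure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: `high_level_policy`, `low_level_policy`, and `hierarchical_step` are hypothetical names, and both policies are hand-written stand-ins (a centroid and a straight-line approach) for what would be learned networks.

```python
import numpy as np


def high_level_policy(points: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: predict the keypose (a 3D translation) as the cloud centroid."""
    return points.mean(axis=0)


def low_level_policy(local_points: np.ndarray, horizon: int = 5) -> np.ndarray:
    """Hypothetical stand-in: approach the local origin from 10 cm above it.

    Both the input points and the returned waypoints are expressed in the
    local frame anchored at the keypose. This toy version ignores its input.
    """
    start = np.array([0.0, 0.0, 0.10])
    weights = np.linspace(1.0, 0.0, horizon)[:, None]  # goes from 1 to 0 over the horizon
    return weights * start  # shape (horizon, 3); last waypoint is the local origin


def hierarchical_step(points: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Run one high-level prediction followed by one low-level trajectory."""
    keypose = high_level_policy(points)          # global subgoal in the world frame
    local_points = points - keypose              # observation re-expressed in the keypose frame
    local_traj = low_level_policy(local_points)  # fine-grained motion in the local frame
    world_traj = local_traj + keypose            # map the waypoints back to the world frame
    return keypose, world_traj


rng = np.random.default_rng(0)
cloud = rng.normal(size=(128, 3))  # synthetic point cloud, for illustration only
keypose, trajectory = hierarchical_step(cloud)
```

In this sketch the low-level policy never sees world coordinates; it only sees the scene relative to the keypose, which is the property the hierarchy relies on.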

Frame Transfer Interface

Rather than imposing hard constraints on the low-level policy, the high-level policy outputs a reference translation that defines a local coordinate frame. The low-level policy operates relative to this frame, providing:

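  • Soft constraints: The keypose anchors the low-level policy without fixing its output, so the trajectory is still optimized locally.
  • Transferred generalization: We prove that the high-level policy's generalization ability is passed on to the low-level policy.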
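
To see why a frame of this kind can carry generalization from one level to the other, consider the translation-only case. The derivation below is a simplified illustration with assumed notation, not the paper's theorem: $o$ is the observation (for example a point cloud), $o + t$ shifts every point by $t \in \mathbb{R}^3$, $\pi_{\mathrm{hi}}$ returns the keypose $k$, $\pi_{\mathrm{lo}}$ returns a trajectory in the local frame, and $k + \tau$ adds $k$ to every waypoint of $\tau$.

```latex
% Simplified, translation-only illustration; the notation is assumed, not taken from the paper.
\begin{align*}
k &= \pi_{\mathrm{hi}}(o), \qquad a = k + \pi_{\mathrm{lo}}(o - k), \\
\pi_{\mathrm{hi}}(o + t) &= k + t \quad \text{(assumed equivariance of the high level)}, \\
a' &= (k + t) + \pi_{\mathrm{lo}}\bigl((o + t) - (k + t)\bigr)
    = (k + t) + \pi_{\mathrm{lo}}(o - k)
    = a + t .
\end{align*}
```

Under these assumptions the shift cancels in the low-level input, so the composed policy is translation-equivariant whatever $\pi_{\mathrm{lo}}$ is.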

Experiments

We evaluate HEP on 30 simulated RLBench tasks, including high-precision tasks, long-horizon tasks, and articulated object manipulation, and further validate it on a real robot across 3 real-world manipulation tasks. See our paper for detailed results.

BibTeX

@article{zhao2025hierarchical,
  title={Hierarchical Equivariant Policy via Frame Transfer},
  author={Zhao, Haibo and Wang, Dian and Zhu, Yizhe and Zhu, Xupeng and Howell, Owen and Zhao, Linfeng and Qian, Yaoyao and Walters, Robin and Platt, Robert},
  journal={arXiv preprint arXiv:2502.05728},
  year={2025}
}