The paper addresses the challenge of training general robotic policies from heterogeneous data that spans many tasks. Current methods typically collect and pool data from a single domain to train one policy, which is both expensive and difficult. The authors present a flexible approach, called Policy Composition, that combines information across diverse modalities and domains to learn generalized manipulation skills. The method supports multi-task manipulation and can be composed with analytic cost functions to adapt policy behavior at inference time. It was trained on simulation, human, and real-robot data and evaluated on tool-use tasks. The composed policy showed robust, dexterous performance across varying scenes and tasks, outperforming baselines trained on a single data source in both simulation and real-world experiments.

Publication date: 6 Feb 2024
Project Page: https://liruiw.github.io/policycomp
Paper: https://arxiv.org/pdf/2402.02511