The article presents a new method for estimating robot pose from RGB images when the robot's internal states are unknown. This is an important problem in robotics and computer vision, because most existing methods either require full knowledge of these states or are too computationally heavy for real-time use. The proposed solution is an end-to-end pipeline that decomposes the problem into four sub-tasks: estimating the camera-to-robot rotation, the robot state parameters, the keypoint locations, and the root depth. A dedicated neural network module handles each sub-task. This design lets the model learn multi-facet representations and supports sim-to-real transfer through self-supervised learning. The method delivers a 12× speed boost with state-of-the-art accuracy, making real-time holistic robot pose estimation possible for the first time.
Publication date: 8 Feb 2024
Project Page: https://oliverbansk.github.io/Holistic-Robot-Pose/
Paper: https://arxiv.org/pdf/2402.05655
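
To make the decomposition in the summary more concrete, here is a minimal sketch of how two of the predicted quantities, a root keypoint location and a root depth, could be combined into a 3D root position in the camera frame using a standard pinhole camera model. This is an illustration only, not the authors' implementation: the names `Intrinsics`, `backproject_root`, and `project` are hypothetical, the intrinsics values are made up, and the depth is assumed to be measured along the camera's optical axis.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Intrinsics:
    """Pinhole camera intrinsics (hypothetical container for this sketch)."""
    fx: float  # focal length in pixels, x axis
    fy: float  # focal length in pixels, y axis
    cx: float  # principal point, x coordinate in pixels
    cy: float  # principal point, y coordinate in pixels


def backproject_root(u: float, v: float, depth: float,
                     k: Intrinsics) -> Tuple[float, float, float]:
    """Lift a 2D root keypoint (u, v) with a z-depth to a 3D camera-frame point.

    Assumption: `depth` is the distance along the optical axis (z), not the
    Euclidean distance from the camera center.
    """
    x = (u - k.cx) / k.fx * depth
    y = (v - k.cy) / k.fy * depth
    return (x, y, depth)


def project(point: Tuple[float, float, float],
            k: Intrinsics) -> Tuple[float, float]:
    """Project a 3D camera-frame point back to pixel coordinates."""
    x, y, z = point
    return (k.fx * x / z + k.cx, k.fy * y / z + k.cy)


if __name__ == "__main__":
    # Made-up intrinsics and predictions, for illustration only.
    cam = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    root_3d = backproject_root(u=420.0, v=190.0, depth=2.0, k=cam)
    print("root position in camera frame:", root_3d)
    print("reprojected keypoint:", project(root_3d, cam))
```

Under these assumptions, the back-projected root position acts as the translation part of the camera-to-robot transform, and the separately estimated rotation supplies the other part. Reprojecting the 3D point and comparing it with the predicted keypoint is a simple consistency check.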