This paper presents a pipeline for real-time 3D semantic scene perception for egocentric robots with binocular vision. The pipeline has three stages: instance segmentation, feature matching, and point-set registration. The authors tested it on a 7-DOF dual-arm Baxter robot equipped with an Intel RealSense D435i RGB-D camera. The robot was able to segment objects of interest, register multiple views while moving, and grasp the target object. The source code is available on GitHub.
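
To make the feature-matching stage more concrete, here is a minimal sketch of one common way to pair descriptors between two views: mutual nearest-neighbor matching on cosine similarity. This summary does not say which descriptors or matching rule the paper actually uses, so the technique, the function names (`cosine_similarity`, `best_match`, `mutual_matches`), the `min_similarity` threshold, and the toy descriptors are all illustrative assumptions rather than the authors' implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def best_match(query, candidates):
    """Return (index, similarity) of the most similar candidate, or (-1, -inf)."""
    best_idx, best_sim = -1, -math.inf
    for i, candidate in enumerate(candidates):
        sim = cosine_similarity(query, candidate)
        if sim > best_sim:
            best_idx, best_sim = i, sim
    return best_idx, best_sim


def mutual_matches(desc_a, desc_b, min_similarity=0.8):
    """Keep pairs (i, j) where i and j are each other's best match.

    min_similarity is a hypothetical threshold chosen for this example.
    """
    matches = []
    for i, descriptor in enumerate(desc_a):
        j, sim = best_match(descriptor, desc_b)
        if j < 0 or sim < min_similarity:
            continue
        i_back, _ = best_match(desc_b[j], desc_a)
        if i_back == i:  # the match must hold in both directions
            matches.append((i, j))
    return matches


# Toy descriptors standing in for features extracted from two camera views.
view_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
view_b = [[0.0, 0.9, 0.1], [0.9, 0.1, 0.0], [0.5, 0.5, 0.5]]
print(mutual_matches(view_a, view_b))  # [(0, 1), (1, 0)]
```

In a pipeline like the one described, index pairs of this kind would identify corresponding points across views, which a point-set registration step could then use to estimate the transform between them. The mutual check trades recall for precision: the third descriptor in `view_a` is left unmatched because its best candidate is too dissimilar.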

Publication date: 20 Feb 2024
Project Page: https://github.com/mkhangg/semantic_scene_perception
Paper: https://arxiv.org/pdf/2402.11872