The paper presents RoboFlamingo, a framework for robot manipulation built on pre-trained vision-language models (VLMs). The researchers fine-tune pre-trained VLMs on robot manipulation data, yielding a system that follows visual-language instructions and outputs the corresponding actions. Because most of the backbone stays pre-trained and only a lightweight component is adapted, the approach is easy to use and cost-effective, making it a practical candidate for widespread adoption in robot manipulation. Experiments show that RoboFlamingo outperforms existing methods, suggesting it is an effective solution for robot control.
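The core recipe described above, keeping a pre-trained vision-language backbone frozen and training only a small policy head on demonstration data via behavior cloning, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the array sizes, the random-projection "backbone," and the linear policy head are all assumptions (RoboFlamingo itself builds on a full VLM and a learned policy head).

```python
import numpy as np

rng = np.random.default_rng(0)
OBS, EMB, ACT = 16, 32, 7  # toy observation, embedding, and action sizes (assumed)

# Frozen "VLM" stand-in: a fixed projection that is never updated,
# mimicking a pre-trained backbone used purely as a feature extractor.
W_vlm = rng.normal(size=(OBS, EMB)) / np.sqrt(OBS)

def embed(x: np.ndarray) -> np.ndarray:
    """Map raw observations to frozen backbone features."""
    return np.tanh(x @ W_vlm)

# Synthetic expert demonstrations: observations paired with actions.
X = rng.normal(size=(256, OBS))
H = embed(X)
W_expert = rng.normal(size=(EMB, ACT))   # hypothetical expert mapping
Y = H @ W_expert                          # expert actions to imitate

# Trainable policy head: the only parameters updated during fine-tuning.
W_head = np.zeros((EMB, ACT))
lr = 0.1
losses = []
for _ in range(200):
    pred = H @ W_head                     # predicted actions
    err = pred - Y
    losses.append(float((err ** 2).mean()))  # behavior-cloning MSE loss
    grad = H.T @ err / len(X)             # gradient w.r.t. the head only
    W_head -= lr * grad                   # backbone weights W_vlm untouched

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

The point of the sketch is the division of labor: representation comes for free from the frozen backbone, and only the small head is fit to robot data, which is what makes the approach cheap to train.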

Publication date: 3 Nov 2023
Project Page: roboflamingo.github.io
Paper: https://arxiv.org/pdf/2311.01378