CLIP feature-based randomized control using images and text for multiple tasks and robots
This article presents a new control framework that uses vision-language models (VLMs) to handle multiple tasks and robots. The authors combine the vision-language model CLIP with randomized control. This framework aims…