Learning-To-Rank Approach for Identifying Everyday Objects Using a Physical-World Search Engine

The paper presents MultiRankIt, a new approach to a task the authors define as Learning-to-Rank Physical Objects (LTRPO). In this task, target objects are retrieved from open-vocabulary user instructions in a human-in-the-loop setting. MultiRankIt uses a Crossmodal Noun Phrase Encoder and a Crossmodal Region Feature Encoder to model the relationships between phrases, target objects, and their surrounding context. The approach is evaluated on a new dataset of complex instructions paired with images of real indoor environments, where it outperforms the baseline method. The study also includes physical experiments with a domestic service robot in a real-world setting, which achieved an 80% success rate for object retrieval.
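To make the ranking idea concrete, here is a minimal sketch of ordering candidate objects by how well they match an instruction. It is not the MultiRankIt model: the hand-written vectors stand in for learned embeddings, cosine similarity stands in for whatever scoring the paper uses, and the names `cosine`, `rank_candidates`, and the example objects are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(instruction_vec, candidates):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [(name, cosine(instruction_vec, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (assumption): in a real system these vectors would come from
# learned encoders for the instruction and for each object region.
instruction_vec = [1.0, 0.0, 1.0]
candidates = {
    "mug": [0.9, 0.1, 0.8],
    "towel": [0.0, 1.0, 0.1],
    "bottle": [0.5, 0.5, 0.5],
}

ranking = rank_candidates(instruction_vec, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a human-in-the-loop setting, a ranked list like this would be shown to the user, who picks the intended object from the top results.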

Publication date: 29 Dec 2023
Project Page: https://github.com/keio-smilab23/MultiRankIt
Paper: https://arxiv.org/pdf/2312.15844
