The paper proposes a framework for vision-and-language navigation (VLN) in real-world settings. Traditional VLN methods are evaluated mainly in simulation, which overlooks many complexities of the real world. The proposed framework consists of an instruction parser, a real-time visual-language mapper, a localizer, and a local controller. Together, these modules maintain a spatial and semantic understanding of the unseen environment, improving navigation efficiency. The pipeline was implemented and evaluated on an Interbotix LoCoBot WX250 in an unseen lab environment, where it showed significant improvements over existing methods.
Publication date: 18 Oct 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2310.10822
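
To make the modular structure concrete, below is a minimal, hypothetical Python sketch of how the four components (instruction parser, visual-language mapper, localizer, local controller) might be wired together. All class names, method signatures, and the grid-map representation are assumptions for illustration only and do not reflect the paper's actual implementation.

```python
"""Hypothetical sketch of a modular VLN pipeline (not the paper's API)."""
from dataclasses import dataclass, field


@dataclass
class Pose:
    x: float
    y: float
    yaw: float


@dataclass
class SemanticMap:
    """Grid map holding occupancy plus semantic labels (assumed representation)."""
    occupancy: dict = field(default_factory=dict)  # (i, j) -> True if occupied
    semantics: dict = field(default_factory=dict)  # (i, j) -> label string

    def update(self, rgb_frame, depth_frame, pose: Pose) -> None:
        # Hypothetical: project detections into the grid using depth and pose.
        pass

    def locate(self, label: str):
        # Return grid cells whose semantic label matches the queried sub-goal.
        return [cell for cell, lbl in self.semantics.items() if lbl == label]


class InstructionParser:
    def parse(self, instruction: str) -> list:
        # Hypothetical: reduce a free-form instruction to ordered sub-goal labels,
        # e.g. "go past the couch to the fridge" -> ["couch", "fridge"].
        return [tok for tok in instruction.lower().split() if tok.isalpha()]


class Localizer:
    def current_pose(self) -> Pose:
        # Hypothetical: fuse odometry / SLAM output into a single robot pose.
        return Pose(0.0, 0.0, 0.0)


class LocalController:
    def step_toward(self, target_cell, pose: Pose) -> None:
        # Hypothetical: issue a velocity command toward target_cell,
        # avoiding cells marked occupied in the map.
        pass


def navigate(instruction, parser, mapper, localizer, controller):
    """One pass through the parse -> map -> localize -> control loop."""
    for subgoal in parser.parse(instruction):
        pose = localizer.current_pose()
        mapper.update(rgb_frame=None, depth_frame=None, pose=pose)  # sensors omitted
        targets = mapper.locate(subgoal)
        if targets:
            controller.step_toward(targets[0], pose)
```

In a real system the inner loop would run continuously per sub-goal, interleaving exploration when the queried object has not yet been observed; the single-pass version above only illustrates how the four modules exchange data.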