This study presents Domain-aware Prompt learning (DAP), a framework that helps autonomous agents navigate unseen environments by following language instructions. The authors point out that most pretrained visual models are trained on general-purpose datasets, which creates a domain gap when they are applied to vision-and-language navigation (VLN). DAP addresses this gap by learning soft visual prompts that extract in-domain image semantics. Experiments show that DAP outperforms existing methods.
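To make the soft-prompt idea concrete, below is a minimal sketch of visual prompt tuning in PyTorch. This is not the paper's implementation: the class name `SoftVisualPrompts`, the embedding dimension, and the prompt count are illustrative assumptions. The sketch shows the core mechanism such methods rely on: learnable prompt tokens are prepended to a frozen encoder's patch tokens, so that only the prompts need to be trained to adapt the encoder to the target domain.

```python
import torch
import torch.nn as nn


class SoftVisualPrompts(nn.Module):
    """Hypothetical sketch of soft visual prompt tuning.

    Learnable prompt tokens are prepended to the patch tokens of a
    (frozen) ViT-style encoder; during adaptation only these prompts
    are optimized, leaving the pretrained weights untouched.
    """

    def __init__(self, embed_dim: int = 768, num_prompts: int = 8):
        super().__init__()
        # One learnable "soft" prompt vector per prompt token.
        self.prompts = nn.Parameter(torch.randn(num_prompts, embed_dim) * 0.02)

    def forward(self, patch_tokens: torch.Tensor) -> torch.Tensor:
        # patch_tokens: (batch, num_patches, embed_dim)
        batch = patch_tokens.size(0)
        prompts = self.prompts.unsqueeze(0).expand(batch, -1, -1)
        # Prepend prompts so the frozen transformer layers attend to them.
        return torch.cat([prompts, patch_tokens], dim=1)


if __name__ == "__main__":
    tokens = torch.randn(2, 196, 768)  # e.g. ViT-B/16 patch tokens
    prompter = SoftVisualPrompts()
    out = prompter(tokens)
    print(out.shape)  # torch.Size([2, 204, 768])
    # In practice, only prompter.parameters() would be passed to the
    # optimizer, keeping the pretrained visual encoder frozen.
```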
Publication date: 30 Nov 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2311.17812