This paper introduces One-shot Open Affordance Learning (OOAL), a setting in which a model is trained with only one example per base object category yet is expected to recognize novel objects and their affordances. The authors conduct a comprehensive analysis of existing foundation models to probe their understanding of affordances and to assess the potential for affordance learning with limited data. They then propose a vision-language framework that strengthens the alignment between visual features and affordance text embeddings. The method outperforms state-of-the-art models while using less than 1% of the full training data and shows strong generalization to unseen objects and affordances.
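The core idea of aligning dense visual features with affordance text embeddings can be illustrated with a minimal sketch. This is not the authors' exact architecture: the encoders, feature dimensions, temperature value, and the function name `affordance_heatmaps` are assumptions chosen for illustration; only the cosine-similarity alignment between patch features and text embeddings reflects the general technique described above.

```python
# Minimal sketch (assumed, not the paper's implementation): compare dense visual
# patch features against affordance text embeddings with cosine similarity to
# obtain one per-pixel affordance map per text prompt.
import torch
import torch.nn.functional as F

def affordance_heatmaps(visual_feats: torch.Tensor,
                        text_embeds: torch.Tensor,
                        temperature: float = 0.07) -> torch.Tensor:
    """
    visual_feats: (B, N, D) patch features from a frozen vision backbone (N = H*W).
    text_embeds:  (K, D) embeddings of K affordance prompts (e.g. "hold", "cut").
    Returns:      (B, K, N) similarity logits, one dense map per affordance.
    """
    v = F.normalize(visual_feats, dim=-1)        # unit-norm patch features
    t = F.normalize(text_embeds, dim=-1)         # unit-norm text embeddings
    logits = torch.einsum("bnd,kd->bkn", v, t)   # cosine similarity per patch
    return logits / temperature

# Usage with random tensors standing in for real encoder outputs.
B, H, W, D, K = 2, 16, 16, 512, 5
vis = torch.randn(B, H * W, D)
txt = torch.randn(K, D)
maps = affordance_heatmaps(vis, txt)             # (2, 5, 256)
probs = maps.softmax(dim=1).reshape(B, K, H, W)  # per-pixel affordance scores
```

Because the text embeddings are supplied at inference time, a scheme like this can score affordance prompts that were never seen during training, which is the open-vocabulary behavior the paper targets.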

Publication date: 29 Nov 2023
arXiv: https://arxiv.org/abs/2311.17776v1
Paper: https://arxiv.org/pdf/2311.17776