Learning to Prompt with Text Only Supervision for Vision-Language Models
Vision-Language Models like CLIP are becoming prevalent due to their excellent generalization abilities. However, adapting them for downstream tasks remains challenging. One method involves learning prompts using visual information, but…
Continue reading