A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models
This paper provides a comprehensive survey of prompt engineering on three types of vision-language models: multimodal-to-text generation models, image-text matching models, and text-to-image generation models. Prompt engineering, the technique of…