Image-Language Models Papers

Distilling Vision-Language Models on Millions of Videos

root January 12, 2024 0

The research aims to replicate the success of image-text data for video-language models. The researchers fine-tuned a video-language model from a strong image-language baseline with synthesized instructional data. The adapted…

Press ESC to close

Image-Language Models

Please allow ads on our site