VideoGLUE: Video General Understanding Evaluation of Foundation Models
This research evaluates the video understanding capabilities of existing foundation models (FMs) using a purpose-built experimental protocol, VideoGLUE. The study uses three key tasks (action recognition, temporal localization, and…