This paper presents a method to improve the understanding and analysis of long-form videos. The authors propose a query-aware approach for localizing and discriminating relations in long videos, built on an image-language pretrained model. The model selects only the frames relevant to a specific query, eliminating the need to construct a complete movie-level knowledge graph. The approach outperforms competing methods across different query types, demonstrating its effectiveness and robustness.
Publication date: 20 Oct 2023
Project Page: https://doi.org/10.1145/3581783.3612871
Paper: https://arxiv.org/pdf/2310.12724
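To make the query-aware frame-selection idea concrete, below is a minimal sketch of scoring video frames against a text query with an off-the-shelf image-language pretrained model (CLIP via Hugging Face transformers). The model choice, function name, and top-k heuristic are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: query-aware frame selection with a CLIP-style
# image-language pretrained model. Model choice, helper names, and the
# top-k strategy are assumptions for illustration only.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

def select_relevant_frames(frames, query, top_k=8):
    """Score each sampled frame against the text query and keep the top-k.

    frames: list of PIL.Image sampled from the long video
    query:  natural-language query, e.g. "two characters arguing in a kitchen"
    """
    with torch.no_grad():
        inputs = processor(text=[query], images=frames,
                           return_tensors="pt", padding=True)
        outputs = model(**inputs)
        # Normalize image/text embeddings, then take cosine similarity per frame.
        image_emb = outputs.image_embeds / outputs.image_embeds.norm(dim=-1, keepdim=True)
        text_emb = outputs.text_embeds / outputs.text_embeds.norm(dim=-1, keepdim=True)
        scores = (image_emb @ text_emb.T).squeeze(-1)
    top = scores.topk(min(top_k, len(frames)))
    # Return frame indices (in sampling order) and their similarity scores.
    return top.indices.tolist(), top.values.tolist()
```

The selected frames (rather than the whole video) would then be passed to downstream relation reasoning, which is the main efficiency benefit the summary describes.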