A Hybrid Graph Network for Complex Activity Detection in Video

This paper addresses complex activity detection (CompAD) in video, which extends action analysis from short clips to long-term activities. The authors propose a hybrid graph network that combines two components: attention applied to a graph encoding the local (short-term) dynamic scene, and a temporal graph modelling the overall long-duration activity. They also introduce a feature extraction technique that builds spatiotemporal tubes for the active elements (agents) in the local scene: it detects individual objects, tracks them over time, and then extracts 3D features from each agent tube as well as from the overall scene. The authors report that the framework outperforms previous state-of-the-art methods on all three datasets evaluated: ActivityNet-1.3, Thumos-14, and ROAD.
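To make the tube-and-graph idea more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes that detection and tracking have already produced `(frame_index, track_id, box)` triples, and it stands in for the learned local-scene attention with an unparameterised scaled dot-product attention over toy node features. The function names `build_agent_tubes` and `attend`, and the example feature vectors, are hypothetical; the paper's detector, tracker, 3D feature extractor, and temporal graph are not reproduced here.

```python
import math
from collections import defaultdict


def build_agent_tubes(detections):
    """Group tracked detections into one tube per agent.

    detections: iterable of (frame_index, track_id, (x1, y1, x2, y2)).
    Returns {track_id: [(frame_index, box), ...]} ordered by frame.
    """
    tubes = defaultdict(list)
    for frame_index, track_id, box in detections:
        tubes[track_id].append((frame_index, box))
    return {track_id: sorted(items) for track_id, items in tubes.items()}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(node_features):
    """Scaled dot-product self-attention over graph nodes.

    Assumption: no learned projections; each node attends to every node
    (a fully connected local scene graph), including itself.
    """
    dim = len(node_features[0])
    updated = []
    for query in node_features:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in node_features
        ]
        weights = softmax(scores)
        updated.append(
            [sum(w * v[i] for w, v in zip(weights, node_features)) for i in range(dim)]
        )
    return updated


# Toy tracked detections for two agents, deliberately out of frame order.
detections = [
    (1, 1, (12, 10, 52, 50)),
    (0, 1, (10, 10, 50, 50)),
    (0, 2, (100, 40, 140, 90)),
    (1, 2, (101, 41, 141, 91)),
]
tubes = build_agent_tubes(detections)

# Placeholder features: one vector per agent tube plus one for the whole scene.
# In the paper these would come from a 3D feature extractor.
node_features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
updated = attend(node_features)
```

In this sketch each updated node vector is a weighted average of all node vectors, so agents and the scene node exchange information within a single short clip. A temporal graph over successive clips, as described in the summary, would then operate on these per-clip outputs.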

Publication date: 27 Oct 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2310.17493
