The study proposes TMac, a temporal multi-modal graph learning method for acoustic event classification that improves how deep learning models process audiovisual data. TMac constructs a temporal graph for each acoustic event by dividing its audio and video data into multiple segments. Each segment is treated as a node, and the edges between nodes carry timestamps that encode their temporal relationships, allowing dynamic information to be captured. In experiments, the method outperformed other state-of-the-art models.
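To make the graph construction concrete, below is a minimal Python sketch of the idea described above: audio and video segments become nodes, and edges between them carry timestamps. The data layout, function names, and edge-wiring choices here are illustrative assumptions, not the authors' actual implementation (see the project page for that).

```python
# Illustrative sketch of a temporal multi-modal graph (not the authors' code).
from dataclasses import dataclass, field

@dataclass
class Node:
    modality: str      # "audio" or "video"
    segment_idx: int   # position of the segment within the clip
    feature: list      # placeholder for the segment's feature vector

@dataclass
class TemporalGraph:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (src, dst, timestamp)

def build_temporal_graph(audio_feats, video_feats, seg_duration=1.0):
    """Build one temporal graph for a single acoustic event.

    Each audio/video segment becomes a node; edges between consecutive
    segments (and across modalities at the same time step) carry a
    timestamp so temporal order is preserved.
    """
    g = TemporalGraph()
    for i, f in enumerate(audio_feats):
        g.nodes.append(Node("audio", i, f))
    offset = len(audio_feats)
    for i, f in enumerate(video_feats):
        g.nodes.append(Node("video", i, f))

    for i in range(len(audio_feats)):
        t = i * seg_duration
        if i + 1 < len(audio_feats):       # intra-modal (audio) temporal edge
            g.edges.append((i, i + 1, t))
        if i < len(video_feats):           # cross-modal edge at the same time step
            g.edges.append((i, offset + i, t))
    for i in range(len(video_feats) - 1):  # intra-modal (video) temporal edge
        g.edges.append((offset + i, offset + i + 1, i * seg_duration))
    return g

# Example: 4 one-second segments per modality with dummy 2-D features
audio = [[0.1, 0.2]] * 4
video = [[0.3, 0.4]] * 4
graph = build_temporal_graph(audio, video)
print(len(graph.nodes), len(graph.edges))  # 8 nodes, 10 timestamped edges
```

The sketch only covers graph construction; in TMac itself, a graph neural network then aggregates information over these timestamped edges to capture the dynamic structure of each event.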

 

Publication date: 25 Sep 2023
Project Page: https://github.com/MGitHubL/TMac
Paper: https://arxiv.org/pdf/2309.11845