TACNET: Temporal Audio Source Counting Network

The authors introduce the Temporal Audio Source Counting Network (TaCNet), an architecture for audio source counting, the task of estimating how many sources (for example, concurrent speakers) are active in a recording. TaCNet operates directly on raw audio, so it needs no complex preprocessing steps and the overall pipeline is simpler. According to the authors, the network performs particularly well in real-time speaker counting, even when input windows are truncated. Evaluated on the LibriCount dataset, TaCNet reaches an average accuracy of 74.18% over 11 classes, which the authors report as state of the art for audio source counting. They also apply it to Chinese and Persian speech, which suggests that the approach carries over across languages.

Publication date: 4 Nov 2023
Project Page: https://arxiv.org/abs/2311.02369
Paper: https://arxiv.org/pdf/2311.02369
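
To make the reported metric concrete, here is a minimal sketch of how an average accuracy over 11 classes could be computed for a counting model. It is not the authors' evaluation code. It assumes the 11 classes are the source counts 0 through 10 and that the average is an unweighted mean of per-class accuracies; the summary above states neither explicitly. The function names and the toy labels are hypothetical.

```python
NUM_CLASSES = 11  # assumption: one class per source count, 0..10


def per_class_accuracy(y_true, y_pred, num_classes=NUM_CLASSES):
    """Accuracy for each class; None for classes absent from y_true."""
    correct = [0] * num_classes
    total = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return [c / n if n else None for c, n in zip(correct, total)]


def mean_class_accuracy(y_true, y_pred, num_classes=NUM_CLASSES):
    """Unweighted mean of the per-class accuracies that are defined."""
    accs = [a for a in per_class_accuracy(y_true, y_pred, num_classes)
            if a is not None]
    return sum(accs) / len(accs)


# Toy labels (made up for illustration): true vs. predicted source counts.
y_true = [0, 0, 1, 1, 2, 2, 2, 10]
y_pred = [0, 0, 1, 2, 2, 2, 3, 9]

print(f"mean class accuracy: {mean_class_accuracy(y_true, y_pred):.4f}")
```

An unweighted mean treats every count equally regardless of how many test clips it has. If the paper averages over samples instead, the figure would be computed as plain accuracy over all clips.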
