Phoneme-Based Proactive Anti-Eavesdropping with Controlled Recording Privilege
This study presents a new system designed to protect against eavesdropping by jamming microphones with a unique…
The article introduces Synchformer, a new model for audio-visual synchronization focused on ‘in-the-wild’ videos, such as those…
The research presents AMuSE (Adaptive Multimodal Analysis for Speaker Emotion), a model developed for recognizing individual emotions…
Music auto-tagging is key for improving music discovery and recommendation. Existing models in Music Information Retrieval (MIR)…
This paper discusses the limitations of current masked audio modeling (MAM) methods and presents a new method…
The article presents a framework for continuous target speaker extraction (C-TSE), which aims to refine the process…
The paper introduces an algorithm for localizing mono-frequent uniformly moving sound sources, operating entirely in the frequency…
The article presents a system that uses spatial-temporal activity for multichannel speaker diarization and separation. The architecture…
The article introduces the PBSCSR dataset, a resource for studying composer style recognition in piano sheet music…
The article discusses the need for objective metrics in evaluating speech generation. The authors propose new reference-aware…