The study examines the challenges that environment shifts and conflicts pose for sound event localization and detection (SELD) methods. It proposes an environment-adaptive Meta-SELD that adapts efficiently to new environments from minimal data. The method applies Model-Agnostic Meta-Learning (MAML) to a pre-trained, environment-independent model so that it can adapt quickly to unseen real-world environments using limited samples. To resolve conflicts across environments, the method introduces selective memory: it selectively memorizes target-environment-relevant information and adapts to new environments through selective attenuation of model parameters. The study also introduces environment representations that characterize different acoustic settings, making the attenuation approach more adaptable across environments. The method was evaluated on the development set of the Sony-TAU Realistic Spatial Soundscapes 2023 (STARSS23) dataset and on computationally synthesized scenes, where it outperformed conventional supervised learning methods, particularly in localization.
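The adaptation scheme summarized above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a toy setting in which each "environment" is a one-dimensional linear regression task, uses a first-order approximation of MAML (second-order terms are ignored), and stands in for selective attenuation with a fixed, hand-chosen element-wise `gate` vector. All names here (`adapt`, `meta_train`, `gate`, `sample_env`, `sample_batch`) are hypothetical and exist only for illustration.

```python
import random

def predict(theta, x):
    # Toy linear model: theta = [slope, bias].
    slope, bias = theta
    return slope * x + bias

def mse_loss(theta, xs, ys):
    return sum((predict(theta, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def mse_grad(theta, xs, ys):
    # Analytic gradient of the mean squared error w.r.t. [slope, bias].
    n = len(xs)
    errs = [predict(theta, x) - y for x, y in zip(xs, ys)]
    g_slope = sum(2 * e * x for e, x in zip(errs, xs)) / n
    g_bias = sum(2 * e for e in errs) / n
    return [g_slope, g_bias]

def adapt(theta, gate, xs, ys, inner_lr=0.1, steps=1):
    # Assumption: attenuation is modeled as an element-wise scaling of the
    # meta-parameters before the inner-loop gradient steps.
    phi = [g * t for g, t in zip(gate, theta)]
    for _ in range(steps):
        grad = mse_grad(phi, xs, ys)
        phi = [p - inner_lr * d for p, d in zip(phi, grad)]
    return phi

def sample_env(rng):
    # Hypothetical "environment": a random slope and bias.
    return rng.uniform(0.5, 1.5), rng.uniform(-0.5, 0.5)

def sample_batch(rng, env, n=10):
    slope, bias = env
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [slope * x + bias for x in xs]
    return xs, ys

def meta_train(rng, gate, iterations=200, meta_lr=0.05):
    theta = [0.0, 0.0]
    for _ in range(iterations):
        env = sample_env(rng)
        xs, ys = sample_batch(rng, env)   # support set for the inner loop
        xq, yq = sample_batch(rng, env)   # query set for the outer loop
        phi = adapt(theta, gate, xs, ys)
        # First-order MAML: use the query gradient at the adapted parameters,
        # scaled by the gate, and ignore second-order terms.
        grad_q = mse_grad(phi, xq, yq)
        theta = [t - meta_lr * g * d for t, g, d in zip(theta, gate, grad_q)]
    return theta

rng = random.Random(0)
gate = [1.0, 0.5]                          # hand-chosen, not learned
theta = meta_train(rng, gate)
new_env = sample_env(rng)
xs, ys = sample_batch(rng, new_env)
adapted = adapt(theta, gate, xs, ys, steps=5)
print(mse_loss([g * t for g, t in zip(gate, theta)], xs, ys), mse_loss(adapted, xs, ys))
```

In this sketch the gate is constant. According to the summary, the paper's attenuation is instead selective and informed by environment representations, so a closer analogue would compute `gate` from features of the target environment.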
Publication date: 29 Dec 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2312.16422