The paper presents a framework for first-shot unsupervised anomalous sound detection (ASD), a task introduced in DCASE 2023 Challenge Task 2. Traditional ASD methods, which rely on the availability of both normal and anomalous sound data, struggle in this setting because anomalous sound data is lacking. To overcome this, the authors propose a metadata-assisted audio generation model to estimate unknown anomalies. The model uses machine information to fine-tune a text-to-audio generation model, and the generated anomalous sounds carry acoustic characteristics that are unique to each machine type. Detection is performed with the Time-Weighted Frequency domain audio Representation with Gaussian Mixture Model (TWFR-GMM) method. The proposed method shows competitive performance while requiring only 1% of the model parameters for detection.
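
The summary names TWFR-GMM but does not describe its mechanics. The following is a minimal sketch of the general idea, not the authors' implementation. It assumes that the time-weighted representation is obtained by rank-weighted pooling of a spectrogram over the time axis, and it swaps the Gaussian mixture for a single diagonal Gaussian so that the example needs no third-party libraries. The function names (`gwrp_pool`, `fit_diag_gaussian`, `anomaly_score`) and the decay parameter `r` are hypothetical and chosen for illustration only.

```python
import math


def gwrp_pool(spectrogram, r):
    """Pool a [freq][time] spectrogram over time using rank-based weights.

    Assumption: values in each frequency bin are sorted in descending order
    and the i-th largest value gets weight r**i. With r = 1.0 this is mean
    pooling; with r = 0.0 it is max pooling.
    """
    pooled = []
    for row in spectrogram:
        ranked = sorted(row, reverse=True)
        weights = [r ** i for i in range(len(ranked))]
        pooled.append(sum(w * v for w, v in zip(weights, ranked)) / sum(weights))
    return pooled


def fit_diag_gaussian(features, eps=1e-6):
    """Fit a single diagonal Gaussian (a stand-in for a GMM) to normal-sound features."""
    n, d = len(features), len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(d)]
    var = [sum((f[j] - mean[j]) ** 2 for f in features) / n + eps for j in range(d)]
    return mean, var


def anomaly_score(feature, mean, var):
    """Negative log-likelihood of a feature vector; higher means more anomalous."""
    return 0.5 * sum(
        math.log(2 * math.pi * v) + (x - m) ** 2 / v
        for x, m, v in zip(feature, mean, var)
    )


# Toy data: three "normal" clips, each with 2 frequency bins and 3 time frames.
normal_clips = [
    [[1.0, 3.0, 2.0], [0.5, 0.4, 0.6]],
    [[1.2, 2.8, 2.1], [0.6, 0.5, 0.4]],
    [[0.9, 3.1, 1.9], [0.4, 0.6, 0.5]],
]
features = [gwrp_pool(clip, r=0.5) for clip in normal_clips]
mean, var = fit_diag_gaussian(features)
```

In this sketch the detector is fitted on normal clips only, and a test clip would be scored with `anomaly_score(gwrp_pool(clip, r), mean, var)`. The decay `r` trades off between averaging over time (stationary sounds) and picking out peaks (short transient sounds).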

Publication date: 25 Oct 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2310.14173