Remixed2Remixed: Domain adaptation for speech enhancement by Noise2Noise learning with Remixing

The article presents a new domain adaptation method for speech enhancement called Remixed2Remixed. It uses Noise2Noise (N2N) learning to adapt models trained on artificially generated noisy-clean pair data so that they can enhance real-world noisy recordings. A teacher model trained on out-of-domain data first estimates pseudo-in-domain speech and noise signals from the real recordings. Within each batch, these signals are shuffled and remixed twice to generate two bootstrapped mixtures. The student model is then trained with an N2N-based cost function computed from these two mixtures. In experiments on the CHiME-7 unsupervised domain adaptation task for conversational speech enhancement, the method outperformed existing systems.
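The remixing step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `remix_twice` and `n2n_loss`, the choice to shuffle only the noise signals, and the plain mean-squared-error loss are all assumptions made for illustration, and the teacher's outputs are replaced by random arrays.

```python
import numpy as np

def remix_twice(speech, noise, rng):
    """Build two bootstrapped mixtures from one batch of pseudo-sources.

    speech, noise: arrays of shape (batch, samples), standing in for the
    teacher model's estimates. Assumption: only the noise is shuffled.
    """
    batch = speech.shape[0]
    perm_a = rng.permutation(batch)  # first shuffle of the noise signals
    perm_b = rng.permutation(batch)  # second, independent shuffle
    mix_a = speech + noise[perm_a]
    mix_b = speech + noise[perm_b]
    return mix_a, mix_b

def n2n_loss(student, mix_a, mix_b):
    """Noise2Noise-style loss: one noisy mixture is the input, the other
    noisy mixture is the target. Plain MSE is an assumption here."""
    return float(np.mean((student(mix_a) - mix_b) ** 2))

# Toy data in place of real teacher outputs (hypothetical values).
rng = np.random.default_rng(0)
speech = rng.standard_normal((4, 16000))
noise = 0.1 * rng.standard_normal((4, 16000))

mix_a, mix_b = remix_twice(speech, noise, rng)
# An identity function stands in for the student network.
loss = n2n_loss(lambda x: x, mix_a, mix_b)
```

Both mixtures share the same pseudo-speech but carry differently shuffled noise, which is what lets one noisy signal serve as the training target for the other in a Noise2Noise setup.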

Publication date: 29 Dec 2023
Project Page: https://www.cyberagent.co.jp/en/news/detail/id=24461
Paper: https://arxiv.org/pdf/2312.16836
