The researchers propose the Dual-Phase Audio Transformer for Denoising (DPATD), a model that addresses the difficulty time-domain speech enhancement systems have with long input sequences. DPATD splits the audio input into smaller chunks, which makes long sequences easier to model. It uses a memory-compressed explainable attention mechanism that is more efficient and converges faster than the commonly used self-attention module. The authors report that DPATD outperforms other state-of-the-art audio denoising methods.
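
The two ideas in the summary, chunking the input and attending over a compressed memory, can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the chunk length and hop, and the use of mean pooling to shorten keys and values (standing in for whatever learned compression the paper uses) are all assumptions made for illustration.

```python
import numpy as np


def split_into_chunks(signal, chunk_len, hop):
    """Split a 1-D signal into overlapping chunks, zero-padding the tail.

    chunk_len and hop are illustrative parameters, not values from the paper.
    """
    n = len(signal)
    if n <= chunk_len:
        n_chunks = 1
    else:
        n_chunks = 1 + int(np.ceil((n - chunk_len) / hop))
    padded_len = (n_chunks - 1) * hop + chunk_len
    padded = np.zeros(padded_len, dtype=signal.dtype)
    padded[:n] = signal
    return np.stack(
        [padded[i * hop : i * hop + chunk_len] for i in range(n_chunks)]
    )


def memory_compressed_attention(q, k, v, stride):
    """Scaled dot-product attention over shortened keys and values.

    Keys and values are averaged over windows of `stride` time steps, so the
    score matrix has shape (T_q, T_k // stride) instead of (T_q, T_k).
    Mean pooling is an assumption; it requires len(k) >= stride.
    """
    t, d = k.shape
    t_trim = (t // stride) * stride  # drop a ragged tail, if any
    k_c = k[:t_trim].reshape(-1, stride, d).mean(axis=1)
    v_c = v[:t_trim].reshape(-1, stride, v.shape[1]).mean(axis=1)

    scores = q @ k_c.T / np.sqrt(d)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v_c


# Usage: chunk a toy signal, then attend within one block of features.
chunks = split_into_chunks(np.arange(10, dtype=np.float32), chunk_len=4, hop=2)
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))
out = memory_compressed_attention(x, x, x, stride=2)
```

With a stride of 2, each query attends to half as many positions, which is where the efficiency gain over plain self-attention would come from in a setup like this.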

Publication date: 3 Nov 2023
Project Page: Not Provided
Paper: https://arxiv.org/pdf/2310.19588