November 4, 2023

DCHT: Deep Complex Hybrid Transformer for Speech Enhancement

The paper presents DCHT, a deep complex hybrid transformer for speech enhancement that combines spectrogram-domain and waveform-domain approaches. The model pairs a complex Swin-Unet in the spectrogram domain with a dual-path transformer network in the waveform domain, and learns multi-domain features to reduce noise. The authors report improved performance on the BirdSoundsDenoising and VCTK+DEMAND datasets, and suggest that the hybrid approach can improve both the quality and the intelligibility of speech.
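To make the two-domain idea concrete, here is a toy sketch in plain Python. It is not the DCHT model: the learned complex Swin-Unet is replaced by a fixed complex mask on DFT bins, and the dual-path transformer is replaced by a small FIR filter on raw samples. All function names (`spectral_stage`, `waveform_stage`, `hybrid_enhance`) are hypothetical, and chaining the two stages in this order is an assumption made for illustration, since the summary does not say how the two branches are connected.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT; returns the real part, since the signals here are real."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def spectral_stage(wave, mask):
    """Spectrogram-domain stand-in: scale each complex bin by a complex mask value.

    In a learned model the mask would be predicted by a network (assumption:
    this fixed mask only illustrates operating on real and imaginary parts).
    """
    spectrum = dft(wave)
    return idft([m * s for m, s in zip(mask, spectrum)])


def waveform_stage(wave, kernel):
    """Waveform-domain stand-in: a centered FIR filter applied to raw samples."""
    half = len(kernel) // 2
    out = []
    for t in range(len(wave)):
        acc = 0.0
        for i, weight in enumerate(kernel):
            j = t + i - half
            if 0 <= j < len(wave):
                acc += weight * wave[j]
        out.append(acc)
    return out


def hybrid_enhance(noisy, mask, kernel):
    """Chain both domains (the ordering is an illustrative assumption)."""
    return waveform_stage(spectral_stage(noisy, mask), kernel)


# Toy data: a low-frequency tone plus a higher-frequency interferer.
n = 16
clean = [math.sin(2 * math.pi * t / n) for t in range(n)]
noisy = [c + 0.5 * math.sin(2 * math.pi * 5 * t / n) for t, c in enumerate(clean)]

# Keep only bin 1 and its mirror bin n - 1; zero every other bin.
mask = [1 + 0j if k in (1, n - 1) else 0j for k in range(n)]
enhanced = hybrid_enhance(noisy, mask, kernel=[1.0])  # identity kernel
```

In this toy case the interferer sits exactly on DFT bins 5 and 11, so the mask removes it and `enhanced` matches `clean` up to floating-point error. Real noise is not this tidy, which is part of the motivation for learning features in both domains rather than relying on a hand-set mask.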

Publication date: 3 Nov 2023
Project Page: Not Provided
Paper: https://arxiv.org/pdf/2310.19602


