February 23, 2024

PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model

The paper presents PeriodGrad, a neural vocoder based on a denoising diffusion probabilistic model (DDPM) that takes explicit periodic signals as auxiliary conditioning. According to the authors, the model generates high-quality waveforms and offers improved pitch control, which matters most for applications such as speech and singing voice synthesis. The study reports that PeriodGrad outperforms conventional DDPM-based neural vocoders in both sound quality and pitch controllability.
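To make the idea of an "explicit periodic signal" more concrete, here is a minimal sketch of how such a signal might be built from a fundamental frequency (F0) contour. It assumes a plain sine wave, a 24 kHz sample rate, and a 120-sample hop; the function name `sine_from_f0` and all parameter values are illustrative assumptions and are not taken from the paper, which may construct its conditioning signal differently.

```python
import numpy as np

def sine_from_f0(f0_hz, sample_rate=24000, hop_length=120):
    """Build a sample-level sine signal from a frame-level F0 contour.

    Hypothetical helper for illustration only. Unvoiced frames (F0 == 0)
    produce silence.
    """
    f0_hz = np.asarray(f0_hz, dtype=np.float64)
    # Hold each frame's F0 for hop_length samples (no interpolation).
    f0_samples = np.repeat(f0_hz, hop_length)
    # Accumulate instantaneous frequency into phase so that the waveform
    # stays continuous when the pitch changes between frames.
    phase = 2.0 * np.pi * np.cumsum(f0_samples / sample_rate)
    voiced = f0_samples > 0
    return np.where(voiced, np.sin(phase), 0.0)

# One unvoiced frame, two frames at 220 Hz, one frame at 440 Hz.
f0 = [0.0, 220.0, 220.0, 440.0]
signal = sine_from_f0(f0)
```

In a setup like this, scaling the F0 contour before building the signal (for example, multiplying it by 2 to shift up an octave) is one way a vocoder conditioned on the signal could be steered toward a different pitch.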

Publication date: 23 Feb 2024
Project Page: Not provided
Paper: https://arxiv.org/pdf/2402.14692

root