The paper presents ReFlow-TTS, a method for high-fidelity text-to-speech (TTS) synthesis. Unlike diffusion-based TTS models, which require many sampling steps, ReFlow-TTS uses an Ordinary Differential Equation (ODE) model to transport a Gaussian distribution to the Mel-spectrogram distribution. This approach enables high-quality speech synthesis with a single sampling step and eliminates the need to train a teacher model. Experimental results show that ReFlow-TTS outperforms other diffusion-based models and is competitive with existing one-step TTS models.
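To make the idea of ODE-based transport concrete, here is a minimal sketch of sampling by Euler integration of dx/dt = v(x, t) from t = 0 (Gaussian noise) to t = 1 (data). It is not the paper's implementation: `euler_sample` and `make_toy_velocity` are hypothetical names, the velocity field is a hand-written stand-in for a trained, text-conditioned network, and the 8-value "Mel frame" is made up for illustration.

```python
import random


def euler_sample(velocity_fn, x0, num_steps=1):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt  # t stays strictly below 1, so (1 - t) is never zero
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_toy_velocity(target):
    """Hypothetical stand-in for a learned velocity network.

    It describes straight-line paths x_t = (1 - t) * x0 + t * target,
    whose velocity at x_t is (target - x_t) / (1 - t).
    """
    def velocity(x, t):
        return [(ti - xi) / (1.0 - t) for xi, ti in zip(x, target)]
    return velocity


# Assumption: a made-up 8-value vector plays the role of one Mel frame.
rng = random.Random(0)
target = [0.1 * k for k in range(8)]
noise = [rng.gauss(0.0, 1.0) for _ in target]  # sample from a standard Gaussian

velocity = make_toy_velocity(target)
one_step = euler_sample(velocity, noise, num_steps=1)
many_steps = euler_sample(velocity, noise, num_steps=50)
```

In this toy setting the paths are exactly straight, so a single Euler step lands on the same point as 50 steps. A trained model only approximates straight paths, so in practice the number of steps trades speed against quality.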

 

Publication date: 4 Oct 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2309.17056