Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition
The study presents Daisy-TTS, a text-to-speech system that simulates a broad spectrum of emotions. It uses a prosody encoder to learn an emotionally separable prosody embedding, which acts as a proxy for…