DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis
This paper introduces DurIAN-E, an improved duration-informed attention neural network for expressive, high-quality text-to-speech synthesis. DurIAN-E uses multiple stacked SwishRNN-based Transformer blocks as linguistic encoders and incorporates Style-Adaptive…