The article presents ‘FreeTalker’, a framework that generates both spontaneous and non-spontaneous speaker motions, improving the naturalness and controllability of talking avatars. Unlike previous models, which generate only gestures driven by the audio and text of the utterance, FreeTalker also models the speaker's non-speaking motion. It is trained as a diffusion-based model over a unified representation of speech-driven gestures and text-driven motions, and it applies the DoubleTake method to blend adjacent clips into seamless transitions. Experiments show that FreeTalker generates natural and controllable speaker movements.
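
To make the pipeline concrete, below is a minimal, illustrative sketch (not the authors' code) of the two ideas in the summary: sampling each clip from a conditional diffusion model, where the condition is either audio features or a text prompt, and then cross-fading overlapping "handshake" frames between consecutive clips so transitions stay smooth. The function names, handshake length, and motion dimensionality are assumptions, and the paper's actual DoubleTake procedure is more involved than the plain linear cross-fade shown here.

```python
import numpy as np

def sample_clip(condition, num_frames, motion_dim=135, num_steps=50):
    """Placeholder for one reverse-diffusion sampling pass.

    `condition` would be either an audio-feature sequence (speech-driven
    gesture clip) or a text prompt embedding (text-driven motion clip);
    here random motion of the requested length stands in for the output.
    """
    motion = np.random.randn(num_frames, motion_dim)
    for _ in range(num_steps):
        # A real model would denoise `motion` given `condition` at each step.
        pass
    return motion

def blend_clips(clips, handshake=10):
    """DoubleTake-style stitching sketch: overlap the last `handshake`
    frames of one clip with the first `handshake` frames of the next and
    cross-fade them with linear weights."""
    out = clips[0]
    w = np.linspace(1.0, 0.0, handshake)[:, None]  # fade-out weights for the earlier clip
    for nxt in clips[1:]:
        overlap = out[-handshake:] * w + nxt[:handshake] * (1.0 - w)
        out = np.concatenate([out[:-handshake], overlap, nxt[handshake:]], axis=0)
    return out

# Example: a speech-driven clip followed by a text-driven clip.
clips = [sample_clip("audio_features", 120), sample_clip("a person waves", 90)]
full_motion = blend_clips(clips, handshake=10)
print(full_motion.shape)  # (120 + 90 - 10, 135)
```

The point of the sketch is only that a single model can serve both conditioning modes, and that overlapping frames between clips give the blending step something to work with.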

 

Publication date: 9 Jan 2024
Project Page: https://youngseng.github.io/FreeTalker/
Paper: https://arxiv.org/pdf/2401.03476