ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video
The article introduces ‘Anim-400K’, a large-scale dataset designed to aid in the automated end-to-end dubbing of video…
This article presents a comparative study of the performance of three models: a proposed convolutional neural network…
DiffSHEG offers a solution for speech-driven holistic 3D expression and gesture generation. Unlike previous research that focused…
The article introduces MuTox, a multilingual audio-based toxicity detection system. It’s the first of its kind, with…
The article discusses a framework for training singer identity encoders that can extract representations suitable for singing-related…
This article discusses the development of a noise-robust zero-shot text-to-speech (TTS) method. The method, based on speaker…
The paper discusses a study investigating how a robot’s affective narrative impacts its ability to elicit empathy…
This paper presents a novel model for a robot’s behavioral adaptation in its long-term interaction with humans,…
The article ‘Advancing GUI for Generative AI: Charting the Design Space of Human-AI Interactions through Task Creativity…
This research investigates how decisions to adapt or exchange a solution depend on the structural and surface…