The research explores pre-trained models such as BERT and HuBERT for dimensional speech emotion recognition. These models produce high-dimensional representations, and the pipelines built on them are frequently used for emotion recognition despite their high computational cost. The authors show that lower-dimensional subspaces exist within these pre-trained representation spaces: projecting onto them reduces model complexity while emotion-estimation performance remains close to that of the full-dimensional representations, with no significant regression. They also investigated label uncertainty, demonstrating that incorporating such information improves the model's generalization capacity and robustness.
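The core idea of projecting high-dimensional pre-trained embeddings onto a lower-dimensional subspace can be sketched with a standard PCA-style projection. This is a minimal illustration, not the authors' method: the embedding dimensions, the subspace size, and the synthetic data below are all assumptions chosen for the example (real features would come from a model such as HuBERT).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for pooled pre-trained speech embeddings
# (e.g., 768-dim features); dimensions are assumptions for illustration.
n_utterances, full_dim, reduced_dim = 500, 768, 64

# Simulate embeddings whose variance is concentrated in a low-dim subspace,
# mimicking the paper's observation that such subspaces exist.
basis = rng.normal(size=(reduced_dim, full_dim))
latent = rng.normal(size=(n_utterances, reduced_dim))
X = latent @ basis + 0.01 * rng.normal(size=(n_utterances, full_dim))

# PCA via SVD: center, then project onto the top principal components.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X_reduced = Xc @ Vt[:reduced_dim].T  # shape: (n_utterances, reduced_dim)

# Fraction of total variance the reduced representation retains.
retained = (S[:reduced_dim] ** 2).sum() / (S ** 2).sum()
print(X_reduced.shape, round(float(retained), 3))
```

A downstream emotion regressor would then be trained on `X_reduced` instead of the full-dimensional features, cutting its input size roughly 12x in this sketch while retaining most of the variance.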
Publication date: 29 Dec 2023
Project Page: Unavailable
Paper: https://arxiv.org/pdf/2312.16180