The article presents MERBench, a unified evaluation benchmark for multimodal emotion recognition. It addresses inconsistencies in feature extractors, evaluation methods, and experimental settings across emotion recognition research. By quantifying the contributions of techniques used in prior work (feature selection, multimodal fusion, robustness analysis, fine-tuning, pre-training, and more), the authors aim to give future researchers clear and comprehensive guidance. They also introduce MER2023, a new emotion dataset focused on the Chinese language environment.
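To make "multimodal fusion" concrete, below is a minimal sketch of one common baseline strategy, late fusion by feature concatenation. This is an illustration only, not the paper's specific method; the encoder names and feature dimensions are hypothetical.

```python
import numpy as np

# Hypothetical unimodal feature vectors; dimensions are illustrative,
# not taken from MERBench.
audio_feat = np.random.rand(512)   # e.g. output of a speech encoder
video_feat = np.random.rand(256)   # e.g. output of a visual encoder
text_feat = np.random.rand(768)    # e.g. output of a text encoder

def concat_fusion(*features: np.ndarray) -> np.ndarray:
    """Late fusion: concatenate per-modality features into one vector,
    which a downstream classifier would consume."""
    return np.concatenate(features)

fused = concat_fusion(audio_feat, video_feat, text_feat)
print(fused.shape)  # (1536,)
```

Benchmarks in this area typically compare such simple concatenation against learned fusion modules (e.g. attention-based fusion) under identical feature extractors, which is exactly the kind of controlled comparison the paper argues has been missing.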
Publication date: 9 Jan 2024
Project Page: Not provided
Paper: https://arxiv.org/pdf/2401.03429