This study from the Korea Advanced Institute of Science and Technology (KAIST) presents a SlowFast network for Continuous Sign Language Recognition (CSLR). The network uses two pathways that operate at different temporal resolutions, so that spatial information (hand shapes, facial expressions) and dynamic information (movements) are captured separately. The authors introduce two feature fusion methods, Bi-directional Feature Fusion and Pathway Feature Enhancement, which transfer dynamic semantics into the spatial pathway and spatial semantics into the dynamic pathway. The authors report that the proposed framework outperforms prior models on three widely used CSLR datasets: PHOENIX14, PHOENIX14-T, and CSL-Daily.
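
To make the two-pathway idea concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation. It assumes, following the general SlowFast design, that the slow pathway sees every `alpha`-th frame and the fast pathway sees every frame, and it stands in for the paper's learned fusion with simple averaging, repetition, and addition. The names `sample_pathways`, `fast_to_slow`, `slow_to_fast`, `bidirectional_fuse`, and the parameter `alpha` are all hypothetical, and both pathways are assumed to share the same feature dimension for simplicity.

```python
from typing import List, Tuple

Features = List[List[float]]  # one toy feature vector per frame


def sample_pathways(num_frames: int, alpha: int) -> Tuple[List[int], List[int]]:
    """Return frame indices for the slow and fast pathways (assumed sampling scheme)."""
    fast = list(range(num_frames))            # fast pathway: every frame
    slow = list(range(0, num_frames, alpha))  # slow pathway: every alpha-th frame
    return slow, fast


def fast_to_slow(fast_feats: Features, alpha: int) -> Features:
    """Average each group of `alpha` fast frames so the result has the slow length."""
    pooled = []
    for start in range(0, len(fast_feats), alpha):
        group = fast_feats[start:start + alpha]
        dim = len(group[0])
        pooled.append([sum(f[d] for f in group) / len(group) for d in range(dim)])
    return pooled


def slow_to_fast(slow_feats: Features, alpha: int, target_len: int) -> Features:
    """Repeat each slow frame `alpha` times, trimmed to the fast length."""
    upsampled = [list(f) for f in slow_feats for _ in range(alpha)]
    return upsampled[:target_len]


def _add(a: Features, b: Features) -> Features:
    """Element-wise sum of two equally shaped feature sequences."""
    return [[x + y for x, y in zip(fa, fb)] for fa, fb in zip(a, b)]


def bidirectional_fuse(slow_feats: Features, fast_feats: Features,
                       alpha: int) -> Tuple[Features, Features]:
    """Toy two-way fusion: each pathway receives the other's time-aligned features."""
    new_slow = _add(slow_feats, fast_to_slow(fast_feats, alpha))
    new_fast = _add(fast_feats, slow_to_fast(slow_feats, alpha, len(fast_feats)))
    return new_slow, new_fast


if __name__ == "__main__":
    slow_idx, fast_idx = sample_pathways(num_frames=8, alpha=4)
    fast_feats = [[float(i)] for i in fast_idx]         # stand-in for motion features
    slow_feats = [[10.0 * (k + 1)] for k in range(len(slow_idx))]  # stand-in for spatial features
    print(bidirectional_fuse(slow_feats, fast_feats, alpha=4))
```

The point of the sketch is only the temporal alignment: information flowing from the fast to the slow pathway has to be downsampled in time, and information flowing the other way has to be upsampled, before the two can be combined.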

Publication date: 22 Sep 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2309.12304