This article addresses code-switching (CS), the mixing of languages within a single sentence, and the challenges it poses for Natural Language Processing (NLP). The study extends existing research on CS speech translation to two previously unexplored areas: streaming settings and translation into a third language. To this end, the authors extend the Fisher and Miami test and validation datasets with new targets in Spanish and German, and train a model for both offline and streaming speech translation. The paper reports baseline results for these two settings, which can serve as a reference point for future work on real-world speech technology applications.

Publication date: 20 Oct 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2310.12648