The article discusses the application of multi-task learning (MTL) to end-to-end speech translation (ST). It investigates the consistency between the auxiliary tasks in MTL and their effect on the main ST task. The authors propose an improved MTL (IMTL) approach for ST that bridges the gap between the speech and text modalities by mitigating their differences in sequence length and feature representation. Experiments on the MuST-C dataset show that the proposed method attains state-of-the-art results. The article also underscores the need for fine-tuning in MTL to achieve optimal performance.
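To make the MTL setup concrete, here is a minimal sketch of how per-task losses are typically combined into a single training objective, with ST as the primary task and ASR/MT as auxiliary tasks. This is an illustrative assumption only: the task names, weights, and weighting scheme below are hypothetical, and the paper's specific IMTL loss terms and modality-bridging modules are not reproduced here.

```python
def multi_task_loss(losses, weights):
    """Combine per-task scalar losses into one training objective.

    losses  -- dict mapping task name (e.g. "st", "asr", "mt") to its loss value
    weights -- dict mapping task name to its interpolation weight
    (A generic weighted sum; not the paper's exact formulation.)
    """
    return sum(weights[task] * losses[task] for task in losses)


# Example: ST is the primary objective; ASR and MT act as auxiliary
# tasks. All numbers here are illustrative, not from the paper.
total = multi_task_loss(
    {"st": 2.0, "asr": 1.0, "mt": 0.5},
    {"st": 1.0, "asr": 0.3, "mt": 0.3},
)
```

In practice the auxiliary weights are hyperparameters, and a common follow-up step (echoed by the article's point on fine-tuning) is to continue training on the ST objective alone.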
Publication date: 7 Nov 2023
Paper: https://arxiv.org/pdf/2311.03810