Rethinking and Improving Multi-task Learning for End-to-end Speech Translation

The article examines multi-task learning (MTL) in end-to-end speech translation (ST). It investigates how consistent the different tasks in MTL are with one another and how each affects the ST task. The authors propose an improved MTL (IMTL) approach for ST that bridges the gap between the speech and text modalities by mitigating their differences in length and representation. Experiments on the MuST-C dataset show that the proposed method attains state-of-the-art results. The article also underscores the need for fine-tuning in MTL to achieve optimal performance.

Publication date: 7 Nov 2023
Project Page: https://arxiv.org/pdf/2311.03810v1.pdf
Paper: https://arxiv.org/pdf/2311.03810
