This research paper by Apple investigates modeling strategies for server-side rescoring of spoken queries in virtual assistants powered by Automatic Speech Recognition (ASR). The study reports a significant reduction in Word Error Rate (WER) when server-side language models are integrated, compared with performing ASR on-device only. It also compares language models trained on domain data against a GPT-3 variant from OpenAI. Finally, the paper discusses fusing multiple server-side language models so that the combined system draws on the strengths of each.
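
To make the idea of server-side rescoring with several fused language models concrete, here is a minimal Python sketch. It assumes a simple weighted sum of log-scores over an n-best list, which is a common generic approach and not necessarily the fusion method used in the paper. The function name `rescore_nbest`, the weights, and the toy scores are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: a "language model" is any function mapping a sentence to a log-score.
LanguageModel = Callable[[str], float]


def rescore_nbest(
    nbest: List[Tuple[str, float]],
    server_lms: Dict[str, LanguageModel],
    weights: Dict[str, float],
    on_device_weight: float = 1.0,
) -> List[Tuple[str, float]]:
    """Re-rank (hypothesis, on-device log-score) pairs.

    Each hypothesis gets the on-device score plus a weighted sum of the
    server-side LM log-scores; the result is sorted best-first.
    """
    rescored = []
    for text, device_score in nbest:
        total = on_device_weight * device_score
        for name, lm in server_lms.items():
            total += weights[name] * lm(text)
        rescored.append((text, total))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Toy data (invented): the on-device recognizer prefers the misspelled entity.
nbest = [
    ("how tall is le bron james", -4.0),
    ("how tall is lebron james", -4.5),
]

# Toy stand-ins for two server-side LMs (e.g. an n-gram LM and a neural LM).
ngram_scores = {"how tall is le bron james": -9.0, "how tall is lebron james": -6.0}
neural_scores = {"how tall is le bron james": -8.0, "how tall is lebron james": -5.0}

server_lms = {
    "ngram": lambda s: ngram_scores.get(s, -20.0),
    "neural": lambda s: neural_scores.get(s, -20.0),
}
weights = {"ngram": 0.3, "neural": 0.5}  # hypothetical fusion weights

ranked = rescore_nbest(nbest, server_lms, weights)
print(ranked[0][0])
```

In this toy setup the server-side scores outweigh the on-device preference, so the correctly spelled entity moves to the top of the list. In practice the weights would be tuned on held-out data.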

Publication date: 3 Nov 2023
Project Page: not provided
Paper: https://arxiv.org/pdf/2311.01398