This paper presents a new approach to Automatic Pronunciation Assessment (APA), the task of quantifying a second-language learner’s pronunciation proficiency. Existing models often predict proficiency with a regression loss function such as mean-squared error (MSE). Such models capture proficiency levels but fail to preserve the distinctions between phonemes. The authors propose a phonemic contrast ordinal (PCO) loss for training regression-based APA models. The loss aims to better retain the distinctions between phoneme categories while accounting for the ordinal relationships in the regression output. The model was evaluated on the speechocean762 benchmark dataset and showed promising results.
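To make the contrast between a plain regression loss and a loss that also separates phoneme categories more concrete, here is a minimal Python sketch. It is not the paper's PCO loss, whose exact formulation is not given in this summary. The function names, the centroid-distance term, and the `weight` parameter are all assumptions chosen for illustration, and the ordinal component is left out entirely.

```python
import numpy as np

def mse_loss(pred, target):
    """Plain mean-squared error between predicted and reference scores."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))

def contrast_term(features, phoneme_ids):
    """Hypothetical separation term (an assumption, not the paper's method).

    Returns the negative mean pairwise distance between the feature
    centroids of different phoneme categories, so minimizing it pushes
    the categories apart.
    """
    features = np.asarray(features, dtype=float)
    phoneme_ids = np.asarray(phoneme_ids)
    ids = np.unique(phoneme_ids)
    if len(ids) < 2:
        return 0.0  # nothing to separate with a single category
    # One centroid per phoneme category, shape (num_categories, feature_dim).
    centroids = np.stack([features[phoneme_ids == i].mean(axis=0) for i in ids])
    # Pairwise Euclidean distances between centroids.
    diffs = centroids[:, None, :] - centroids[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    upper = np.triu_indices(len(ids), k=1)  # each unordered pair once
    return float(-dists[upper].mean())

def combined_loss(pred, target, features, phoneme_ids, weight=0.1):
    """Regression loss plus the weighted illustrative separation term."""
    return mse_loss(pred, target) + weight * contrast_term(features, phoneme_ids)
```

Calling `combined_loss` with per-phoneme score predictions, reference scores, phoneme-level feature vectors, and their phoneme labels returns a single scalar. With `weight=0` it reduces to plain MSE, which is the baseline the summary describes.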
Publication date: 4 Oct 2023
Project Page: Not Provided
Paper: https://arxiv.org/pdf/2310.01839