This paper studies Byte Pair Encoding (BPE) for automatic speech recognition (ASR) in Bengali. BPE is an effective subword tokenization method for handling the out-of-vocabulary (OOV) problem in natural language and speech processing tasks. The study searches for the optimal number of BPE tokens for Bengali, a morphologically rich language. In the experiments, approximately 500-1000 tokens give the best OOV performance, and introducing BPE tokenization into Bengali ASR substantially reduces the word error rate.
Publication date: 31 Jan 2024
Project Page: Not provided
Paper: https://arxiv.org/pdf/2401.15532
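
To make the tokenization idea in the summary concrete, here is a minimal pure-Python sketch of how BPE learns merge rules and then splits an unseen word into known subword units. It is an illustration only, not the paper's pipeline: the function names (`learn_bpe`, `merge_pair`, `apply_bpe`), the `</w>` end-of-word marker, the tie-breaking rule, and the toy corpus are all assumptions made for this example. The number of merges plays the role of the token budget that the study tunes (roughly 500-1000 tokens for Bengali).

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of training words."""
    # Each word starts as a sequence of characters plus an end-of-word marker
    # (the "</w>" marker is a convention assumed for this sketch).
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Pick the most frequent pair; ties are broken lexicographically
        # so that the result is deterministic (an assumption of this sketch).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def apply_bpe(word, merges):
    """Tokenize a (possibly unseen) word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    # Toy corpus, purely illustrative.
    corpus = ["low", "low", "lower", "lowest", "newer", "newest"]
    merges = learn_bpe(corpus, num_merges=10)
    # "lowering" never appears in the corpus, yet it still decomposes
    # into symbols the model knows, so it is not out-of-vocabulary.
    print(apply_bpe("lowering", merges))
```

Because any word can fall back to single characters, no input is ever out-of-vocabulary; a larger merge budget only makes frequent words shorter in tokens. Note that `tuple(word)` splits on Unicode code points, which for Bengali separates dependent vowel signs from their consonants; a real system would more likely rely on an established subword toolkit and may segment the script differently.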