The article examines how batch size affects pre-training in self-supervised speech representation learning. Training with a range of batch sizes, the authors observe that larger batches yield better pre-trained models, as long as training remains stable and effective. Model quality is found to depend mainly on the total amount of speech seen during training, i.e., the product of batch size and number of iterations, so a smaller batch can in principle be compensated for by running more iterations. These insights can help researchers choose effective operating conditions when studying self-supervised learning for speech.
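Since the key quantity is the product of batch size and number of iterations, it can be made concrete with a small helper that converts a (batch size, iteration count) pair into hours of speech seen. This is a hypothetical sketch, not code from the linked repository; the batch size is assumed to be measured in seconds of audio per optimization step, as is common in wav2vec 2.0-style pre-training, and the numbers below are illustrative only.

```python
# Sketch (assumed setup, not the paper's code): total speech "seen" during
# pre-training, given a batch size in seconds of audio per step and a step count.

def speech_seen_hours(batch_size_seconds: float, num_steps: int) -> float:
    """Total audio processed during pre-training, in hours."""
    return batch_size_seconds * num_steps / 3600.0


if __name__ == "__main__":
    # Two hypothetical schedules that see the same total amount of data:
    # a large batch for fewer steps vs. a small batch for more steps.
    print(speech_seen_hours(batch_size_seconds=1600, num_steps=100_000))  # ~44,444 hours
    print(speech_seen_hours(batch_size_seconds=400, num_steps=400_000))   # ~44,444 hours
```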

Publication date: 23 Feb 2024
Project Page: https://github.com/nikvaessen/w2v2-batch-size
Paper: https://arxiv.org/pdf/2402.13723