This paper introduces SoftEDA, a novel technique that applies soft labels to augmented data. Traditional Easy Data Augmentation (EDA) operations can damage the original meaning of a text, which hurts the model’s performance. To address this, SoftEDA adds noise to the label values of the noisy augmented examples, so that they carry soft labels rather than hard ones. The method was evaluated on seven classification tasks, where it proved effective compared with traditional EDA and an alternative method, AEDA. This is the first study to introduce soft labels into rule-based text data augmentation methods.
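The idea of pairing augmented text with softened labels can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`soft_label`, `random_swap`, `augment_with_soft_labels`), the `smoothing` parameter and its default value, and the choice to spread the removed probability mass evenly over the other classes are all assumptions made for illustration. Only one EDA-style operation (random swap) is shown.

```python
import random


def soft_label(label, num_classes, smoothing):
    """Build a label distribution with 1 - smoothing on the true class.

    Assumption: the remaining mass is split evenly over the other classes.
    Requires num_classes >= 2.
    """
    off_value = smoothing / (num_classes - 1)
    dist = [off_value] * num_classes
    dist[label] = 1.0 - smoothing
    return dist


def random_swap(tokens, rng):
    """EDA-style random swap: exchange two randomly chosen token positions."""
    tokens = list(tokens)
    if len(tokens) >= 2:
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def augment_with_soft_labels(text, label, num_classes,
                             smoothing=0.1, n_aug=2, seed=0):
    """Return (text, label distribution) pairs.

    The original example keeps a hard (one-hot) label; each augmented
    copy gets a softened label (hypothetical default smoothing of 0.1).
    """
    rng = random.Random(seed)
    examples = [(text, soft_label(label, num_classes, 0.0))]
    tokens = text.split()
    for _ in range(n_aug):
        augmented = " ".join(random_swap(tokens, rng))
        examples.append((augmented, soft_label(label, num_classes, smoothing)))
    return examples


if __name__ == "__main__":
    for sentence, dist in augment_with_soft_labels("the movie was great", 1, 2):
        print(sentence, dist)
```

The resulting distributions can be used as targets for a cross-entropy loss that accepts class probabilities, so that a model is penalized less for uncertainty on examples whose meaning may have been altered by augmentation.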
Publication date: 8 Feb 2024
Project Page: https://github.com/c-juhwan/SoftEDA
Paper: https://arxiv.org/pdf/2402.05591