The article presents SemiReward, a framework for selecting pseudo labels in semi-supervised learning (SSL). A central challenge in SSL is identifying high-quality pseudo labels while avoiding confirmation bias, in which a model reinforces its own incorrect predictions. Current pseudo-label selection strategies are either pre-defined or complex, and they fail to deliver high-quality labels, fast convergence, and task versatility at the same time. SemiReward instead predicts a reward score for each pseudo label and uses that score to select the high-quality ones. It can be used with mainstream SSL methods. SemiReward is trained online in two stages with a generator model and a subsampling strategy. On 13 standard SSL benchmarks, the authors report significant performance gains and faster convergence.
Publication date: 4 Oct 2023
Project Page: https://arxiv.org/abs/2310.03013
Paper: https://arxiv.org/pdf/2310.03013
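
The selection step described in the summary, keeping only pseudo labels whose predicted reward score is high enough, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `select_by_reward`, the fixed `threshold`, and the toy scores are all assumptions made for illustration, and the reward scores are taken as given rather than produced by a trained rewarder.

```python
from typing import List, Sequence, Tuple


def select_by_reward(
    pseudo_labels: Sequence[int],
    reward_scores: Sequence[float],
    threshold: float = 0.5,  # hypothetical cutoff, not a value from the paper
) -> Tuple[List[int], List[int]]:
    """Keep the pseudo labels whose reward score is at least `threshold`.

    Returns the indices of the kept samples and their pseudo labels.
    """
    if len(pseudo_labels) != len(reward_scores):
        raise ValueError("pseudo_labels and reward_scores must have the same length")

    # A sample survives only if its reward score clears the cutoff.
    kept_indices = [i for i, score in enumerate(reward_scores) if score >= threshold]
    kept_labels = [pseudo_labels[i] for i in kept_indices]
    return kept_indices, kept_labels


# Toy example: five unlabeled samples with made-up pseudo labels and scores.
labels = [3, 1, 4, 1, 5]
scores = [0.92, 0.31, 0.77, 0.08, 0.64]

kept_indices, kept_labels = select_by_reward(labels, scores, threshold=0.6)
print(kept_indices)  # [0, 2, 4]
print(kept_labels)   # [3, 4, 5]
```

In a full SSL training loop, only the kept samples would contribute to the unsupervised loss, which is how a selection step like this is meant to limit confirmation bias.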