This academic article examines the security risks of unlabeled data in model adaptation, focusing on potential backdoor attacks. The authors design two backdoor triggers and two poisoning strategies that achieve a high attack success rate while maintaining normal performance on clean samples. In response, they propose MIXADAPT, a plug-and-play method designed to defend against backdoor embedding, and demonstrate its effectiveness across various benchmarks and adaptation methods, aiming to shed light on the safety of learning with unlabeled data.
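The summary does not spell out how MIXADAPT operates; below is a minimal sketch of what a plug-and-play, mixup-style defense for adaptation on unlabeled data could look like. The function names, the input-blending mechanism, and the entropy-minimization loss are all illustrative assumptions, not the paper's actual method:

```python
import torch

def mix_batch(x: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Blend each unlabeled sample with a shuffled partner (mixup-style).

    Interpolating inputs can dilute a localized backdoor trigger before the
    adaptation loss ever sees the batch. `alpha` parameterizes the Beta
    distribution from which the mixing coefficient is drawn.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)
    return lam * x + (1.0 - lam) * x[perm]

def adapt_step(model, optimizer, x_unlabeled):
    # Hypothetical adaptation step: only the input batch is transformed,
    # so the defense wraps the adaptation method without modifying it.
    x_mixed = mix_batch(x_unlabeled)
    probs = model(x_mixed).softmax(dim=1)
    # Entropy minimization stands in for whichever self-training
    # objective the underlying adaptation method uses.
    loss = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because only the inputs are transformed, a defense of this shape can be dropped in front of any existing adaptation objective, which is consistent with the "plug-and-play" framing in the summary.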

Publication date: 11 Jan 2024
Project Page: https://arxiv.org/abs/2401.06030v1
Paper: https://arxiv.org/pdf/2401.06030