This paper addresses efficient score-based black-box adversarial attacks that achieve a high Attack Success Rate (ASR) and generalize well. The authors propose DifAttack, an attack method that operates over a disentangled feature space, whereas existing methods operate over the entire feature space. DifAttack disentangles an image’s latent feature into an adversarial feature, which dominates the image’s adversarial capability, and a visual feature, which determines its visual appearance. The authors report significant improvements in ASR and query efficiency, especially in targeted-attack and open-set scenarios.
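
The idea of searching only over the adversarial feature can be illustrated with a toy sketch. This is not the authors' implementation: the linear decoder, the logistic "victim" model, the greedy random search, and every name below (`decode`, `black_box_score`, `attack`, `make_toy_problem`) are assumptions made for illustration, and no perturbation budget is enforced. It only shows the general pattern of a score-based attack that queries a black box while holding the visual feature fixed.

```python
import numpy as np

def decode(adv_feat, vis_feat, W_adv, W_vis):
    """Hypothetical linear decoder: fuse both features into an image vector in [0, 1]."""
    return np.clip(W_adv @ adv_feat + W_vis @ vis_feat, 0.0, 1.0)

def black_box_score(image, w_victim):
    """Stand-in victim model: exposes only a scalar score for the true class."""
    return 1.0 / (1.0 + np.exp(-(w_victim @ image)))

def attack(adv_feat, vis_feat, W_adv, W_vis, w_victim,
           max_queries=200, sigma=0.1, seed=0):
    """Greedy random search over the adversarial feature only (an assumed optimizer).

    The visual feature is never modified. Returns the best adversarial feature
    found, its score, and the number of queries spent.
    """
    rng = np.random.default_rng(seed)
    best = adv_feat.copy()
    best_score = black_box_score(decode(best, vis_feat, W_adv, W_vis), w_victim)
    queries = 1
    # Stop once the true-class score drops below 0.5 or the query budget runs out.
    while queries < max_queries and best_score >= 0.5:
        candidate = best + sigma * rng.standard_normal(best.shape)
        score = black_box_score(decode(candidate, vis_feat, W_adv, W_vis), w_victim)
        queries += 1
        if score < best_score:  # keep only candidates that lower the score
            best, best_score = candidate, score
    return best, best_score, queries

def make_toy_problem(seed=0, image_dim=16, feat_dim=4):
    """Build random toy weights and features; purely illustrative data."""
    rng = np.random.default_rng(seed)
    W_adv = rng.standard_normal((image_dim, feat_dim)) * 0.1
    W_vis = rng.standard_normal((image_dim, feat_dim)) * 0.1
    w_victim = rng.standard_normal(image_dim)
    adv_feat = rng.standard_normal(feat_dim)
    vis_feat = rng.standard_normal(feat_dim)
    return adv_feat, vis_feat, W_adv, W_vis, w_victim

adv_feat, vis_feat, W_adv, W_vis, w_victim = make_toy_problem()
new_adv, final_score, used = attack(adv_feat, vis_feat, W_adv, W_vis, w_victim)
print(f"final true-class score: {final_score:.3f} after {used} queries")
```

Because the search only accepts candidates that lower the score, the returned score never exceeds the initial one, and the query count never exceeds `max_queries`. A realistic attack would replace the linear maps with trained networks and add a constraint on how far the decoded image may move from the original.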

Publication date: 28 Sep 2023
Project Page: https://github.com/csjunjun/DifAttack.git
Paper: https://arxiv.org/pdf/2309.14585