Segment (Almost) Nothing: Prompt-Agnostic Adversarial Attacks on Segmentation Models

This paper by Francesco Croce and Matthias Hein of the University of Tübingen and the Tübingen AI Center proposes a new approach to adversarial attacks on image segmentation models. Promptable segmentation models generate masks for an image in response to various prompts. Existing adversarial attacks target the end-to-end task: they alter the mask predicted for one specific image-prompt pair, so a separate attack must be run for every new prompt on the same image. The authors instead propose prompt-agnostic attacks that perturb the image so as to distort its embedding. Because the mask for every prompt is predicted from that embedding, a single perturbation corrupts the masks for a variety of prompts. The study shows that even minor perturbations are often sufficient to drastically modify the masks predicted with different types of prompts.
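To make the idea concrete, here is a minimal sketch of an attack on the embedding rather than on the mask. It is not the authors' implementation: the linear `encode` function stands in for a real image encoder, the squared-distance objective is one plausible choice, and all names and default values (`attack_embedding`, `eps`, `step`, `iters`) are assumptions made for illustration. The perturbation is found by signed-gradient ascent with projection onto an L-infinity ball, starting from a random point because the gradient of this objective is zero at the clean image.

```python
import random


def encode(weights, image):
    """Toy linear stand-in for an image encoder (hypothetical, not a real model)."""
    return [sum(w * p for w, p in zip(row, image)) for row in weights]


def embedding_shift(weights, image, delta):
    """Squared L2 distance between the clean and the perturbed embedding."""
    clean = encode(weights, image)
    adv = encode(weights, [p + d for p, d in zip(image, delta)])
    return sum((a - c) ** 2 for a, c in zip(adv, clean))


def project(image, delta, eps):
    """Keep delta inside the L-inf ball of radius eps and pixels inside [0, 1]."""
    out = []
    for p, d in zip(image, delta):
        d = max(-eps, min(eps, d))
        out.append(min(1.0, max(0.0, p + d)) - p)
    return out


def attack_embedding(weights, image, eps=8 / 255, step=2 / 255, iters=20, seed=0):
    """Signed-gradient ascent on the embedding shift; no prompt is involved."""
    rng = random.Random(seed)
    # Random start: the gradient of the objective vanishes at delta = 0.
    delta = project(image, [rng.uniform(-eps, eps) for _ in image], eps)
    clean = encode(weights, image)
    for _ in range(iters):
        adv = encode(weights, [p + d for p, d in zip(image, delta)])
        diff = [a - c for a, c in zip(adv, clean)]
        # For a linear encoder W, the gradient of ||W(x + delta) - Wx||^2 is 2 W^T diff.
        grad = [
            2 * sum(row[j] * diff[i] for i, row in enumerate(weights))
            for j in range(len(image))
        ]
        stepped = [d + step * ((g > 0) - (g < 0)) for d, g in zip(delta, grad)]
        delta = project(image, stepped, eps)
    return delta


# Illustrative values only: a 4-pixel "image" and a 2-dimensional embedding.
weights = [[0.5, -1.0, 0.25, 0.0], [1.0, 0.5, -0.5, 2.0]]
image = [0.2, 0.5, 0.7, 0.9]
delta = attack_embedding(weights, image)
print(embedding_shift(weights, image, delta))
```

Note that the loop never queries a mask decoder or a prompt. With a real model, the analytic gradient would be replaced by automatic differentiation through the encoder, and the same perturbed image would then be reused for any prompt.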

Publication date: 27 Nov 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2311.14450
