This paper presents a deepfake detection technique for distinguishing real images from fake ones. The authors propose using Vision-Language Models (VLMs) with prompt tuning to improve detection accuracy on unseen data. They formulate deepfake detection as a visual question answering problem: soft prompts are tuned for InstructBLIP so that the model answers whether a query image is real or fake. The paper reports significant accuracy improvements from applying prompt tuning to pretrained vision-language models. The approach is also cost-effective, since it requires relatively few trainable parameters.
Publication date: 26 Oct 2023
Project Page: https://github.com/nctu-eva-lab/AntifakePrompt
Paper: https://arxiv.org/pdf/2310.17419
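
To make the prompt tuning idea in the summary concrete, here is a minimal toy sketch in plain Python. It is not the authors' code and does not use InstructBLIP. The 4-value "images", the fixed linear encoder, the pooled dot-product answer head, and every name in it (`encode_image`, `prob_fake`, `train_soft_prompt`, `DIM`, `NUM_PROMPT`) are assumptions made purely for illustration. What it does show is the general mechanism: the pretrained part stays frozen and only a handful of soft-prompt values are updated by gradient descent.

```python
import math
import random

DIM = 4          # embedding width of the toy model (illustrative assumption)
NUM_PROMPT = 2   # number of trainable soft-prompt vectors (illustrative assumption)

def encode_image(pixels):
    """Frozen toy encoder: a fixed linear map that is never updated.
    Hypothetical stand-in for a pretrained vision encoder."""
    a, b, c, d = pixels
    return [a + b, a - b, c + d, c - d]

def prob_fake(prompt, feat):
    """Toy answer head: probability that the answer is 'fake' (label 1).
    The soft prompt is mean-pooled and compared with the frozen image feature."""
    pooled = [sum(vec[i] for vec in prompt) / len(prompt) for i in range(DIM)]
    logit = sum(p * f for p, f in zip(pooled, feat))
    logit = max(-30.0, min(30.0, logit))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-logit))

def make_dataset(n, rng):
    """Synthetic 4-value 'images': label 0 = real (mean -1), label 1 = fake (mean +1)."""
    data = []
    for i in range(n):
        label = i % 2
        center = 1.0 if label else -1.0
        data.append(([rng.gauss(center, 0.5) for _ in range(DIM)], label))
    return data

def train_soft_prompt(data, rng, epochs=5, lr=0.1):
    """Tune only the soft prompt with SGD on cross-entropy; the encoder is untouched."""
    prompt = [[rng.gauss(0.0, 0.01) for _ in range(DIM)] for _ in range(NUM_PROMPT)]
    for _ in range(epochs):
        for pixels, label in data:
            feat = encode_image(pixels)
            err = prob_fake(prompt, feat) - label  # d(cross-entropy) / d(logit)
            for vec in prompt:
                for i in range(DIM):
                    # d(logit) / d(vec[i]) = feat[i] / NUM_PROMPT because of mean pooling
                    vec[i] -= lr * err * feat[i] / NUM_PROMPT
    return prompt

def accuracy(prompt, data):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum((prob_fake(prompt, encode_image(x)) > 0.5) == bool(y) for x, y in data)
    return hits / len(data)

if __name__ == "__main__":
    rng = random.Random(0)
    prompt = train_soft_prompt(make_dataset(200, rng), rng)
    held_out = make_dataset(200, rng)  # samples not seen during tuning
    print("trainable parameters:", NUM_PROMPT * DIM)
    print("held-out accuracy:", accuracy(prompt, held_out))
```

In this sketch the only trainable state is the `NUM_PROMPT * DIM` values in `prompt`, which mirrors the low trainable-parameter cost noted in the summary. A real setup would instead prepend learned prompt embeddings to the question tokens of a frozen vision-language model and read off its answer.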