The study focuses on improving the reliability of post-hoc explanation methods in image classification. These methods are used to understand the decision-making processes of deep neural networks, which are often treated as complex ‘black boxes’. The paper proposes a psychometrics-inspired approach that uses Krippendorff’s alpha to quantify the reliability of the benchmarks used to evaluate these methods. It also suggests model-training modifications, such as training on perturbed samples and employing focal loss, to enhance model robustness. The research lays a foundation for more reliable evaluation practices in the post-hoc explanation field.
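To make the psychometrics analogy concrete, the sketch below shows one way such a reliability check could be computed. It assumes the third-party `krippendorff` PyPI package and uses illustrative scores that are not taken from the paper; the mapping of "raters" to evaluation metrics is likewise an assumed reading of the setup, not the authors' exact protocol.

```python
# Minimal sketch: measuring agreement among explanation-evaluation metrics with
# Krippendorff's alpha, using the third-party `krippendorff` package
# (pip install krippendorff). Scores are illustrative, not from the paper.
import numpy as np
import krippendorff

# Rows: independent "raters" (e.g., different evaluation metrics or repeated runs);
# columns: units being rated (e.g., post-hoc explanation methods); values: scores.
scores = np.array([
    [0.72, 0.41, 0.63, 0.55],   # metric / run 1
    [0.70, 0.39, 0.66, 0.52],   # metric / run 2
    [0.68, 0.45, 0.60, 0.57],   # metric / run 3
])

# Interval-level alpha; values near 1 indicate the benchmark rates methods consistently.
alpha = krippendorff.alpha(reliability_data=scores, level_of_measurement="interval")
print(f"Krippendorff's alpha: {alpha:.3f}")
```

Under this framing, a low alpha would suggest the benchmark itself is unreliable, so score differences between explanation methods should be interpreted with caution.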

Publication date: 30 Nov 2023
Project Page: https://arxiv.org/abs/2311.17876v1
Paper: https://arxiv.org/pdf/2311.17876