The article presents a comprehensive benchmark of selective classification frameworks based on deep neural networks. Selective classification allows a model to abstain from making a prediction when the risk of error is high, which improves the reliability and trustworthiness of machine learning models. The researchers evaluated 18 frameworks against several criteria on 44 datasets covering image and tabular data and both binary and multiclass tasks. No single framework came out as a clear winner, suggesting that the best method depends on the user’s objectives.
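
To make the abstention idea concrete, here is a minimal sketch that assumes the simplest possible rule: predict the most probable class, and abstain whenever that class's probability falls below a confidence threshold. This is only an illustration and not necessarily one of the frameworks the paper benchmarks. The function names, the toy probabilities, and the threshold of 0.7 are all hypothetical. The sketch also computes two quantities commonly used to describe such a model: coverage (the fraction of inputs it answers) and selective risk (the error rate on the answered inputs).

```python
def selective_predict(probs, threshold):
    """Return the argmax class per row, or None (abstain) when the
    top class probability is below the threshold."""
    preds = []
    for p in probs:
        top = max(range(len(p)), key=lambda k: p[k])
        preds.append(top if p[top] >= threshold else None)
    return preds


def coverage_and_selective_risk(preds, labels):
    """Coverage is the fraction of accepted inputs; selective risk is the
    error rate among accepted inputs (None if everything was rejected)."""
    accepted = [(yhat, y) for yhat, y in zip(preds, labels) if yhat is not None]
    coverage = len(accepted) / len(labels)
    if not accepted:
        return coverage, None
    errors = sum(1 for yhat, y in accepted if yhat != y)
    return coverage, errors / len(accepted)


# Toy class probabilities for four inputs (made up for illustration).
probs = [[0.95, 0.05], [0.55, 0.45], [0.20, 0.80], [0.51, 0.49]]
labels = [0, 1, 1, 0]

# Threshold 0.0 never abstains; threshold 0.7 rejects the two unsure inputs.
full_cov, full_risk = coverage_and_selective_risk(selective_predict(probs, 0.0), labels)
sel_preds = selective_predict(probs, 0.7)
sel_cov, sel_risk = coverage_and_selective_risk(sel_preds, labels)

print(full_cov, full_risk)
print(sel_preds, sel_cov, sel_risk)
```

On this toy data, raising the threshold trades coverage for a lower error rate on the inputs the model still answers. That trade-off is one reason a single "best" method is hard to name: a user who needs high coverage and a user who needs very low risk may prefer different operating points.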

Publication date: 23 Jan 2024
Project Page: https://arxiv.org/abs/2401.12708
Paper: https://arxiv.org/pdf/2401.12708