The article presents MalDICT, a collection of four benchmark datasets that support distinct, under-represented malware classification tasks. Malware can be classified according to various attributes, and the ability to identify these attributes in newly emerging malware using machine learning could provide significant value to analysts. The tasks considered are classifying malware by its behavior, the platform it runs on, the vulnerability it exploits, and the packer it is packed with. The authors hope that releasing these datasets will encourage further research on these tasks and raise awareness of them.
Publication date: 18 Oct 2023
Project Page: http://ceur-ws.org
Paper: https://arxiv.org/pdf/2310.11706