The paper presents Function as a String Encoded Representation (FASER), a method for cross-architecture binary code similarity search. Such search is useful for malware analysis, software supply chain security, and vulnerability research. FASER combines an intermediate representation of each function with a long-document transformer, which removes the need for manual feature engineering, pre-training, or dynamic analysis. The study reports that FASER outperforms the baseline approaches on both general function search and targeted vulnerability search.
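
To make the search setting concrete, here is a minimal sketch of ranking candidate functions against a query by embedding similarity. It is not the paper's implementation: the `embed` function is a toy hashed bag-of-tokens stand-in for the learned transformer encoder, the IR-like strings are invented for illustration, and the names `embed`, `cosine`, `search`, and `DIM` are all hypothetical.

```python
import math
import zlib
from collections import Counter

DIM = 64  # size of the toy embedding vector (arbitrary, assumed for this sketch)


def embed(ir_text: str) -> list[float]:
    """Toy stand-in for a learned encoder: hashed bag of tokens, L2-normalised."""
    vec = [0.0] * DIM
    for token, count in Counter(ir_text.split()).items():
        # crc32 gives a deterministic bucket for each token across runs
        vec[zlib.crc32(token.encode()) % DIM] += count
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def search(query_ir: str, corpus: dict[str, str]) -> list[tuple[str, float]]:
    """Return (name, score) pairs for the corpus, most similar first."""
    q = embed(query_ir)
    scored = [(name, cosine(q, embed(ir))) for name, ir in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented IR-like strings; a real pipeline would lift these from binaries.
corpus = {
    "memcpy_arm": "load r1 store r0 add r0 1 add r1 1 sub r2 1 jump_nz loop",
    "strlen_x86": "load rdi cmp 0 jump_z end add rax 1 add rdi 1 jump loop",
}
ranked = search("load r1 store r0 add r0 1 add r1 1 sub r2 1 jump_nz loop", corpus)
```

In this sketch `ranked` lists the corpus functions from most to least similar to the query. In the approach the summary describes, a trained long-document transformer would take the place of `embed`, so that functions compiled for different architectures land close together.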

Publication date: 6 Oct 2023
Project Page: https://arxiv.org/abs/2310.03605v1
Paper: https://arxiv.org/pdf/2310.03605