The paper examines the application of Large Language Models (LLMs) to penetration testing, with a focus on Linux privilege escalation. The authors introduce a novel Linux privilege escalation benchmark and an LLM-driven Linux privilege escalation prototype, and use the benchmark to evaluate the performance of different LLMs. The study aims to identify the strengths and weaknesses of these models in privilege escalation, with the goal of informing their improvement for cybersecurity applications.

Publication date: 19 Oct 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2310.11409