Summon a Demon and Bind it: A Grounded Theory of LLM Red Teaming in the Wild
This paper examines the novel human activity of attacking large language models (LLMs) to provoke abnormal outputs, a practice known as ‘red teaming’. Interviews with practitioners from various…