The paper examines the use of Large Language Models (LLMs) such as GPT-3 and GPT-4 for content moderation, evaluating their performance on two common tasks: rule-based community moderation and toxic content detection. For rule-based moderation, where models were prompted with the rules of individual Reddit communities, LLMs showed promising results, achieving a median accuracy of 64% and a median precision of 83%. For toxicity detection, they outperformed existing toxicity classifiers. However, increasing model size added only marginal benefit to toxicity detection, suggesting a potential performance plateau.
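
The rule-based setup can be illustrated with a short sketch: the model is shown a community's rules alongside a comment and asked for a binary violation judgment, which can then be scored against human moderator decisions. The prompt wording, model name, and helper function below are illustrative assumptions, not the paper's exact protocol.

```python
# Minimal sketch of rule-based moderation with an LLM: show the model a
# community's rules plus a comment, ask for a YES/NO violation judgment.
# Prompt text and model choice are assumptions for illustration only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def violates_rules(rules: list[str], comment: str, model: str = "gpt-4") -> bool:
    """Return True if the model judges `comment` to violate any of `rules`."""
    rule_text = "\n".join(f"{i + 1}. {r}" for i, r in enumerate(rules))
    prompt = (
        "You are a moderator for an online community with these rules:\n"
        f"{rule_text}\n\n"
        f"Comment: {comment!r}\n\n"
        "Does this comment violate any rule? Answer YES or NO."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep judgments as deterministic as possible
    )
    answer = response.choices[0].message.content.strip().upper()
    return answer.startswith("YES")


# Example: a comment that likely breaks a no-spam rule.
rules = ["Be civil.", "No self-promotion or spam."]
print(violates_rules(rules, "Buy my crypto course at example.com!!!"))
```

Accuracy and precision figures like those reported above would come from running such a judge over comments with known moderation outcomes and comparing its answers to the human moderators' decisions.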

Publication date: 28 Sep 2023
Project Page: https://www.aaai.org/
Paper: https://arxiv.org/pdf/2309.14517