The paper examines the use of Large Language Models (LLMs) such as GPT-3 and GPT-4 for content moderation, evaluating their performance on two tasks: rule-based community moderation and toxic content detection. For rule-based moderation, where the model is prompted with a community's rules and asked to judge individual posts, LLMs show promising results, achieving a median accuracy of 64% and precision of 83%. For toxicity detection, LLMs outperform existing toxicity classifiers. However, increasing model size adds only marginal benefit to toxicity detection, suggesting a potential performance plateau.
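To make the rule-based setup concrete, below is a minimal sketch of how one might prompt a chat LLM with community rules to get a moderation decision. This is not the authors' exact prompt or pipeline; the model name, rule text, and REMOVE/APPROVE output format are illustrative assumptions, and it presumes an OpenAI-style chat API with an `OPENAI_API_KEY` set in the environment.

```python
# Sketch of rule-based moderation with an LLM. Placeholder prompt and
# rules; not the paper's exact methodology.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

COMMUNITY_RULES = """\
1. Be respectful; no personal attacks.
2. Stay on topic.
3. No spam or self-promotion.
"""

def moderate(comment: str) -> str:
    """Ask the model whether a comment violates the community rules."""
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder; the paper compares several model sizes
        messages=[
            {"role": "system",
             "content": "You are a content moderator. Given the community "
                        "rules and a comment, answer REMOVE or APPROVE."},
            {"role": "user",
             "content": f"Rules:\n{COMMUNITY_RULES}\nComment: {comment}"},
        ],
        temperature=0,  # deterministic output, useful when evaluating
    )
    return response.choices[0].message.content.strip()

print(moderate("This post is garbage and so are you."))  # expected: REMOVE
```

Comparing such per-community decisions against moderators' actual removals is what yields accuracy and precision figures like those reported in the paper.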
Publication date: 28 Sep 2023
Project Page: https://www.aaai.org/
Paper: https://arxiv.org/pdf/2309.14517