The study discusses the potential harms of AI-generated text, particularly factoid (factually incorrect), unfair, and toxic content. The authors propose FFT, a new benchmark of 2,116 instances for evaluating the harmlessness of large language models (LLMs) along three dimensions: factuality, fairness, and toxicity. Using it, they evaluate nine representative LLMs spanning various parameter scales, training stages, and creators. The findings reveal that LLM harmlessness remains unsatisfactory, motivating further research into harmless LLMs.
Publication date: 30 Nov 2023
Project Page: https://arxiv.org/abs/2311.18580v1
Paper: https://arxiv.org/pdf/2311.18580