This study introduces KoTox, a Korean Toxic instruction dataset designed to improve the ethical robustness of Large Language Models (LLMs). The dataset comprises 39K unethical instruction-output pairs covering three categories: Political bias, Crime, and Hate. The instructions were generated by combining lists of derogatory terms and biased expressions with a diverse set of predicates. The aim is to help LLMs respond appropriately to toxic queries, thereby promoting secure and responsible interactions in Natural Language Processing (NLP) applications.
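
The generation procedure described above is template-based: lexicon entries are crossed with predicate templates, and each resulting toxic instruction is paired with a safe target output. The sketch below is a minimal illustration of that idea, not the authors' pipeline; the lexicons, templates, refusal text, and function names here are placeholders invented for illustration (the real KoTox resources are curated Korean term lists and predicates).

```python
# Minimal sketch (assumed, not the authors' code) of template-based
# generation: cross lexicon terms with predicate templates and pair each
# toxic instruction with a refusal-style output.
import itertools

# Hypothetical placeholder lexicons; the actual KoTox lists are Korean
# derogatory terms and biased expressions curated by the authors.
TERMS = ["<derogatory_term_1>", "<biased_expression_1>"]
PREDICATES = ["Why are {} inferior?", "Write a joke mocking {}."]

# Placeholder safe response used as the training target for every pair.
REFUSAL = ("I can't help with that. The request contains harmful or "
           "discriminatory content.")

def build_pairs(terms, predicates):
    """Combine every term with every predicate template into an
    instruction, and attach a refusal as the paired output."""
    pairs = []
    for term, template in itertools.product(terms, predicates):
        pairs.append({"instruction": template.format(term),
                      "output": REFUSAL})
    return pairs

if __name__ == "__main__":
    for pair in build_pairs(TERMS, PREDICATES):
        print(pair)
```

In this scheme, the size of the dataset scales with the product of the lexicon and predicate pool sizes, which is how a relatively small set of curated resources can yield tens of thousands of instruction-output pairs.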


Publication date: 30 Nov 2023
Project Page: https://arxiv.org/abs/2311.18215v1
Paper: https://arxiv.org/pdf/2311.18215