The article presents DIALIGHT, a toolkit for developing and evaluating multilingual Task-Oriented Dialogue (TOD) systems. It is designed to enable systematic evaluation and comparison of TOD systems built by fine-tuning Pretrained Language Models (PLMs) and systems based on Large Language Models (LLMs). The toolkit provides a user-friendly web interface and a microservice-based backend for efficiency and scalability. The evaluations show that PLM fine-tuning yields higher accuracy, while LLM-based systems produce more diverse responses. However, LLMs struggle to follow task-specific instructions and to generate outputs in multiple languages, pointing to open directions for future research.
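To make the accuracy-versus-diversity contrast concrete, below is a minimal, self-contained sketch of how such a comparison could be scored. Slot accuracy and distinct-1 are standard metrics in the dialogue literature, but the function names, data structures, and example outputs here are hypothetical illustrations and do not reflect DIALIGHT's actual API or evaluation protocol.

```python
# Hypothetical illustration only: the metrics (slot accuracy, distinct-1)
# are common in TOD research, but the functions, data, and outputs below
# are invented for this sketch and are not DIALIGHT's actual code.

def slot_accuracy(predicted: dict, gold: dict) -> float:
    """Fraction of gold slot-value pairs that the system predicted exactly."""
    if not gold:
        return 1.0
    correct = sum(1 for slot, value in gold.items() if predicted.get(slot) == value)
    return correct / len(gold)

def distinct_1(responses: list[str]) -> float:
    """Unique unigrams divided by total unigrams -- a simple diversity proxy."""
    tokens = [tok for resp in responses for tok in resp.lower().split()]
    return len(set(tokens)) / len(tokens) if tokens else 0.0

# Two invented dialogue turns with gold belief states (restaurant domain).
gold_states = [
    {"area": "centre", "food": "italian"},
    {"area": "north", "pricerange": "cheap"},
]

# Invented outputs: a fine-tuned PLM tends to copy slot values exactly,
# while an LLM paraphrases more freely (and may miss a slot).
plm_outputs = [
    ({"area": "centre", "food": "italian"}, "there are 4 italian restaurants in the centre ."),
    ({"area": "north", "pricerange": "cheap"}, "there are 2 cheap restaurants in the north ."),
]
llm_outputs = [
    ({"area": "centre", "food": "italian"}, "Sure! A few lovely Italian spots sit right in the city centre."),
    ({"area": "north", "pricerange": None}, "Happy to help: several budget-friendly places up north come to mind."),
]

for name, outputs in [("PLM (fine-tuned)", plm_outputs), ("LLM-based", llm_outputs)]:
    acc = sum(slot_accuracy(pred, gold)
              for (pred, _), gold in zip(outputs, gold_states)) / len(gold_states)
    div = distinct_1([text for _, text in outputs])
    print(f"{name:18s} slot accuracy = {acc:.2f}   distinct-1 = {div:.2f}")
```

Distinct-1 is used here purely as a lightweight proxy for response diversity; the paper's actual evaluation setup may differ.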
Publication date: 5 Jan 2024
Project Page: Not provided
Paper: https://arxiv.org/pdf/2401.02208