Despite significant strides in conversational AI, existing dialogue dataset collections lack diversity and comprehensiveness, which limits both research and model training. To address this, a team from Salesforce AI and Columbia University introduces DialogStudio, a unified and richly diverse collection of dialogue datasets. It covers open-domain dialogues, task-oriented dialogues, natural language understanding, conversational recommendation, dialogue summarization, and knowledge-grounded dialogues. This breadth supports versatile analysis and the development of adaptable models. The researchers have also trained conversational AI models on DialogStudio and report superior results in both zero-shot and few-shot learning scenarios.

Publication date: 19 July 2023
Project Page: https://github.com/salesforce/DialogStudio
Paper: https://arxiv.org/pdf/2307.10172.pdf