Depression is a major global mental health concern. To address this, the paper presents a method for depression detection that integrates speech signals into Large Language Models (LLMs) using acoustic landmarks. Existing LLMs are limited by their reliance on textual input alone. The proposed approach incorporates acoustic landmarks specific to the pronunciation of spoken words, adding a new dimension to textual transcripts and providing insight into individual speech patterns that may indicate mental states. The approach showed promising results on the DAIC-WOZ dataset and offers a new perspective on enhancing LLMs' ability to process speech signals.
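
As a rough illustration of how landmark information might ride along with a transcript, the sketch below serializes hypothetical per-word landmark symbols (e.g., "g", "b", "s") as bracketed tags appended to each word and assembles a plain-text prompt that an off-the-shelf, text-only LLM could consume. This is a minimal sketch under assumptions: the landmark symbol set, the `WordWithLandmarks` container, the `build_llm_prompt` helper, and the prompt wording are all illustrative, not the paper's implementation.

```python
# Minimal sketch (not the paper's method): serialize per-word acoustic
# landmarks alongside the transcript so a text-only LLM can consume both.
# Landmark symbols ("g" glottal, "b" burst, "s" syllabic, "f" frication)
# and the prompt format below are illustrative assumptions.

from dataclasses import dataclass
from typing import List


@dataclass
class WordWithLandmarks:
    word: str             # transcribed word
    landmarks: List[str]  # landmark symbols detected during this word


def build_llm_prompt(words: List[WordWithLandmarks]) -> str:
    """Interleave each word with its landmark symbols as bracketed tags,
    producing a single text prompt for a standard (text-only) LLM."""
    pieces = []
    for item in words:
        tags = "".join(f"<{lm}>" for lm in item.landmarks)
        pieces.append(f"{item.word}{tags}")
    transcript_with_landmarks = " ".join(pieces)
    return (
        "Interview transcript annotated with acoustic landmarks:\n"
        f"{transcript_with_landmarks}\n"
        "Question: Does the speaker show signs of depression? Answer yes or no."
    )


if __name__ == "__main__":
    example = [
        WordWithLandmarks("I", ["g"]),
        WordWithLandmarks("feel", ["f", "s"]),
        WordWithLandmarks("tired", ["b", "g"]),
    ]
    print(build_llm_prompt(example))
```

Richer integrations (for example, mapping landmark symbols to learned special-token embeddings rather than plain text tags) are conceivable, but a text-level serialization like this keeps the LLM interface unchanged.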

Publication date: 17 Feb 2024
Project Page: N/A
Paper: https://arxiv.org/pdf/2402.13276