DJCM: A Deep Joint Cascade Model for Singing Voice Separation and Vocal Pitch Estimation

The paper proposes the Deep Joint Cascade Model (DJCM) for two music information retrieval tasks: singing voice separation and vocal pitch estimation. Existing approaches fall into two groups, pipeline methods and naive joint learning methods, and both have limitations. DJCM instead uses a joint cascade structure that trains the two tasks concurrently and aligns their different objectives with task-specific weights. According to the reported experiments, DJCM achieves state-of-the-art performance on both tasks, with significant improvements in Signal-to-Distortion Ratio (SDR) for separation and Overall Accuracy (OA) for pitch estimation. The model’s code is available online.
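To make the two ideas in the summary concrete, here is a minimal sketch of a weighted joint objective and of the basic SDR definition. This is not the authors' implementation: the function names, the weight values, and the plain-list signal representation are all assumptions made for illustration, and the paper's actual loss terms and evaluation tooling may differ.

```python
import math


def sdr_db(reference, estimate, eps=1e-12):
    """Basic Signal-to-Distortion Ratio in dB.

    Computes 10 * log10(||s||^2 / ||s - s_hat||^2) for two equal-length
    sequences of samples. This is the textbook definition, not necessarily
    the exact variant used in the paper's evaluation.
    """
    signal_power = sum(s * s for s in reference)
    error_power = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    # eps keeps the ratio finite for silent or perfectly matched signals.
    return 10.0 * math.log10((signal_power + eps) / (error_power + eps))


def joint_loss(separation_loss, pitch_loss, w_sep=1.0, w_pitch=1.0):
    """Weighted sum of the two task losses.

    w_sep and w_pitch are hypothetical task-specific weights; the values
    used by DJCM are not stated in this summary.
    """
    return w_sep * separation_loss + w_pitch * pitch_loss


if __name__ == "__main__":
    clean = [1.0, 0.0, -1.0, 0.0]
    separated = [0.9, 0.0, -0.9, 0.0]
    print(f"SDR: {sdr_db(clean, separated):.2f} dB")
    print(f"Joint loss: {joint_loss(0.5, 2.0, w_sep=1.0, w_pitch=0.1):.3f}")
```

In a joint setup like this, the weights control how strongly each task's error drives the shared training signal, which is one simple way to balance objectives that live on different scales.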

Publication date: 11 Jan 2024
Project Page: https://github.com/Dream-High/DJCM
Paper: https://arxiv.org/pdf/2401.03856
