This study investigates symbolic music generation, specifically piano-roll generation, with a focus on guidance by non-differentiable rules. The researchers propose Stochastic Control Guidance (SCG), a guidance method that requires only forward evaluation of rule functions and works with pre-trained diffusion models in a plug-and-play manner. The study also introduces a latent diffusion architecture for symbolic music generation with high time resolution. The framework is reported to improve both music quality and rule-based controllability, outperforming current state-of-the-art generators in various settings.
Publication date: 23 Feb 2024
Project Page: Not provided
Paper: https://arxiv.org/pdf/2402.14285
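
To make the idea of forward-only rule guidance concrete, here is a minimal sketch in Python. It is not the paper's implementation. It assumes one plausible scheme: at each reverse step, draw several candidate next states, score each with the rule function, and keep the best one. The toy reverse step (`toy_reverse_mean`), the note-density rule, the noise schedule, and all parameter names are hypothetical stand-ins for illustration.

```python
import numpy as np

def note_density(roll, threshold=0.5):
    # Fraction of piano-roll cells counted as "on".
    # The hard threshold makes this rule non-differentiable.
    return float((roll > threshold).mean())

def rule_loss(roll, target_density):
    # Distance between the measured and the requested note density.
    return abs(note_density(roll) - target_density)

def toy_reverse_mean(x, step, num_steps):
    # Hypothetical placeholder for the reverse-step mean that a
    # pre-trained diffusion model would supply. Not a real model.
    return 0.98 * x

def guided_sample(shape, target_density, num_steps=20, num_candidates=8, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)
    for step in range(num_steps):
        mean = toy_reverse_mean(x, step, num_steps)
        sigma = 0.2 * (1.0 - step / num_steps)  # assumed toy noise schedule
        # Draw several candidate next states from the same reverse step.
        noise = rng.standard_normal((num_candidates, *shape))
        candidates = mean + sigma * noise
        # Only forward evaluations of the rule are used; no gradients.
        losses = [rule_loss(c, target_density) for c in candidates]
        x = candidates[int(np.argmin(losses))]
    return x

if __name__ == "__main__":
    roll = guided_sample((16, 32), target_density=0.2)
    print("note density:", note_density(roll))
```

With `num_candidates=1` the loop reduces to plain, unguided sampling from the toy model, which is why a scheme of this kind can be attached to an existing sampler without retraining it.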