The article discusses the use of Denoising Diffusion Probabilistic Models (DDPMs) to generate symbolic music in a desired composer style. The authors propose combining a vector quantized variational autoencoder (VQ-VAE) with a discrete diffusion model. The VQ-VAE represents symbolic music as a sequence of indexes, each pointing to an entry in a learned codebook. A discrete diffusion model is then trained on this discrete latent space. Once trained, it generates intermediate music sequences made up of codebook indexes, which the VQ-VAE's decoder converts back into symbolic music. According to the authors, the model generates symbolic music in the target composer styles with high accuracy.
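The codebook lookup at the heart of this pipeline can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`quantize`, `dequantize`, `corrupt`), the toy codebook, and the uniform-resampling noise step are all assumptions chosen for illustration, and the paper's actual encoder, decoder, and diffusion transition rule may differ.

```python
import numpy as np


def quantize(latents, codebook):
    """Map each latent vector to the index of its nearest codebook entry."""
    # Squared Euclidean distance between every latent and every codebook entry.
    dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)


def dequantize(indexes, codebook):
    """Look up the codebook vectors that a sequence of indexes refers to."""
    return codebook[indexes]


def corrupt(indexes, num_codes, replace_prob, rng):
    """Illustrative forward-noising step (an assumption, not the paper's rule):
    each index is resampled uniformly with probability replace_prob."""
    mask = rng.random(indexes.shape) < replace_prob
    random_codes = rng.integers(0, num_codes, size=indexes.shape)
    return np.where(mask, random_codes, indexes)


# Toy codebook with 8 entries of dimension 4 (hypothetical sizes).
codebook = np.arange(32, dtype=float).reshape(8, 4)

# Pretend encoder outputs: slightly perturbed copies of entries 3, 1 and 5.
latents = codebook[[3, 1, 5]] + 0.1

indexes = quantize(latents, codebook)        # discrete sequence the diffusion model sees
recovered = dequantize(indexes, codebook)    # vectors that would be passed to a decoder

rng = np.random.default_rng(0)
noisy = corrupt(indexes, num_codes=len(codebook), replace_prob=0.5, rng=rng)
```

In this sketch the diffusion model would only ever operate on integer sequences such as `indexes` and `noisy`; mapping them back to music is left to the VQ-VAE decoder, which is not modeled here.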

 

Publication date: 25 Oct 2023
Project Page: URL will be provided here
Paper: https://arxiv.org/pdf/2310.14044