In this paper, Mark Levy and colleagues at Apple show how conditional generation from diffusion models can be applied to a range of music-production tasks: continuation, inpainting, and regeneration of musical audio; creating smooth transitions between two music tracks; and transferring stylistic characteristics to existing audio clips. Because conditioning is applied at sampling time to a pretrained model, the approach offers fine-grained control over the musical output and removes the need for paired data during training. The authors argue there is substantial potential for music production built around a diffusion model as a generative prior.
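The sampling-time conditioning idea can be illustrated with a minimal toy sketch, which is an assumption-laden simplification rather than the paper's implementation: at each denoising step, the current clean-signal estimate is nudged by the gradient of a loss measuring agreement with the observed audio (here, a masked reconstruction loss, as in inpainting). The `denoise` stand-in, the deterministic update, and all constants are hypothetical placeholders for a real pretrained diffusion prior and noise schedule.

```python
import numpy as np

def denoise(x, t):
    # Hypothetical stand-in for a pretrained diffusion denoiser's
    # clean-signal prediction; here it simply shrinks toward zero.
    return 0.9 * x

def guided_sample(y, mask, steps=200, guidance_scale=0.8, seed=0):
    """Toy sampling loop with a guidance gradient.

    y:    reference audio (observed values to match)
    mask: 1.0 where y is observed (e.g. the un-erased region
          in inpainting), 0.0 where the model generates freely
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(y.shape)  # start from noise
    for t in range(steps, 0, -1):
        x0_hat = denoise(x, t)
        # Gradient of 0.5 * ||mask * (x0_hat - y)||^2 w.r.t. x0_hat:
        # pulls the observed region of the sample toward y, while
        # the unobserved region is shaped only by the prior.
        grad = mask * (x0_hat - y)
        x = x0_hat - guidance_scale * grad
    return x
```

In this toy, the observed region converges toward the reference while the free region follows the (trivial) prior; in the paper the same principle is applied with a real diffusion model, so no paired training data is needed.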
Publication date: 1 Nov 2023
arXiv abstract: https://arxiv.org/abs/2311.00613
Paper: https://arxiv.org/pdf/2311.00613