The article examines the use of Variational Autoencoders (VAEs) to create latent representations of tonal music. The authors trained the VAEs on a corpus of 371 Bach chorales and compared the latent spaces obtained from several encodings of that corpus: piano roll, MIDI, ABC, Tonnetz, DFT of pitch, and pitch class distributions. The ABC encoding performed best at reconstructing the original data, while the pitch DFT captured more information from the latent space. The research contributes to music cognition by providing a pitch space for key relations that align with cognitive distances.
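To make the DFT-based encoding more concrete, here is a minimal sketch of how a pitch class distribution and its discrete Fourier transform might be computed. It is an illustration only, not the authors' pipeline: the function names `pitch_class_distribution` and `pitch_class_dft`, the use of NumPy, and the choice of MIDI note numbers as input are all assumptions made for this example.

```python
import numpy as np


def pitch_class_distribution(midi_pitches):
    """Return a normalized 12-bin histogram of pitch classes (C=0, ..., B=11).

    Hypothetical helper: assumes a non-empty sequence of integer MIDI note numbers.
    """
    pitch_classes = np.asarray(midi_pitches) % 12
    counts = np.bincount(pitch_classes, minlength=12)
    return counts / counts.sum()


def pitch_class_dft(distribution):
    """Return the DFT of a 12-bin distribution.

    For real input of length 12, rfft yields the 7 complex coefficients 0..6.
    """
    return np.fft.rfft(distribution)


# C major triad (C4, E4, G4) and G major triad (G4, B4, D5)
c_major = pitch_class_distribution([60, 64, 67])
g_major = pitch_class_distribution([67, 71, 74])

c_coeffs = pitch_class_dft(c_major)
g_coeffs = pitch_class_dft(g_major)

# Transposition is a circular shift of the distribution, so it changes
# only the phases of the coefficients, not their magnitudes.
print(np.round(np.abs(c_coeffs), 3))
print(np.round(np.abs(g_coeffs), 3))
```

In this sketch the two triads give the same coefficient magnitudes and differ only in phase, which is one reason a Fourier representation is a plausible candidate for modelling relations between keys.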
Publication date: 7 Nov 2023
Project Page: https://arxiv.org/abs/2311.03621v1
Paper: https://arxiv.org/pdf/2311.03621