The article introduces PBSCSR, a dataset for studying composer style recognition in piano sheet music. It contains 40,000 bootleg score images of size 62×64 for a 9-way classification task, 100,000 bootleg score images of size 62×64 for a 100-way classification task, and 29,310 unlabeled variable-length bootleg score images for pretraining. The authors envision a range of research tasks that the dataset could support, including few-shot and zero-shot variants of composer style recognition. They also provide code and baseline results for future work to compare against.
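To make the few-shot setting concrete, here is a minimal sketch of how one N-way K-shot episode might be drawn from labeled 62×64 fragments. It is an illustration under stated assumptions, not the authors' code: the function `sample_episode`, the array names, and the random placeholder data are all hypothetical, and the sketch assumes the images are held in memory as binary NumPy arrays rather than in the dataset's actual file format.

```python
import numpy as np

def sample_episode(images, labels, n_way, k_shot, n_query, rng):
    """Draw one N-way K-shot episode: a support set and a disjoint query set."""
    # Pick n_way distinct classes for this episode.
    classes = rng.choice(np.unique(labels), size=n_way, replace=False)
    support_idx, query_idx = [], []
    for c in classes:
        # Shuffle the indices of class c, then split into support and query.
        idx = rng.permutation(np.flatnonzero(labels == c))
        support_idx.extend(idx[:k_shot])
        query_idx.extend(idx[k_shot:k_shot + n_query])
    support_idx = np.array(support_idx)
    query_idx = np.array(query_idx)
    return (images[support_idx], labels[support_idx],
            images[query_idx], labels[query_idx])

rng = np.random.default_rng(0)

# Placeholder data (NOT real PBSCSR data): 900 random binary 62x64 images,
# 100 per class over 9 classes, mirroring only the image size and class count.
images = rng.integers(0, 2, size=(900, 62, 64), dtype=np.uint8)
labels = np.repeat(np.arange(9), 100)

xs, ys, xq, yq = sample_episode(images, labels, n_way=5, k_shot=3, n_query=10, rng=rng)
print(xs.shape, xq.shape)  # (15, 62, 64) (50, 62, 64)
```

The support set would be used to adapt or condition a classifier and the query set to evaluate it; how the episode classes relate to the 9-way and 100-way label sets is a design choice this sketch leaves open.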
Publication date: 31 Jan 2024
Project Page: https://arxiv.org/abs/2401.16803v1
Paper: https://arxiv.org/pdf/2401.16803