This paper proposes an improved version of independent low-rank matrix analysis (ILRMA), a method for blind source separation (BSS) of audio and speech signals. The authors argue that the standard ILRMA algorithm ignores the dependency between spectral coefficients in different frequency bands. To address this, they use the Sinkhorn divergence to optimize the source model parameters. This improves BSS performance, but it also significantly increases both the number of parameters to be estimated and the computational complexity. To keep the cost manageable, the authors use the Kronecker product to decompose the modeling matrix into smaller matrices, which reduces the algorithm’s complexity by an order of magnitude.

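The saving from a Kronecker decomposition can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the sizes `F1` and `F2`, the square factor shapes, and the helper `kron_matvec` are assumptions made only for illustration, since this summary does not give the actual dimensions of the paper's modeling matrix. The sketch shows two general properties: two small factors need far fewer parameters than one dense matrix, and a product with their Kronecker product can be computed without ever forming the large matrix.

```python
import numpy as np

# Hypothetical sizes (not from the paper): F frequency bins, with F = F1 * F2.
F1, F2 = 16, 16
F = F1 * F2

rng = np.random.default_rng(0)
A = rng.standard_normal((F1, F1))  # small factor, assumed square
B = rng.standard_normal((F2, F2))  # small factor, assumed square

# Parameters to estimate: one dense F x F matrix vs. two small factors.
full_params = F * F                # 256 * 256 = 65536
kron_params = F1 * F1 + F2 * F2    # 256 + 256 = 512


def kron_matvec(A, B, x):
    """Return (A kron B) @ x without building the Kronecker product.

    Uses the row-major identity (A kron B) vec(X) = vec(A X B^T),
    where X is x reshaped to (A.shape[1], B.shape[1]).
    """
    n, q = A.shape[1], B.shape[1]
    X = x.reshape(n, q)
    return (A @ X @ B.T).reshape(-1)


x = rng.standard_normal(F)
y_fast = kron_matvec(A, B, x)
y_dense = np.kron(A, B) @ x  # reference: forms the full F x F matrix

print(full_params, kron_params)
print(np.allclose(y_fast, y_dense))
```

With these assumed sizes the dense matrix has 65536 entries while the two factors together have 512, and the factored product costs on the order of F(F1 + F2) multiplications instead of F². How large the gain is in the paper's setting depends on how its modeling matrix is actually factored.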

Publication date: 3 Jan 2024
Project Page: https://arxiv.org/abs/2401.01762v1
Paper: https://arxiv.org/pdf/2401.01762