This technical review covers a new audio coding approach that uses Mel-frequency cepstral coefficient (MFCC) features in a Generative Adversarial Network (GAN) setting. The codec pairs a conventional encoder with an adversarially trained decoder, which helps reconstruct the original waveform more accurately. The resulting MFCC-GAN codec was compared with five well-known codecs and achieved state-of-the-art Signal-to-Noise Ratio (SNR) despite operating at a lower bitrate. For future work, the paper suggests adopting loss functions that optimize intelligibility and perceptual metrics within the MFCC-GAN structure.
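
Since the headline comparison is reported in SNR, a minimal sketch of how that metric is commonly computed may help when reading the results. It assumes the standard global definition (signal energy over reconstruction-error energy, in dB); the summary does not state whether the paper uses this or a segmental variant, and the function name `snr_db` is hypothetical.

```python
import numpy as np


def snr_db(reference, decoded):
    """Global SNR in dB of a decoded waveform against its reference.

    Assumption: the plain energy-ratio definition, not necessarily
    the exact variant used in the paper.
    """
    reference = np.asarray(reference, dtype=np.float64)
    decoded = np.asarray(decoded, dtype=np.float64)
    if reference.shape != decoded.shape:
        raise ValueError("signals must have the same shape")

    # Reconstruction error is treated as the noise term.
    noise_power = np.sum((reference - decoded) ** 2)
    signal_power = np.sum(reference ** 2)

    if noise_power == 0.0:
        return float("inf")  # perfect reconstruction
    return 10.0 * np.log10(signal_power / noise_power)


# Toy check: a 440 Hz tone "decoded" with a 10% amplitude loss.
t = np.arange(16000) / 16000.0
tone = np.sin(2.0 * np.pi * 440.0 * t)
print(round(snr_db(tone, 0.9 * tone), 2))
```

In the toy check the error is one tenth of the signal amplitude, so the error energy is one hundredth of the signal energy, which corresponds to 20 dB. A real evaluation would pass the original waveform and the codec output, time-aligned and at the same sample rate.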

Publication date: 25 Oct 2023
Project Page: N/A
Paper: https://arxiv.org/pdf/2310.14300