SpiRit-LM: Interleaved Spoken and Written Language Model
The article presents SPIRIT-LM, a foundation multimodal language model that freely mixes text and speech. The model builds on a pretrained text language model, extending its capabilities to the speech…