SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs
The Semantic Pyramid AutoEncoder (SPAE) enables frozen Large Language Models (LLMs) to perform tasks involving non-linguistic modalities, such as images or videos. It does this…