LEGO: Language Enhanced Multi-modal Grounding Model
The LEGO model is a multi-modal model that emphasizes both global and local information across different modalities. Unlike existing models, which focus mainly on global information, the LEGO model can…