My3DGen: Building Lightweight Personalized 3D Generative Model
EgoVLPv2 is a significant improvement over the previous generation of egocentric video-language pre-training (EgoVLP). It incorporates cross-modal fusion directly into the video and language backbones, learning strong video-text representation during…