An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training
The model presented in this paper addresses the challenges of multi-task learning in computer vision. It is built upon a mixture-of-experts (MoE) vision transformer…