Detect Every Thing with Few Examples
The article presents DE-ViT, a new open-set object detector built on vision-only DINOv2 backbones. Instead of relying on language, DE-ViT learns new categories from example images. The authors transform multi-class classification tasks…