Language-conditioned Detection Transformer
This paper introduces DECOLA, a new open-vocabulary detection framework that uses both image-level labels and detailed detection annotations. The framework works in three steps: training a language-conditioned object detector, pseudo-labeling…