Doduo: Learning Dense Visual Correspondence from Unsupervised Semantic-Aware Flow

The article introduces Doduo, a method for learning general dense visual correspondence from in-the-wild images and videos without ground-truth supervision. Dense visual correspondence plays a critical role in robotic perception. Given a pair of images, Doduo estimates a dense flow field that encodes the displacement of each pixel in one image to its corresponding pixel in the other. Because no ground-truth correspondences are available, the method uses flow-based warping to obtain supervisory signals for training. It combines semantic priors with this self-supervised flow training to produce accurate dense correspondence that is robust to dynamic scene changes. Doduo outperforms existing self-supervised correspondence learning baselines on point-level correspondence estimation and has practical applications in robotics.
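
To make the idea of flow-based warping as a training signal concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the bilinear sampling with border clamping, and the mean-absolute-difference loss are all assumptions chosen for illustration, and the semantic priors described in the summary are left out entirely. The sketch only shows the general principle that a flow field can be scored without labels by warping one image toward the other and comparing the result.

```python
import math

def bilinear_sample(image, x, y):
    """Sample a grayscale image (list of rows) at real-valued (x, y).

    Coordinates outside the image are clamped to the border (an assumption).
    """
    h, w = len(image), len(image[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * image[y0][x0] + ax * image[y0][x1]
    bottom = (1 - ax) * image[y1][x0] + ax * image[y1][x1]
    return (1 - ay) * top + ay * bottom

def warp_with_flow(target, flow):
    """Backward-warp `target` into the source frame.

    flow[y][x] = (dx, dy) means source pixel (x, y) corresponds to
    target pixel (x + dx, y + dy).
    """
    return [
        [bilinear_sample(target, x + dx, y + dy) for x, (dx, dy) in enumerate(row)]
        for y, row in enumerate(flow)
    ]

def photometric_loss(source, warped):
    """Mean absolute difference between the source and the warped target."""
    h, w = len(source), len(source[0])
    total = sum(abs(source[y][x] - warped[y][x]) for y in range(h) for x in range(w))
    return total / (h * w)

# Toy data: the target is the source shifted one pixel to the right.
source = [[0.0, 1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0, 7.0],
          [8.0, 9.0, 10.0, 11.0]]
target = [[row[0]] + row[:-1] for row in source]

# A flow of (+1, 0) everywhere describes that shift; a zero flow ignores it.
good_flow = [[(1.0, 0.0)] * 4 for _ in range(3)]
zero_flow = [[(0.0, 0.0)] * 4 for _ in range(3)]

loss_good = photometric_loss(source, warp_with_flow(target, good_flow))
loss_zero = photometric_loss(source, warp_with_flow(target, zero_flow))
print(loss_good, loss_zero)  # the correct flow yields the lower loss
```

In this toy case the correct flow reproduces the source everywhere except the last column, where the shifted content has left the frame and the clamped sample is wrong. A learned model would be trained to minimize a loss of this kind over many image pairs, typically with a differentiable sampler from a tensor library rather than nested lists.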

Publication date: 27 Sep 2023
Project Page: https://ut-austin-rpl.github.io/Doduo/
Paper: https://arxiv.org/pdf/2309.15110
