The article introduces Point-VOS, a novel point-wise annotation scheme and benchmark for Video Object Segmentation (VOS). Traditional VOS benchmarks rely on dense per-object mask annotations, which are time-consuming and costly to produce. Point-VOS instead uses a spatio-temporally sparse point-wise annotation scheme, significantly reducing annotation effort. The authors apply this scheme to two large-scale video datasets to propose the new Point-VOS benchmark. The study shows that existing VOS methods can be adapted to train on point annotations while achieving results close to fully-supervised performance. The data can also be used to improve models that connect vision and language, as the authors demonstrate by evaluating on the Video Narrative Grounding (VNG) task.
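As a rough illustration of what training with point supervision can look like, the sketch below computes a segmentation loss only at annotated pixels rather than over a dense mask. The `point_supervised_loss` function, the tensor shapes, and the `(y, x, label)` point format are hypothetical stand-ins, not the paper's actual pipeline, which may differ (e.g., by generating pseudo-masks from the points first).

```python
import torch
import torch.nn.functional as F

def point_supervised_loss(logits: torch.Tensor, points: list) -> torch.Tensor:
    """Binary cross-entropy evaluated only at annotated points.

    logits: (B, H, W) raw foreground logits from a segmentation model.
    points: per-batch-item lists of (y, x, label) triples, with label 1 for
        object and 0 for background. This format is a hypothetical stand-in
        for the benchmark's actual annotation format.
    """
    losses = []
    for b, pts in enumerate(points):
        for y, x, label in pts:
            target = torch.tensor(float(label), device=logits.device)
            losses.append(
                F.binary_cross_entropy_with_logits(logits[b, y, x], target)
            )
    return torch.stack(losses).mean()


# Example: a 2-frame batch with a handful of point annotations per frame.
logits = torch.randn(2, 64, 64, requires_grad=True)
points = [
    [(10, 12, 1), (40, 50, 0)],              # one object, one background point
    [(5, 5, 1), (30, 20, 0), (60, 60, 0)],
]
loss = point_supervised_loss(logits, points)
loss.backward()  # gradients flow only through the annotated pixels
```

Because supervision touches only a few pixels per frame, the per-point loss above is far cheaper to annotate for than a dense mask loss, at the cost of weaker spatial coverage.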

Publication date: 8 Feb 2024
Project Page: https://pointvos.github.io
Paper: https://arxiv.org/pdf/2402.05917