The article presents a new approach to video object segmentation (VOS) that uses sketches as references, rather than photo masks or language expressions. The authors argue that sketches offer a more effective and annotation-efficient way to specify the target object. To support this setting, they introduce a benchmark of three datasets: Sketch-DAVIS16, Sketch-DAVIS17, and Sketch-YouTube-VOS. Sketch-based VOS is evaluated using STCN, a popular baseline for semi-supervised VOS. The datasets are available on the project's GitHub page.

Publication date: 13 Nov 2023
Project Page: https://github.com/YRlin-12/Sketch-VOS-datasets
Paper: https://arxiv.org/pdf/2311.07261