This paper revisits the distillation of 2D features into 3D networks for autonomous driving. The authors propose a new approach for semantic segmentation that significantly improves on previous 3D distillation methods. They show that distilling into high-capacity 3D networks is crucial for obtaining high-quality 3D features, and that doing so substantially narrows the performance gap between unsupervised distilled 3D features and fully supervised ones. They also show that the resulting distilled representations can be used for open-vocabulary segmentation and background/foreground discovery.
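
To make the idea of 2D-to-3D distillation concrete, here is a minimal sketch of one common form of feature-distillation objective: each 3D point is paired with the image pixel it projects to, and the 3D feature is pushed toward the 2D feature with a cosine-similarity loss. This is an illustration under stated assumptions, not the paper's method. The function names, the array shapes, the precomputed point-to-pixel mapping, and the choice of cosine loss are all hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    """Scale each row of x to (approximately) unit Euclidean norm."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def distillation_loss(point_feats, image_feats, pixel_rc):
    """Mean (1 - cosine similarity) between 3D point features and the
    2D features at the pixels those points project to.

    All inputs are hypothetical stand-ins:
      point_feats: (N, D) features from a 3D student network
      image_feats: (H, W, D) features from a frozen 2D teacher network
      pixel_rc:    (N, 2) integer (row, col) pixel index for each point
    """
    # Gather the teacher feature at each point's projected pixel: (N, D).
    targets = image_feats[pixel_rc[:, 0], pixel_rc[:, 1]]
    # Row-wise cosine similarity between student and teacher features.
    cos = np.sum(l2_normalize(point_feats) * l2_normalize(targets), axis=1)
    return float(np.mean(1.0 - cos))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image_feats = rng.normal(size=(4, 5, 8))   # toy teacher feature map
    pixel_rc = np.array([[0, 0], [3, 4], [2, 1]])
    point_feats = rng.normal(size=(3, 8))      # toy student features
    print(distillation_loss(point_feats, image_feats, pixel_rc))
```

The loss lies in the range 0 to 2 and depends only on feature direction, not magnitude. In a real pipeline the pixel indices would come from projecting lidar points through the camera calibration, and the loss would be minimized with respect to the 3D network's parameters in an automatic-differentiation framework rather than evaluated in NumPy.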

 

Publication date: 26 Oct 2023
Project Page: https://arxiv.org/abs/2310.17504v1
Paper: https://arxiv.org/pdf/2310.17504