The article presents Sketch2NeRF, a multi-view sketch-guided text-to-3D generation framework. Unlike previous text-to-3D approaches, which lack fine-grained control over the generated content, Sketch2NeRF uses multi-view sketches to provide that control. The method leverages pretrained 2D diffusion models to optimize a 3D scene represented as a neural radiance field (NeRF), and the authors propose a synchronized generation-and-reconstruction strategy to drive this optimization. Evaluated on two multi-view sketch datasets, Sketch2NeRF synthesizes 3D content that closely follows the input sketches while remaining faithful to the text prompts, and it achieves state-of-the-art performance in terms of sketch similarity and text alignment.
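To make the synchronized generation-and-reconstruction idea more concrete, here is a minimal, hypothetical Python sketch of such a loop. It is not the authors' implementation: the tiny radiance field, the placeholder renderer, and the stand-in for a sketch-conditioned 2D diffusion model (e.g. a ControlNet-style network) are all simplified assumptions used only to illustrate how 2D generation can supervise 3D reconstruction.

```python
# Minimal, hypothetical sketch of a synchronized generation/reconstruction loop.
# All modules and hyperparameters below are simplified placeholders, not the
# actual Sketch2NeRF implementation.
import torch
import torch.nn as nn

class TinyRadianceField(nn.Module):
    """Toy stand-in for a NeRF: maps 3D points to (rgb, density)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # rgb (3) + density (1)
        )

    def forward(self, points):
        out = self.mlp(points)
        return torch.sigmoid(out[..., :3]), torch.relu(out[..., 3:])

def render_view(field, pose, resolution=32):
    """Placeholder volume renderer: evaluates the field on random points
    offset by the camera pose; a real renderer would cast rays and integrate."""
    points = torch.rand(resolution * resolution, 3) + pose
    rgb, _ = field(points)
    return rgb.view(resolution, resolution, 3)

def sketch_guided_generate(rendered, sketch, prompt):
    """Placeholder for a pretrained sketch-conditioned 2D diffusion model.
    Here we simply blend the render toward the sketch so the example stays
    self-contained and runnable."""
    return 0.9 * rendered.detach() + 0.1 * sketch

field = TinyRadianceField()
optimizer = torch.optim.Adam(field.parameters(), lr=1e-3)
poses = [torch.zeros(3), torch.ones(3) * 0.5]       # toy camera poses
sketches = [torch.rand(32, 32, 3) for _ in poses]   # toy multi-view sketches
prompt = "a chair"

for step in range(100):
    optimizer.zero_grad()
    loss = 0.0
    for pose, sketch in zip(poses, sketches):
        rendered = render_view(field, pose)                        # reconstruction branch (3D -> 2D)
        target = sketch_guided_generate(rendered, sketch, prompt)  # generation branch (sketch-guided 2D)
        loss = loss + ((rendered - target) ** 2).mean()            # fit the NeRF to the generated views
    loss.backward()
    optimizer.step()
```

In this toy setup, each iteration renders the current NeRF from the sketch viewpoints, produces sketch-guided target images, and updates the NeRF to match them, which mirrors the alternation between 2D generation and 3D reconstruction described above.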
Publication date: 25 Jan 2024
Paper: https://arxiv.org/pdf/2401.14257