This article summarizes SPiC·E, a neural network that adds structural guidance to 3D diffusion models. Methods that build 3D assets on top of pretrained text-image diffusion models require a time-consuming optimization procedure for every synthesized sample. By contrast, 3D diffusion models can now train on million-scale 3D datasets and produce high-quality text-conditional 3D samples within seconds. SPiC·E extends these models beyond text-conditional generation by introducing a cross-entity attention mechanism that lets multiple entities, in particular paired input and guidance 3D shapes, interact through their internal representations within the denoising network. The mechanism is used to learn task-specific structural priors from auxiliary guidance shapes. The approach supports applications such as 3D stylization, semantic shape editing, and text-conditional abstraction-to-3D, which transforms primitive-based abstractions into expressive shapes.
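To make the cross-entity attention idea more concrete, here is a minimal NumPy sketch of one generic way two entities could interact inside an attention layer: the latent tokens of the shape being denoised and those of the guidance shape are concatenated, so every token attends to keys and values from both entities. This is an illustrative assumption, not the formulation from the paper; the function name `cross_entity_attention`, the projection matrices, and all sizes are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_entity_attention(x_tokens, g_tokens, w_q, w_k, w_v):
    """Single-head attention over the tokens of two entities (hypothetical sketch).

    x_tokens: (n_x, d) latent tokens of the shape being denoised.
    g_tokens: (n_g, d) latent tokens of the guidance shape.
    w_q, w_k, w_v: (d, d_head) projection matrices shared by both entities.

    Returns the updated tokens of each entity and the attention weights.
    """
    # Assumption: entities interact by sharing one attention context.
    tokens = np.concatenate([x_tokens, g_tokens], axis=0)  # (n_x + n_g, d)
    q = tokens @ w_q
    k = tokens @ w_k
    v = tokens @ w_v

    # Scaled dot-product attention over the joint token set.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    out = weights @ v

    n_x = x_tokens.shape[0]
    return out[:n_x], out[n_x:], weights


# Toy sizes chosen only for illustration.
rng = np.random.default_rng(0)
d, d_head = 8, 4
x = rng.normal(size=(5, d))  # 5 tokens for the input shape
g = rng.normal(size=(3, d))  # 3 tokens for the guidance shape
w_q, w_k, w_v = (rng.normal(size=(d, d_head)) for _ in range(3))

x_out, g_out, weights = cross_entity_attention(x, g, w_q, w_k, w_v)
print(x_out.shape, g_out.shape)
```

In this sketch the block `weights[:5, 5:]` holds the attention that input-shape tokens pay to guidance-shape tokens, which is where structural information from the guidance shape would flow into the sample being denoised.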

Publication date: 29 Nov 2023
Project Page: https://tau-vailab.github.io/spic-e
Paper: https://arxiv.org/pdf/2311.17834