The IPO-LDM model is a new method for generating 360° panoramas from narrow field-of-view images. It uses a bi-modal latent diffusion structure that is trained on both RGB and depth panoramic data. Because of this structure, the model can generate high-quality panoramas at inference time even when only ordinary RGB images, with no depth information, are available.

One of the key innovations of the IPO-LDM model is the introduction of progressive camera rotations during each diffusion denoising step. This technique significantly improves the model's ability to keep the left and right edges of the panorama consistent where they wrap around. The results show that IPO-LDM not only outperforms existing methods for RGB panorama outpainting, but can also produce multiple diverse, well-structured results for different types of masks.
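
To make the rotation idea concrete, here is a minimal sketch and not the authors' implementation. It assumes the panorama is stored in equirectangular form, so a camera rotation about the vertical axis is just a circular shift along the width. The names `rotate_yaw`, `toy_denoise_step` and `sample_with_rotation`, the 90° step size, and the smoothing "denoiser" are all hypothetical stand-ins chosen for illustration; the real model would call its trained diffusion network at that point.

```python
import numpy as np


def rotate_yaw(latent, degrees):
    """Rotate an equirectangular map of shape (C, H, W) about the vertical axis.

    The width spans 360 degrees, so a yaw rotation is a circular shift along W.
    """
    width = latent.shape[-1]
    shift = int(round(degrees / 360.0 * width))
    return np.roll(latent, shift, axis=-1)


def toy_denoise_step(latent, known, mask):
    """Hypothetical stand-in for one denoising step (NOT the real network).

    It smooths horizontally with edge padding, so it never looks across the
    left/right seam, and then re-imposes the known (unmasked) pixels.
    """
    padded = np.pad(latent, ((0, 0), (0, 0), (1, 1)), mode="edge")
    smoothed = (padded[..., :-2] + padded[..., 1:-1] + padded[..., 2:]) / 3.0
    return np.where(mask, known, smoothed)


def sample_with_rotation(known, mask, steps=8, degrees_per_step=90.0, seed=0):
    """Run the toy sampler, rotating the panorama after every step.

    Because the seam lands at a different column each step, content on both
    sides of the original seam gets processed as neighbouring pixels.
    The step size of 90 degrees is an assumption made for this sketch.
    """
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal(known.shape)
    width = known.shape[-1]
    shift = int(round(degrees_per_step / 360.0 * width))
    total_shift = 0
    for _ in range(steps):
        latent = toy_denoise_step(latent, known, mask)
        # Rotate the sample and its conditioning together so they stay aligned.
        latent = np.roll(latent, shift, axis=-1)
        known = np.roll(known, shift, axis=-1)
        mask = np.roll(mask, shift, axis=-1)
        total_shift += shift
    # Undo the accumulated rotation to return to the original orientation.
    return np.roll(latent, -total_shift, axis=-1)
```

A call such as `sample_with_rotation(known, mask)` takes an array of known pixels and a boolean mask marking where they are valid, and returns an array of the same shape in the original orientation. Without the `np.roll` calls, the toy denoiser would never mix information across the seam, which is the failure mode the rotations are meant to address.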

Publication date: Jul 6, 2023
Project Page: https://sm0kywu.github.io/ipoldm/
Paper: https://arxiv.org/pdf/2307.03177.pdf