3D Gaussian Splatting (3DGS) has shown convincing performance in rendering speed and fidelity, yet the generation of Gaussian Splatting remains a challenge due to its discreteness and unstructural nature. In this work, we propose DiffGS, a general Gaussian generator based on latent diffusion models. DiffGS is a powerful and efficient 3D generative model which is capable of generating Gaussian primitives at arbitrary numbers for high-fidelity rendering with rasterization. The key insight is to disentangledly represent Gaussian Splatting via three novel functions to model Gaussian probabilities, colors and transforms. Through the novel disentangling of 3DGS, we represent the discrete and unstructual 3DGS with continuous Gaussian Splatting functions, where we then train a latent diffusion model with the target of generating these Gaussian Splatting functions both unconditionally and conditionally. Meanwhile, we introduce a discretization algorithm to extract Gaussians at arbitrary numbers from the generated functions via octree-guided sampling and optimization. We explore DiffGS for various tasks, including unconditional generation, conditional generation from text, image, and partial 3DGS, as well as Point-to-Gaussian generation. We believe that DiffGS provides a new direction for flexibly modeling and generating Gaussian Splatting.
Overview of DiffGS. (a) We disentangle the fitted 3DGS into three Gaussian Splatting Functions to model the Gaussian probability, colors and transforms. We then train a Gaussian VAE with a conditional latent diffusion model for generating these functions. (b) During generation, we first extract Gaussian geometry from the generated GauPF, followed by the GauCF and GauTF to obtain the Gaussian attributes.
@inproceedings{DiffGS,
title = {DiffGS: Functional Gaussian Splatting Diffusion},
author = {Zhou, Junsheng and Zhang, Weiqi and Liu, Yu-Shen},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
year = {2024}
}