GaussianDiffusion: 3D Gaussian Splatting for Denoising Diffusion Probabilistic Models with Structured Noise
Text-to-3D, known for its efficient generation methods and expansive creative potential, has garnered significant attention in the AIGC domain. However, combining NeRF with 2D diffusion models frequently yields oversaturated images, which, owing to the constraints of pixel-wise rendering, severely limits downstream industrial applications. Gaussian splatting has recently superseded the pointwise sampling technique prevalent in NeRF-based methods, revolutionizing many aspects of 3D reconstruction. This paper introduces a novel text-to-3D content generation framework based on Gaussian splatting that enables fine control over image saturation through the transparency of individual Gaussian spheres, thereby producing more realistic images. The challenge of achieving multi-view consistency in 3D generation significantly increases modeling complexity and impedes accuracy. Taking inspiration from SJC, we explore the use of multi-view noise distributions to perturb the images generated by 3D Gaussian splatting, aiming to rectify inconsistencies in multi-view geometry. We devise an efficient method that produces Gaussian noise for diverse viewpoints, all originating from a shared noise source. Furthermore, vanilla 3D Gaussian-based generation tends to trap the model in local minima, causing artifacts such as floaters, burrs, or proliferative elements. To mitigate these issues, we propose a variational Gaussian splatting technique that improves the quality and stability of the 3D appearance. To our knowledge, our approach is the first comprehensive use of Gaussian splatting across the entire 3D content generation pipeline.
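To make the shared-noise-source idea concrete, the sketch below is a minimal, hypothetical Python/NumPy illustration, not the paper's implementation: it fixes one N(0, 1) sample per Gaussian, splats those samples into any camera view with an isotropic screen-space footprint, and renormalizes by the square root of the sum of squared weights so each rendered pixel stays approximately standard normal. The function name render_shared_noise, the pinhole projection, and the fixed footprint sigma_px are illustrative assumptions; the actual method relies on the full 3D Gaussian splatting rasterizer.

    import numpy as np

    def render_shared_noise(view_matrix, gauss_xyz, gauss_noise, H=64, W=64, sigma_px=1.5):
        """Splat per-Gaussian noise samples into one camera view (illustrative sketch).

        gauss_xyz   : (N, 3) Gaussian centers in world space (assumed).
        gauss_noise : (N,)   one N(0, 1) sample fixed per Gaussian -- the shared noise source.
        Returns an (H, W) noise image that is approximately N(0, 1) wherever Gaussians project.
        """
        # Project centers with a simple pinhole camera (hypothetical stand-in
        # for the paper's 3D Gaussian splatting rasterizer).
        cam = (view_matrix @ np.c_[gauss_xyz, np.ones(len(gauss_xyz))].T).T
        u = (cam[:, 0] / cam[:, 2]) * W / 2 + W / 2
        v = (cam[:, 1] / cam[:, 2]) * H / 2 + H / 2

        num = np.zeros((H, W))   # weighted sum of per-Gaussian noise
        den = np.zeros((H, W))   # sum of squared weights (tracks variance)
        ys, xs = np.mgrid[0:H, 0:W]
        for ui, vi, ni in zip(u, v, gauss_noise):
            # Isotropic screen-space footprint (assumption); real splats are anisotropic.
            w = np.exp(-((xs - ui) ** 2 + (ys - vi) ** 2) / (2 * sigma_px ** 2))
            num += w * ni
            den += w ** 2

        # Dividing by sqrt(sum of w_i^2) keeps the per-pixel variance near 1, so the
        # rendered noise remains a valid diffusion perturbation in every view.
        return num / np.sqrt(den + 1e-8)

    # Usage sketch: the same Gaussians and the same noise samples are splatted into
    # every view, so the perturbations stay geometrically aligned across viewpoints.
    rng = np.random.default_rng(0)
    xyz = rng.uniform(-1, 1, size=(500, 3)) + np.array([0.0, 0.0, 4.0])
    noise = rng.standard_normal(500)
    noise_image = render_shared_noise(np.eye(4), xyz, noise)

Because each Gaussian contributes the same noise value regardless of the camera, the resulting per-view noise fields are mutually consistent, which is the property the multi-view noise distribution is meant to exploit.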