Novel view synthesis has grown significantly in popularity with the introduction of Neural Radiance Fields (NeRFs) and other implicit scene representation methods. A recent advance, 3D Gaussian Splatting (3DGS), leverages an explicit representation to achieve real-time rendering with high-quality results. However, 3DGS still requires an abundance of training views to generate a coherent scene representation. In few-shot settings, 3DGS, like NeRF, tends to overfit to the training views, causing background collapse and excessive floaters, especially as the number of training views is reduced. We propose a method for training coherent 3DGS-based radiance fields of 360-degree scenes from sparse training views. We find that naive depth priors alone are not sufficient, so we integrate depth priors with generative and explicit constraints to reduce background collapse, remove floaters, and enhance consistency from unseen viewpoints. Experiments show that our method outperforms base 3DGS by up to 30.5% and NeRF-based methods by up to 15.6% in LPIPS on the MipNeRF-360 dataset, with substantially lower training and inference cost.