2410.17505.md

PLGS: Robust Panoptic Lifting with 3D Gaussian Splatting

Previous methods rely on Neural Radiance Fields (NeRF) for panoptic lifting, but their training and rendering speeds are unsatisfactory. In contrast, 3D Gaussian Splatting (3DGS) has emerged as a prominent technique thanks to its fast training and rendering. However, unlike NeRF, conventional 3DGS may not satisfy the basic smoothness assumption, because it does not rely on any parameterized structure (e.g., MLPs) for rendering. Consequently, conventional 3DGS is inherently more susceptible to noisy 2D mask supervision. In this paper, we propose PLGS, a method that enables 3DGS to generate consistent panoptic segmentation masks from noisy 2D segmentation masks while remaining more efficient than NeRF-based methods. Specifically, we build a panoptic-aware structured 3D Gaussian model to introduce smoothness, and we design effective noise reduction strategies. For the semantic field, instead of initializing from structure from motion, we construct reliable semantic anchor points to initialize the 3D Gaussians, and we then use these anchor points as a smoothness regularizer during training. Additionally, we present a self-training approach that uses pseudo labels, generated by merging the rendered masks with the noisy masks, to enhance the robustness of PLGS. For the instance field, we project the 2D instance masks into 3D space and match them with oriented bounding boxes to generate cross-view consistent instance masks for supervision. Experiments on various benchmarks demonstrate that our method outperforms previous state-of-the-art methods in both segmentation quality and speed.
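The abstract does not say how rendered and noisy masks are merged into pseudo labels, so the following is only a minimal sketch of one plausible rule, not the PLGS implementation. It assumes per-pixel class probabilities are available from the rendered semantic field and keeps the rendered label wherever its confidence exceeds a threshold, falling back to the noisy 2D label elsewhere. The function name `merge_pseudo_labels`, the parameter `conf_threshold`, and the value 0.9 are all hypothetical.

```python
import numpy as np


def merge_pseudo_labels(rendered_probs, noisy_labels, conf_threshold=0.9):
    """Merge rendered and noisy semantic masks into pseudo labels.

    Hypothetical rule (an assumption, not taken from the paper):
    trust the rendered label where the rendered confidence is high,
    otherwise keep the label from the noisy 2D segmentation mask.

    rendered_probs: (H, W, C) per-pixel class probabilities.
    noisy_labels:   (H, W) integer class ids from a 2D segmenter.
    """
    rendered_labels = rendered_probs.argmax(axis=-1)  # most likely class per pixel
    rendered_conf = rendered_probs.max(axis=-1)       # its probability
    use_rendered = rendered_conf >= conf_threshold    # confident pixels only
    return np.where(use_rendered, rendered_labels, noisy_labels)


# Toy 2x2 image with two classes.
rendered_probs = np.array([
    [[0.95, 0.05], [0.60, 0.40]],
    [[0.08, 0.92], [0.50, 0.50]],
])
noisy_labels = np.array([[1, 1], [0, 0]])

merged = merge_pseudo_labels(rendered_probs, noisy_labels)
print(merged)
```

In this toy input, the two confident pixels take the rendered class and the two uncertain ones keep the noisy label. A real pipeline would likely use a more careful merging criterion and refresh the pseudo labels as training progresses.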
