Semantic understanding plays a crucial role in dense Simultaneous Localization and Mapping (SLAM), enabling comprehensive scene interpretation. Recent work integrating Gaussian Splatting into SLAM systems has demonstrated its effectiveness in producing high-quality renderings from explicit 3D Gaussian representations. Building on this progress, we propose SGS-SLAM, the first semantic dense visual SLAM system grounded in 3D Gaussians, which provides precise 3D semantic segmentation alongside high-fidelity reconstructions. Specifically, we employ multi-channel optimization during the mapping process, integrating appearance, geometric, and semantic constraints with key-frame optimization to enhance reconstruction quality. Extensive experiments demonstrate that SGS-SLAM delivers state-of-the-art performance in camera pose estimation, map reconstruction, and semantic segmentation, outperforming existing methods while preserving real-time rendering capability.
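To make the idea of multi-channel optimization concrete, the following is a minimal sketch of how appearance, geometric, and semantic constraints might be combined into one mapping objective. It is an illustration under stated assumptions, not the SGS-SLAM implementation: the per-channel L1 error, the channel names, the weights, and the functions `mean_abs_diff` and `multi_channel_loss` are all hypothetical choices made for this example.

```python
from typing import Mapping, Sequence


def mean_abs_diff(rendered: Sequence[float], observed: Sequence[float]) -> float:
    """Mean absolute (L1) error between two equally sized flat buffers."""
    if len(rendered) != len(observed):
        raise ValueError("buffers must have the same length")
    if not rendered:
        raise ValueError("buffers must be non-empty")
    return sum(abs(r - o) for r, o in zip(rendered, observed)) / len(rendered)


def multi_channel_loss(
    rendered: Mapping[str, Sequence[float]],
    observed: Mapping[str, Sequence[float]],
    weights: Mapping[str, float],
) -> float:
    """Weighted sum of per-channel L1 errors.

    Assumption: every channel (appearance, geometry, semantics) is scored
    with the same L1 error and the weights are hand-picked. A real system
    may use different error terms per channel.
    """
    return sum(
        weights[name] * mean_abs_diff(rendered[name], observed[name])
        for name in weights
    )


# Toy two-pixel example with hypothetical values and weights.
rendered = {"color": [0.2, 0.4], "depth": [1.0, 2.0], "semantic": [0.0, 1.0]}
observed = {"color": [0.0, 0.4], "depth": [1.5, 2.0], "semantic": [0.0, 0.0]}
weights = {"color": 1.0, "depth": 0.5, "semantic": 0.1}

loss = multi_channel_loss(rendered, observed, weights)
print(f"combined loss: {loss:.3f}")
```

In this sketch a single scalar drives the update of the map, so an error in any one channel (for example, a wrong semantic label on an otherwise well-reconstructed surface) still contributes to the objective being minimized.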