Most 3D Gaussian Splatting (3D-GS) methods for urban scenes initialize 3D Gaussians directly from 3D LiDAR points, which not only underutilizes the LiDAR data but also overlooks the potential advantages of fusing LiDAR with camera data. In this paper, we propose a novel tightly coupled LiDAR-Camera Gaussian Splatting (TCLC-GS) that fully leverages the combined strengths of both LiDAR and camera sensors, enabling rapid, high-quality 3D reconstruction and novel-view RGB/depth synthesis. TCLC-GS builds a hybrid explicit (colorized 3D mesh) and implicit (hierarchical octree feature) 3D representation from LiDAR-camera data to enrich the properties of the 3D Gaussians used for splatting. The properties of the 3D Gaussians are not only initialized in alignment with the 3D mesh, which provides more complete 3D shape and color information, but are also endowed with broader contextual information through retrieved octree implicit features. During Gaussian Splatting optimization, the 3D mesh provides dense depth information as supervision, which improves training by helping the model learn robust geometry. Comprehensive evaluations on the Waymo Open Dataset and the nuScenes dataset validate our method's state-of-the-art (SOTA) performance. On a single NVIDIA RTX 3090 Ti, our method trains quickly and achieves real-time RGB and depth rendering in urban scenarios at 90 FPS at a resolution of 1920x1280 (Waymo) and 120 FPS at a resolution of 1600x900 (nuScenes).
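To make the dense depth supervision concrete, the following is a minimal sketch of one plausible form it could take: a masked L1 penalty between the depth rendered from the Gaussians and the depth rendered from the mesh. The abstract does not state the loss used by TCLC-GS, so the L1 form, the validity mask, and the names `masked_l1_depth_loss` and `min_valid_depth` are all assumptions made for illustration.

```python
import numpy as np


def masked_l1_depth_loss(rendered_depth, mesh_depth, min_valid_depth=1e-6):
    """Mean absolute depth error over pixels where the mesh provides depth.

    Hypothetical helper: the L1 form and the mask are assumptions, not the
    loss defined by the TCLC-GS authors.
    """
    rendered_depth = np.asarray(rendered_depth, dtype=np.float64)
    mesh_depth = np.asarray(mesh_depth, dtype=np.float64)
    if rendered_depth.shape != mesh_depth.shape:
        raise ValueError("depth maps must have the same shape")

    # Pixels the mesh does not cover are assumed to hold 0, NaN, or inf.
    valid = np.isfinite(mesh_depth) & (mesh_depth > min_valid_depth)
    if not valid.any():
        return 0.0  # no supervised pixels in this view

    return float(np.abs(rendered_depth[valid] - mesh_depth[valid]).mean())


# Toy 2x2 depth maps in meters; the mesh has no depth at pixel (0, 1).
rendered = np.array([[1.0, 2.0], [3.0, 4.0]])
mesh = np.array([[1.5, 0.0], [3.0, 5.0]])
loss = masked_l1_depth_loss(rendered, mesh)
```

In an actual training loop a term like this would be computed on differentiable tensors and added, with some weight, to the photometric loss. NumPy is used here only to keep the sketch self-contained.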