
2411.15482.md


SplatFlow: Self-Supervised Dynamic Gaussian Splatting in Neural Motion Flow Field for Autonomous Driving

Most existing Dynamic Gaussian Splatting methods for complex dynamic urban scenarios rely on accurate object-level supervision from expensive manual labeling, which limits their scalability in real-world applications. In this paper, we introduce SplatFlow, a self-supervised Dynamic Gaussian Splatting method built on Neural Motion Flow Fields (NMFF) that learns 4D space-time representations without requiring tracked 3D bounding boxes, enabling accurate dynamic scene reconstruction and novel-view RGB, depth, and flow synthesis. SplatFlow is a unified framework that integrates a time-dependent 4D Gaussian representation within NMFF, where NMFF is a set of implicit functions that model the temporal motion of both LiDAR points and Gaussians as continuous motion flow fields. Leveraging NMFF, SplatFlow decomposes the scene into static background and dynamic objects, representing them with 3D and 4D Gaussian primitives, respectively. NMFF also models the state correspondences of each 4D Gaussian across time, which are used to aggregate temporal features and enhance the cross-view consistency of dynamic components. SplatFlow further improves the identification of dynamic regions by distilling features from 2D foundation models into the 4D space-time representation. Comprehensive evaluations on the Waymo Open Dataset and the KITTI Dataset validate SplatFlow's state-of-the-art (SOTA) performance for both image reconstruction and novel view synthesis in dynamic urban scenarios.
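
As a rough illustration of the idea of a motion flow field as an implicit function of position and time, the sketch below queries a tiny, untrained MLP for a per-point 3D flow, labels points as dynamic when the flow magnitude exceeds a threshold, and warps only those points forward in time. This is not the SplatFlow implementation: the class `TinyFlowField`, the functions `split_static_dynamic` and `warp_points`, the magnitude threshold, and the network shape are all hypothetical choices made for the example, and the abstract does not state how the static/dynamic decomposition is actually computed.

```python
import numpy as np


class TinyFlowField:
    """Hypothetical stand-in for a neural motion flow field: (x, y, z, t) -> 3D flow."""

    def __init__(self, hidden=32, seed=0):
        # Random, untrained weights; a real field would be optimized on data.
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(scale=0.5, size=(4, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(scale=0.5, size=(hidden, 3))
        self.b2 = np.zeros(3)

    def __call__(self, points, t):
        # Append time to each position so the field is continuous in space and time.
        t_col = np.full((points.shape[0], 1), float(t))
        h = np.tanh(np.concatenate([points, t_col], axis=1) @ self.w1 + self.b1)
        return h @ self.w2 + self.b2  # (N, 3) displacement per unit time


def split_static_dynamic(points, flow_fn, t, threshold=0.05):
    """Label a point dynamic when its predicted flow magnitude exceeds the threshold.

    The thresholding rule is an assumption made for this sketch.
    """
    flow = flow_fn(points, t)
    dynamic_mask = np.linalg.norm(flow, axis=1) > threshold
    return dynamic_mask, flow


def warp_points(points, flow, dynamic_mask, dt=1.0):
    """Move only dynamic points along their flow; static points stay in place."""
    warped = points.copy()
    warped[dynamic_mask] += dt * flow[dynamic_mask]
    return warped


if __name__ == "__main__":
    pts = np.random.default_rng(1).uniform(-1.0, 1.0, size=(5, 3))
    field = TinyFlowField()
    mask, flow = split_static_dynamic(pts, field, t=0.0)
    print(mask, warp_points(pts, flow, mask).shape)
```

Because `flow_fn` is any callable taking `(points, t)`, the same two helper functions work equally for LiDAR points or Gaussian centers, which mirrors the abstract's statement that one field models the motion of both.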
