Modeling dynamic, large-scale urban scenes is challenging due to their highly intricate geometric structures and unconstrained dynamics in both space and time. Prior methods often employ high-level architectural priors that separate static and dynamic elements, resulting in suboptimal capture of their synergistic interactions. To address this challenge, we present a unified representation model, called Periodic Vibration Gaussian (PVG). PVG builds upon the efficient 3D Gaussian splatting technique, originally designed for static scene representation, by introducing periodic vibration-based temporal dynamics. This enables PVG to elegantly and uniformly represent the characteristics of the various objects and elements in dynamic urban scenes. To enhance temporally coherent representation learning with sparse training data, we introduce a novel flow-based temporal smoothing mechanism and a position-aware adaptive control strategy. Extensive experiments on the Waymo Open Dataset and KITTI benchmarks demonstrate that PVG surpasses state-of-the-art alternatives in both reconstruction and novel view synthesis, for dynamic and static scenes alike. Notably, PVG achieves this without relying on manually labeled object bounding boxes or expensive optical flow estimation. Moreover, PVG delivers a 50-fold/6000-fold acceleration in training/rendering over the best alternative.
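The core idea of periodic vibration-based temporal dynamics can be illustrated with a minimal sketch: each Gaussian's center oscillates sinusoidally around a rest position, and its opacity follows a temporal envelope so that the same parameterization covers both static and transient elements. The function names, the 1-D simplification, and the exact form of the opacity envelope below are illustrative assumptions, not the paper's actual parameterization.

```python
import math

def vibrating_center(mu, amplitude, period, t, t0):
    # Hypothetical 1-D sketch: the Gaussian's center vibrates periodically
    # around its rest position mu, peaking in influence at time t0.
    return mu + amplitude * math.sin(2 * math.pi * (t - t0) / period)

def temporal_opacity(peak_opacity, t, t0, lifespan):
    # Hypothetical temporal envelope: opacity is maximal at t0 and decays
    # with a Gaussian falloff. A large lifespan approximates a static
    # element; a small lifespan models a transient, fast-moving one.
    return peak_opacity * math.exp(-0.5 * ((t - t0) / lifespan) ** 2)
```

Under this sketch, a static building would be a Gaussian with near-zero amplitude and a very long lifespan, while a passing car decomposes into Gaussians with short lifespans whose centers sweep along its trajectory, all within one uniform representation.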