Implicit neural representations and explicit 3D Gaussian Splatting (3D-GS) have recently achieved remarkable progress in novel view synthesis with frame-based cameras (e.g., RGB and RGB-D cameras). Compared to frame-based cameras, the event camera, a novel type of bio-inspired visual sensor, offers advantages in high temporal resolution, high dynamic range, low power consumption, and low latency. However, due to its unique asynchronous and irregular data capturing process, little work has applied neural representations or 3D Gaussian Splatting to event cameras. In this work, we present IncEventGS, an incremental 3D Gaussian Splatting reconstruction algorithm that uses a single event camera. To recover the 3D scene representation incrementally, IncEventGS exploits the tracking-and-mapping paradigm of conventional SLAM pipelines. Given the incoming event stream, the tracker first estimates an initial camera motion based on the previously reconstructed 3D-GS scene representation. The mapper then jointly refines both the 3D scene representation and the camera motion, starting from the initial trajectory estimated by the tracker. Experimental results demonstrate that IncEventGS delivers superior performance compared to prior NeRF-based methods and other related baselines, even though no ground-truth camera poses are available. Furthermore, our method outperforms state-of-the-art event-based visual odometry methods in terms of camera motion estimation.
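As a rough illustration of the tracking-and-mapping loop described above, the sketch below alternates pose-only optimization (tracking) against a frozen map with joint refinement of the map and recent poses (mapping), supervised by an event-based loss. It is a minimal PyTorch sketch under strong assumptions: `render_brightness`, the toy scene parameters, the event loss, and all hyperparameters are hypothetical stand-ins and not the authors' implementation, which would use a differentiable 3D Gaussian Splatting rasterizer.

```python
# Hypothetical sketch of an incremental tracking-and-mapping loop with an
# event-based loss. All names and shapes are illustrative placeholders.
import torch

def render_brightness(gaussians, pose):
    # Placeholder differentiable "renderer": any function of the scene
    # parameters and a 6-DoF pose vector returning an H x W brightness image.
    return (gaussians["features"] @ pose).reshape(8, 8)

def event_loss(gaussians, pose_a, pose_b, event_frame):
    # Events accumulated between two timestamps approximate the brightness
    # change, so we supervise the difference of two rendered views.
    pred = render_brightness(gaussians, pose_b) - render_brightness(gaussians, pose_a)
    return torch.nn.functional.mse_loss(pred, event_frame)

# Toy stand-ins for the 3D Gaussian parameters and the camera trajectory.
gaussians = {"features": torch.randn(64, 6, requires_grad=True)}
poses = [torch.zeros(6, requires_grad=True)]

for event_frame in torch.randn(5, 8, 8):   # incoming event chunks
    # --- Tracking: freeze the map, estimate only the new camera pose. ---
    new_pose = poses[-1].detach().clone().requires_grad_(True)
    opt_t = torch.optim.Adam([new_pose], lr=1e-2)
    for _ in range(50):
        opt_t.zero_grad()
        loss = event_loss({"features": gaussians["features"].detach()},
                          poses[-1].detach(), new_pose, event_frame)
        loss.backward()
        opt_t.step()
    poses.append(new_pose)

    # --- Mapping: jointly refine the Gaussians and the recent poses. ---
    opt_m = torch.optim.Adam([gaussians["features"], *poses[-2:]], lr=1e-3)
    for _ in range(50):
        opt_m.zero_grad()
        loss = event_loss(gaussians, poses[-2], poses[-1], event_frame)
        loss.backward()
        opt_m.step()
```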