Accurate 3D tracking in highly deformable scenes with occlusions and shadows can facilitate new applications in robotics, augmented reality, and generative AI. However, tracking under these conditions is extremely challenging due to the ambiguity introduced by large deformations, shadows, and occlusions. We introduce MD-Splatting, an approach for simultaneous 3D tracking and novel view synthesis that uses video captures of a dynamic scene from various camera poses. MD-Splatting builds on recent advances in Gaussian splatting, a method that learns the properties of a large number of Gaussians for fast, state-of-the-art novel view synthesis. MD-Splatting learns a deformation function that projects a set of Gaussians with non-metric (canonical) properties into metric space. The deformation function uses a neural-voxel encoding and a multilayer perceptron (MLP) to infer each Gaussian's position, rotation, and a shadow scalar. We enforce physics-inspired regularization terms based on local rigidity, conservation of momentum, and isometry, which leads to trajectories with smaller errors. MD-Splatting achieves high-quality 3D tracking on highly deformable scenes with shadows and occlusions. Compared to the state of the art, we improve 3D tracking by an average of 23.9% while simultaneously achieving high-quality novel view synthesis. With sufficient texture, such as in scene 6, MD-Splatting achieves a median tracking error of 3.39 mm on a cloth 1 x 1 meter in size.
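To make the deformation function concrete, below is a minimal PyTorch sketch, not the authors' implementation: per-Gaussian features are sampled from a learnable voxel grid and fed, together with time, to an MLP that predicts a position offset, a rotation quaternion, and a shadow scalar. The class name `DeformationField`, the grid resolution, feature size, and layer widths are all illustrative assumptions; a full neural-voxel encoding (e.g., factorized space-time planes) would replace the simple 3D grid used here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformationField(nn.Module):
    """Sketch of a deformation function mapping canonical Gaussians to metric space."""

    def __init__(self, grid_res=32, feat_dim=16, hidden=128):
        super().__init__()
        # Learnable voxel features; a 3D grid stands in for the full space-time encoding.
        self.voxel_grid = nn.Parameter(
            0.01 * torch.randn(1, feat_dim, grid_res, grid_res, grid_res)
        )
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3 + 4 + 1),  # position offset, rotation quaternion, shadow scalar
        )

    def forward(self, xyz_canonical, t):
        # xyz_canonical: (N, 3) canonical Gaussian centers in [-1, 1]; t: (N, 1) normalized time.
        grid_coords = xyz_canonical.view(1, -1, 1, 1, 3)               # (1, N, 1, 1, 3)
        feats = F.grid_sample(self.voxel_grid, grid_coords, align_corners=True)
        feats = feats.reshape(self.voxel_grid.shape[1], -1).t()        # (N, feat_dim)
        out = self.mlp(torch.cat([feats, t], dim=-1))
        delta_xyz = out[:, :3]                                          # metric-space offset
        rotation = F.normalize(out[:, 3:7], dim=-1)                     # unit quaternion
        shadow = torch.sigmoid(out[:, 7:8])                             # shadow scalar in [0, 1]
        return xyz_canonical + delta_xyz, rotation, shadow

# Example usage with hypothetical shapes:
# field = DeformationField()
# xyz = torch.rand(1000, 3) * 2 - 1
# t = torch.full((1000, 1), 0.5)
# pos, rot, shadow = field(xyz, t)
```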
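The physics-inspired regularizers named above can likewise be sketched as simple trajectory losses. The formulations below are simplified stand-ins, not the paper's exact terms: they assume per-Gaussian positions at consecutive time steps and a fixed k-nearest-neighbor index tensor `nn_idx` built in the canonical frame; the rigidity term here is a displacement-similarity proxy, whereas a full version would also account for local rotations.

```python
import torch

def momentum_loss(x_prev, x_curr, x_next):
    # Conservation of momentum as short-term constant velocity:
    # penalize the discrete acceleration of each trajectory.
    return (x_next - 2.0 * x_curr + x_prev).norm(dim=-1).mean()

def isometry_loss(x_curr, x_canon, nn_idx):
    # Isometry: distances to canonical-frame neighbors should be preserved over time.
    d_canon = (x_canon[:, None, :] - x_canon[nn_idx]).norm(dim=-1)   # (N, k)
    d_curr = (x_curr[:, None, :] - x_curr[nn_idx]).norm(dim=-1)      # (N, k)
    return (d_curr - d_canon).abs().mean()

def local_rigidity_loss(x_prev, x_curr, nn_idx):
    # Simplified rigidity proxy: neighboring Gaussians should share similar
    # frame-to-frame displacements.
    disp = x_curr - x_prev                                           # (N, 3)
    return (disp[:, None, :] - disp[nn_idx]).norm(dim=-1).mean()
```

In training, such terms would be added to the photometric rendering loss with small weights, e.g. `loss = loss_render + w_rigid * rigid + w_mom * mom + w_iso * iso`, where the weights are hyperparameters.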