Scene representations using 3D Gaussian primitives have produced excellent results in modeling the appearance of static and dynamic 3D scenes. Many graphics applications, however, demand the ability to manipulate both the appearance and the physical properties of objects. We introduce Feature Splatting, an approach that unifies physics-based dynamic scene synthesis with rich semantics from vision-language foundation models grounded by natural language. Our first contribution is a way to distill high-quality, object-centric vision-language features into 3D Gaussians, which enables semi-automatic scene decomposition using text queries. Our second contribution is a way to synthesize physics-based dynamics from an otherwise static scene using a particle-based simulator, in which material properties are assigned automatically via text queries. We ablate key techniques used in this pipeline to illustrate the challenges and opportunities in using feature-carrying 3D Gaussians as a unified format for appearance, geometry, material properties, and semantics grounded in natural language.
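To make the text-query-based decomposition concrete, below is a minimal, illustrative sketch (not the authors' implementation) of how per-Gaussian distilled vision-language features could be compared against a text embedding to select the Gaussians belonging to a queried object. All names here (`select_gaussians`, `gaussian_features`, `text_embedding`) are hypothetical placeholders, and the thresholded cosine similarity is only one plausible selection rule.

```python
import numpy as np

def select_gaussians(gaussian_features: np.ndarray,
                     text_embedding: np.ndarray,
                     threshold: float = 0.25) -> np.ndarray:
    """Return a boolean mask over Gaussians whose distilled features have
    cosine similarity above `threshold` with the text query embedding."""
    # Normalize both sides so the dot product equals cosine similarity.
    feats = gaussian_features / np.linalg.norm(gaussian_features, axis=1, keepdims=True)
    text = text_embedding / np.linalg.norm(text_embedding)
    similarity = feats @ text              # shape: (num_gaussians,)
    return similarity > threshold

# Example with random stand-in data; a real pipeline would use features
# distilled from a vision-language model (e.g., CLIP-style embeddings).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(10_000, 512))   # per-Gaussian feature vectors
    query = rng.normal(size=(512,))          # embedding of e.g. "the vase"
    mask = select_gaussians(feats, query)
    print(f"selected {mask.sum()} of {len(mask)} Gaussians")
```

The resulting mask could then be used to group Gaussians into objects, for instance before assigning per-object material properties for the particle-based simulation described above.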