Skip to content

Latest commit

 

History

History
7 lines (5 loc) · 2.73 KB

2412.09176.md

File metadata and controls

7 lines (5 loc) · 2.73 KB

LIVE-GS: LLM Powers Interactive VR by Enhancing Gaussian Splatting

Recently, radiance field rendering, such as 3D Gaussian Splatting (3DGS), has shown immense potential in VR content creation due to its high-quality rendering and efficient production process. However, existing physics-based interaction systems for 3DGS can only perform simple and non-realistic simulations or demand extensive user input for complex scenes, primarily due to the absence of scene understanding. In this paper, we propose LIVE-GS, a highly realistic interactive VR system powered by LLM. After object-aware GS reconstruction, we prompt GPT-4o to analyze the physical properties of objects in the scene, which are used to guide physical simulations consistent with real phenomena. We also design a GPT-assisted GS inpainting module to fill the unseen area covered by manipulative objects. To perform a precise segmentation of Gaussian kernels, we propose a feature-mask segmentation strategy. To enable rich interaction, we further propose a computationally efficient physical simulation framework through an PBD-based unified interpolation method, supporting various physical forms such as rigid body, soft body, and granular materials. Our experimental results show that with the help of LLM's understanding and enhancement of scenes, our VR system can support complex and realistic interactions without additional manual design and annotation.

近年来,辐射场渲染技术(如3D高斯投影,3D Gaussian Splatting, 3DGS)因其高质量渲染和高效生产流程,在VR内容创作中展现了巨大潜力。然而,现有基于物理的3DGS交互系统要么只能进行简单且不真实的模拟,要么在复杂场景中需要大量用户输入,主要原因在于缺乏场景理解能力。 为解决这一问题,本文提出了 LIVE-GS,一种基于大型语言模型(LLM)的高真实感交互式VR系统。通过对象感知的高斯投影重建后,我们利用 GPT-4o 对场景中物体的物理属性进行分析,这些属性用于指导与真实现象一致的物理模拟。同时,我们设计了一个由 GPT 辅助的高斯场景修补模块(GS Inpainting Module),以填补被交互物体遮挡的未见区域。为精确分割高斯核,我们提出了一种特征掩膜分割策略(Feature-Mask Segmentation Strategy)。 为了实现丰富的交互,我们进一步提出了一种基于 PBD(Position-Based Dynamics)的高效物理模拟框架,通过统一插值方法支持刚体、软体和颗粒材料等多种物理形式。实验结果表明,在 LLM 的场景理解和增强能力的帮助下,我们的VR系统无需额外的手工设计和标注,就能支持复杂且逼真的交互,大幅提升了系统的实用性和表现力。