Skip to content

Latest commit

 

History

History
7 lines (5 loc) · 2.83 KB

2411.12789.md

File metadata and controls

7 lines (5 loc) · 2.83 KB

Automated 3D Physical Simulation of Open-world Scene with Gaussian Splatting

Recent advancements in 3D generation models have opened new possibilities for simulating dynamic 3D object movements and customizing behaviors, yet creating this content remains challenging. Current methods often require manual assignment of precise physical properties for simulations or rely on video generation models to predict them, which is computationally intensive. In this paper, we rethink the usage of multi-modal large language model (MLLM) in physics-based simulation, and present Sim Anything, a physics-based approach that endows static 3D objects with interactive dynamics. We begin with detailed scene reconstruction and object-level 3D open-vocabulary segmentation, progressing to multi-view image in-painting. Inspired by human visual reasoning, we propose MLLM-based Physical Property Perception (MLLM-P3) to predict mean physical properties of objects in a zero-shot manner. Based on the mean values and the object's geometry, the Material Property Distribution Prediction model (MPDP) model then estimates the full distribution, reformulating the problem as probability distribution estimation to reduce computational costs. Finally, we simulate objects in an open-world scene with particles sampled via the Physical-Geometric Adaptive Sampling (PGAS) strategy, efficiently capturing complex deformations and significantly reducing computational costs. Extensive experiments and user studies demonstrate our Sim Anything achieves more realistic motion than state-of-the-art methods within 2 minutes on a single GPU.

近期3D生成模型的进展为模拟动态3D对象运动和定制行为提供了新的可能性,但生成此类内容依然具有挑战性。现有方法通常需要手动指定精确的物理属性进行模拟,或者依赖视频生成模型进行预测,这对计算资源要求较高。 本文重新思考了多模态大语言模型(MLLM)在基于物理模拟中的应用,提出了 Sim Anything,一种赋予静态3D对象交互动态的物理模拟方法。我们从详细的场景重建和对象级 3D 开放词汇分割开始,逐步实现多视角图像修补。受人类视觉推理的启发,我们设计了 MLLM-based Physical Property Perception (MLLM-P3),以零样本方式预测对象的平均物理属性。基于平均值和对象几何信息,Material Property Distribution Prediction (MPDP) 模型进一步估计完整分布,将问题重构为概率分布估计,从而显著降低计算成本。 最后,我们利用 Physical-Geometric Adaptive Sampling (PGAS) 策略在开放世界场景中对对象进行模拟,通过采样粒子高效捕捉复杂变形,并显著减少计算成本。大量实验和用户研究表明,Sim Anything 能够在单张 GPU 上于 2 分钟内 生成比现有最先进方法更真实的运动效果。