Recent advances in learned 3D representations have enabled significant progress on complex robotic manipulation tasks, particularly for rigid-body objects. However, manipulating granular materials such as beans, nuts, and rice remains challenging due to the intricate physics of particle interactions, the high-dimensional and partially observable state, the inability to visually track individual particles in a pile, and the computational demands of accurate dynamics prediction. Current deep latent dynamics models often struggle to generalize in granular material manipulation because they lack suitable inductive biases. In this work, we propose a novel approach that learns a visual dynamics model over Gaussian splatting representations of scenes and leverages this model to manipulate granular media via Model-Predictive Control. Our method enables efficient optimization for complex manipulation tasks on piles of granular media. We evaluate our approach in both simulated and real-world settings, demonstrating its ability to solve unseen planning tasks and to generalize to new environments via zero-shot transfer. We also show significant improvements in prediction and manipulation performance over existing granular media manipulation methods.
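To make the planning component concrete, below is a minimal sketch of Model-Predictive Control with a learned dynamics model, using simple random-shooting optimization over action sequences. This is an illustrative sketch under assumptions, not the paper's actual implementation: the names `dynamics_model`, `goal_cost`, and the state representation (e.g., a vector of Gaussian parameters) are hypothetical placeholders.

```python
import numpy as np

def random_shooting_mpc(state, dynamics_model, goal_cost,
                        horizon=10, n_candidates=256, action_dim=3,
                        action_low=-1.0, action_high=1.0):
    """Return the first action of the lowest-cost sampled action sequence.

    state:          current scene representation (hypothetically, e.g.,
                    a flattened array of Gaussian splat parameters)
    dynamics_model: learned predictor f(state, action) -> next_state
    goal_cost:      scalar cost comparing a predicted state to the goal
    """
    # Sample candidate action sequences uniformly at random.
    candidates = np.random.uniform(
        action_low, action_high,
        size=(n_candidates, horizon, action_dim))

    # Roll out each candidate sequence through the learned model
    # and accumulate the goal cost along the predicted trajectory.
    costs = np.zeros(n_candidates)
    for i in range(n_candidates):
        s = state
        for t in range(horizon):
            s = dynamics_model(s, candidates[i, t])
            costs[i] += goal_cost(s)

    # Receding horizon: execute only the first action of the best
    # sequence, then replan from the newly observed state.
    best = int(np.argmin(costs))
    return candidates[best, 0]
```

In a receding-horizon loop, this planner would be called after every environment step with the freshly observed scene state; more sample-efficient optimizers (e.g., the cross-entropy method) are common drop-in replacements for the uniform sampler shown here.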