This paper introduces a method to enhance Interactive Imitation Learning (IIL) by extracting touch interaction points and tracking object motion from video demonstrations. The approach extends current IIL systems by giving robots detailed knowledge of both where and how to interact with objects, particularly complex articulated ones such as doors and drawers. By leveraging recent techniques such as 3D Gaussian Splatting for scene representation and FoundationPose for object tracking, the method enables robots to better understand and manipulate objects in dynamic environments. This work lays the foundation for more effective task learning and execution in autonomous robotic systems.