Deformable 3D Gaussian Splatting for Animatable Human Avatars

Recent advances in neural radiance fields enable novel-view synthesis of photo-realistic images in dynamic settings, which can be applied to scenarios with human animation. However, the implicit backbones commonly used to build accurate models require many input views and additional annotations such as human masks, UV maps, and depth maps. In this work, we propose ParDy-Human (Parameterized Dynamic Human Avatar), a fully explicit approach to constructing a digital avatar from as little as a single monocular sequence. ParDy-Human introduces parameter-driven dynamics into 3D Gaussian Splatting, where 3D Gaussians are deformed by a human pose model to animate the avatar. Our method consists of two parts: a first module that deforms canonical 3D Gaussians according to SMPL vertices, and a second module that takes their designed joint encodings and predicts per-Gaussian deformations to capture dynamics beyond SMPL vertex deformations. Images are then synthesized by a rasterizer. ParDy-Human is an explicit model for realistic dynamic human avatars that requires significantly fewer training views and images. Avatar learning is free of additional annotations such as masks, can be trained with variable backgrounds, and infers full-resolution images efficiently even on consumer hardware. We provide experimental evidence that ParDy-Human outperforms state-of-the-art methods on the ZJU-MoCap and THUman4.0 datasets, both quantitatively and visually.
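The two-stage deformation described above can be sketched as a toy NumPy illustration. This is not the authors' implementation: the tensor sizes, the k-nearest-vertex skinning, and the tiny residual MLP are all placeholder assumptions chosen to show the data flow (canonical Gaussians → SMPL-vertex-driven deformation → per-Gaussian residual from a joint encoding):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes -- the real SMPL model has 6890 vertices and 24 joints.
N_GAUSSIANS, N_VERTICES, N_JOINTS = 100, 50, 4

# Canonical 3D Gaussian means (other Gaussian parameters omitted for brevity).
canonical_means = rng.normal(size=(N_GAUSSIANS, 3))

# Stage 1: deform canonical Gaussians by attaching each one to nearby SMPL
# vertices and applying those vertices' pose-driven displacement.
canonical_verts = rng.normal(size=(N_VERTICES, 3))
posed_verts = canonical_verts + 0.1 * rng.normal(size=(N_VERTICES, 3))

def smpl_driven_deform(means, canon_v, posed_v, k=3):
    """Move each Gaussian by the mean displacement of its k nearest vertices."""
    disp = posed_v - canon_v
    out = np.empty_like(means)
    for i, m in enumerate(means):
        dists = np.linalg.norm(canon_v - m, axis=1)
        nn = np.argsort(dists)[:k]
        out[i] = m + disp[nn].mean(axis=0)
    return out

stage1 = smpl_driven_deform(canonical_means, canonical_verts, posed_verts)

# Stage 2: a small MLP takes each deformed Gaussian's position plus a joint
# encoding (here simply flattened per-joint parameters) and predicts a
# residual offset for dynamics beyond the SMPL vertex deformation.
W1 = rng.normal(scale=0.1, size=(3 + N_JOINTS * 3, 16))
W2 = rng.normal(scale=0.1, size=(16, 3))
joint_encoding = rng.normal(size=(N_JOINTS * 3,))

def residual_mlp(x, enc):
    h = np.maximum(np.concatenate([x, enc]) @ W1, 0.0)  # ReLU hidden layer
    return h @ W2

stage2 = stage1 + np.stack([residual_mlp(x, joint_encoding) for x in stage1])
```

The resulting `stage2` means (together with the remaining Gaussian parameters) would then be handed to a differentiable rasterizer for image synthesis; that step is omitted here.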
