We present a novel animatable 3D Gaussian model for rendering high-fidelity free-view human motions in real time. Compared with existing NeRF-based methods, our model is better at synthesizing high-frequency details and avoids jittering across video frames. At the core of our model is a novel augmented 3D Gaussian representation that attaches a learnable code to each Gaussian. The learnable code serves as a pose-dependent appearance embedding for correcting the erroneous appearance caused by the geometric transformation of Gaussians; on top of it, an appearance refinement model is learned to produce residual Gaussian properties that match the appearance in the target pose. To force the Gaussians to learn only the foreground human without background interference, we further design a novel alpha loss that explicitly constrains the Gaussians within the human body. We also propose to jointly optimize the human joint parameters to improve appearance accuracy. The animatable 3D Gaussian model can be learned with shallow MLPs, so new human motions can be synthesized in real time (66 fps on average). Experiments show that our model outperforms NeRF-based methods.
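As a rough illustrative sketch (not the authors' implementation), the augmented representation can be read as standard per-Gaussian properties plus a learnable appearance code, with a shallow MLP mapping the code and the target pose to residual properties. All names, dimensions, and the choice of which properties receive residuals below are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class AugmentedGaussians(nn.Module):
    """Minimal sketch of an augmented 3D Gaussian set: each Gaussian carries
    standard properties plus a learnable code; a shallow MLP predicts
    pose-dependent residual properties. Names/dims are hypothetical."""

    def __init__(self, num_gaussians, code_dim=16, pose_dim=72, hidden=64):
        super().__init__()
        # Standard 3D Gaussian properties (position, rotation, scale, opacity, color).
        self.xyz = nn.Parameter(torch.randn(num_gaussians, 3))
        self.rotation = nn.Parameter(torch.randn(num_gaussians, 4))  # quaternion
        self.scale = nn.Parameter(torch.randn(num_gaussians, 3))
        self.opacity = nn.Parameter(torch.zeros(num_gaussians, 1))
        self.color = nn.Parameter(torch.zeros(num_gaussians, 3))
        # Per-Gaussian learnable appearance code (the "augmentation").
        self.code = nn.Parameter(torch.zeros(num_gaussians, code_dim))
        # Shallow MLP: (code, target pose) -> residual color and opacity.
        self.refine = nn.Sequential(
            nn.Linear(code_dim + pose_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 3 color residuals + 1 opacity residual
        )

    def forward(self, pose):
        # Broadcast the target pose vector to every Gaussian and predict residuals.
        n = self.code.shape[0]
        inp = torch.cat([self.code, pose.expand(n, -1)], dim=-1)
        residual = self.refine(inp)
        color = self.color + residual[:, :3]
        opacity = torch.sigmoid(self.opacity + residual[:, 3:])
        return color, opacity

# Usage sketch: refine 10k Gaussians for one target pose (72-D, SMPL-like).
model = AugmentedGaussians(num_gaussians=10_000)
color, opacity = model(torch.zeros(1, 72))
```

Because only shallow MLPs are evaluated per Gaussian, the refinement step stays cheap enough for the real-time rates the abstract reports; the actual residual targets and rasterization pipeline are detailed in the paper itself.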