We present GauHuman, a 3D human model with Gaussian Splatting for both fast training (1~2 minutes) and real-time rendering (up to 189 FPS), in contrast to existing NeRF-based implicit representation modelling frameworks that demand hours of training and seconds of rendering per frame. Specifically, GauHuman encodes Gaussian Splatting in the canonical space and transforms 3D Gaussians from canonical space to posed space with linear blend skinning (LBS), in which effective pose and LBS refinement modules are designed to learn fine details of 3D humans at negligible computational cost. Moreover, to enable fast optimization of GauHuman, we initialize and prune 3D Gaussians with a 3D human prior, split/clone them under KL divergence guidance, and introduce a novel merge operation for further speed-up. Extensive experiments on the ZJU_Mocap and MonoCap datasets demonstrate that GauHuman achieves state-of-the-art performance both quantitatively and qualitatively with fast training and real-time rendering speed. Notably, without sacrificing rendering quality, GauHuman can rapidly model a 3D human performer with ~13k 3D Gaussians.
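The canonical-to-posed transform described above can be sketched as follows. This is a minimal illustration of linear blend skinning applied to 3D Gaussian centers; the array shapes, function name, and toy bone transforms are assumptions for exposition, not GauHuman's actual implementation (which also refines the pose and LBS weights).

```python
# Minimal LBS sketch: blend per-bone rigid transforms of canonical-space
# points using skinning weights. Shapes and data are illustrative only.
import numpy as np

def lbs_transform(points, weights, rotations, translations):
    """Map canonical-space points to posed space via LBS.

    points:       (N, 3) canonical 3D Gaussian centers
    weights:      (N, K) per-point skinning weights (rows sum to 1)
    rotations:    (K, 3, 3) per-bone rotation matrices
    translations: (K, 3) per-bone translations
    """
    # Apply every bone's transform to every point: (K, N, 3)
    per_bone = np.einsum('kij,nj->kni', rotations, points) + translations[:, None, :]
    # Blend the per-bone results with the skinning weights: (N, 3)
    return np.einsum('nk,kni->ni', weights, per_bone)

# Toy usage: one point skinned equally to two bones, the second
# of which translates by +1 along y.
pts = np.array([[1.0, 0.0, 0.0]])
w = np.array([[0.5, 0.5]])
R = np.stack([np.eye(3), np.eye(3)])
t = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
posed = lbs_transform(pts, w, R, t)  # -> [[1.0, 0.5, 0.0]]
```

In the full method, each Gaussian's covariance (orientation) would also be rotated by the blended bone rotation, not just its center.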