Feed-forward 3D Gaussian Splatting methods have demonstrated exceptional capability in real-time human novel view synthesis. However, existing approaches are restricted to dense-viewpoint settings, which limits their flexibility in free-viewpoint rendering across a wide range of camera view angle discrepancies. To address this limitation, we propose EVA-Gaussian, a real-time pipeline for 3D human novel view synthesis across diverse camera settings. Specifically, we first introduce an Efficient cross-View Attention (EVA) module to accurately estimate the position of each 3D Gaussian from the source images. We then integrate the source images with the estimated Gaussian position map to predict the attributes and feature embeddings of the 3D Gaussians. Moreover, we employ a recurrent feature refiner to correct artifacts caused by geometric errors in position estimation and to enhance visual fidelity. To further improve synthesis quality, we incorporate a powerful anchor loss function for both 3D Gaussian attributes and human face landmarks. Experimental results on the THuman2.0 and THumansit datasets demonstrate the superiority of our EVA-Gaussian approach in rendering quality across diverse camera settings.
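The abstract does not specify how the Efficient cross-View Attention (EVA) module is implemented. As a rough, non-authoritative illustration of the underlying idea of cross-view attention, the sketch below shows one view's feature tokens attending to another view's tokens via scaled dot-product attention; all function and variable names here are hypothetical, not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_view_attention(feat_a, feat_b):
    """Scaled dot-product attention from view A (queries) to view B (keys/values).

    feat_a: (N, D) feature tokens from source view A
    feat_b: (M, D) feature tokens from source view B
    Returns (N, D) view-A features enriched with view-B information,
    plus the (N, M) attention weights.
    """
    d = feat_a.shape[-1]
    scores = feat_a @ feat_b.T / np.sqrt(d)   # (N, M) pairwise similarity
    weights = softmax(scores, axis=-1)        # each row sums to 1 over view-B tokens
    return weights @ feat_b, weights          # weighted aggregation of view-B features

# Toy usage with random per-view token features
rng = np.random.default_rng(0)
tokens_a = rng.standard_normal((16, 32))
tokens_b = rng.standard_normal((24, 32))
fused, attn = cross_view_attention(tokens_a, tokens_b)
```

In a real pipeline the queries, keys, and values would come from learned projections of per-view image features, and the fused features would feed the downstream Gaussian position estimator; this sketch only shows the attention aggregation step itself.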