Given a single image of a target object, image-to-3D generation aims to reconstruct its texture and geometric shape. Recent methods often employ intermediate media, such as multi-view images or videos, to bridge the gap between the input image and the 3D target, thereby guiding the generation of both shape and texture. However, inconsistencies in the generated multi-view snapshots frequently introduce noise and artifacts along object boundaries, undermining the 3D reconstruction process. To address this challenge, we leverage 3D Gaussian Splatting (3DGS) for 3D reconstruction and explicitly integrate uncertainty-aware learning into the reconstruction process. By capturing the stochastic variation between two Gaussian models, we estimate an uncertainty map, which is subsequently used for uncertainty-aware regularization to rectify the impact of inconsistencies. Specifically, we optimize both Gaussian models simultaneously and compute the uncertainty map by evaluating the discrepancies between images rendered from identical viewpoints. Based on the uncertainty map, we apply adaptive pixel-wise loss weighting to regularize the models, reducing reconstruction intensity in high-uncertainty regions. This approach dynamically detects and mitigates conflicts in the multi-view labels, yielding smoother results and effectively reducing artifacts. Extensive experiments demonstrate that our method improves 3D generation quality by reducing inconsistencies and artifacts.
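For concreteness, the following is a minimal sketch of the uncertainty-weighted loss described above, assuming a PyTorch-style setup. It is not the paper's implementation: the L1 discrepancy measure, the exponential weighting function, and the `beta` parameter are illustrative assumptions.

```python
import torch

def uncertainty_weighted_loss(render_a, render_b, target, beta=1.0):
    """Pixel-wise reconstruction loss with adaptive uncertainty weighting.

    render_a, render_b: images (3, H, W) rendered by the two jointly
    optimized Gaussian models from the same viewpoint.
    target: the multi-view supervision image for that viewpoint.
    The discrepancy measure and weighting function below are assumptions,
    not the paper's exact definitions.
    """
    # Uncertainty map: per-pixel discrepancy between the two renders.
    uncertainty = (render_a - render_b).abs().mean(dim=0, keepdim=True)  # (1, H, W)

    # Down-weight pixels where the two models disagree (high uncertainty),
    # reducing reconstruction intensity in inconsistent regions such as
    # object boundaries. Detached so the weight itself carries no gradient.
    weight = torch.exp(-beta * uncertainty.detach())

    # Weighted L1 reconstruction loss applied to both models.
    loss_a = (weight * (render_a - target).abs()).mean()
    loss_b = (weight * (render_b - target).abs()).mean()
    return loss_a + loss_b
```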