We introduce pixelSplat, a feed-forward model that learns to reconstruct 3D radiance fields parameterized by 3D Gaussian primitives from pairs of images. Our model features real-time and memory-efficient rendering for scalable training as well as fast 3D reconstruction at inference time. To overcome local minima inherent to sparse and locally supported representations, we predict a dense probability distribution over 3D and sample Gaussian means from that probability distribution. We make this sampling operation differentiable via a reparameterization trick, allowing us to back-propagate gradients through the Gaussian splatting representation. We benchmark our method on wide-baseline novel view synthesis on the real-world RealEstate10k and ACID datasets, where we outperform state-of-the-art light field transformers and accelerate rendering by 2.5 orders of magnitude while reconstructing an interpretable and editable 3D radiance field.
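To make the sampling idea concrete, below is a minimal, illustrative sketch (not the paper's reference implementation) of a reparameterization-style gradient path for sampling Gaussian depths. It assumes a network that outputs, per ray, a categorical distribution over a hypothetical number of depth buckets `Z`; a depth is drawn per ray, and the Gaussian's opacity is tied to the sampled bucket's probability so gradients can reach the predicted distribution despite the non-differentiable sampling step. All names and parameters (`Z`, `near`, `far`, etc.) are assumptions for illustration.

```python
# Illustrative sketch only: differentiable depth sampling via a
# probability-tied opacity. Hypothetical shapes and parameter names.
import torch

def sample_gaussian_depths(logits: torch.Tensor, near: float, far: float):
    """logits: (num_rays, Z) unnormalized scores over Z depth buckets."""
    probs = torch.softmax(logits, dim=-1)                 # (num_rays, Z)
    idx = torch.multinomial(probs, num_samples=1)         # sampled bucket, no gradient
    # Bucket centers spaced between the near and far planes.
    Z = logits.shape[-1]
    centers = near + (far - near) * (torch.arange(Z, device=logits.device) + 0.5) / Z
    depth = centers[idx.squeeze(-1)]                      # (num_rays,)
    # Tie opacity to the probability of the chosen bucket: gradients from the
    # rendered image flow into `probs`, and hence into the depth distribution.
    opacity = probs.gather(-1, idx).squeeze(-1)           # (num_rays,)
    return depth, opacity

# Usage: Gaussian means are placed at `depth` along each camera ray, while
# `opacity` carries the gradient signal back to the predicted distribution.
logits = torch.randn(4, 64, requires_grad=True)
depth, opacity = sample_gaussian_depths(logits, near=0.5, far=100.0)
opacity.sum().backward()   # gradients reach `logits` through the opacities
```

The design choice illustrated here is that the sampled depth itself need not be differentiable; routing gradients through a probability-dependent quantity (here, opacity) is one way to train a discrete sampling step end to end.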