2403.16292.md
latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction

We present latentSplat, a method to predict semantic Gaussians in a 3D latent space that can be splatted and decoded by a lightweight generative 2D architecture. Existing methods for generalizable 3D reconstruction either do not enable fast inference of high-resolution novel views due to slow volume rendering, or are limited to interpolation between close input views, even in simpler settings with a single central object, where 360-degree generalization is possible. In this work, we combine a regression-based approach with a generative model, moving towards both of these capabilities within the same method, trained purely on readily available real video data. At the core of our method are variational 3D Gaussians, a representation that efficiently encodes varying uncertainty within a latent space consisting of 3D feature Gaussians. From these Gaussians, specific instances can be sampled and rendered via efficient Gaussian splatting and a fast generative decoder network. We show that latentSplat outperforms previous works in reconstruction quality and generalization, while being fast and scalable to high-resolution data.

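We propose latentSplat, a method that predicts semantic Gaussians in a 3D latent space; these Gaussians can be splatted and decoded by a lightweight generative 2D architecture. Existing methods for generalizable 3D reconstruction either cannot quickly infer high-resolution novel views because volume rendering is slow, or are restricted to interpolating between nearby input views, even in simpler settings with a single central object, where 360-degree generalization is possible. In this work, we combine a regression-based approach with a generative model, moving towards having both capabilities within a single method that is trained entirely on readily available real video data. The core of our method is the variational 3D Gaussian, a representation that efficiently encodes varying uncertainty in a latent space made up of 3D feature Gaussians. From these Gaussians, specific instances can be sampled and rendered through efficient Gaussian splatting and a fast generative decoder network. We show that latentSplat surpasses previous work in reconstruction quality and generalization, while being fast and scalable to high-resolution data.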
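
The sampling step described in the abstract, drawing a specific instance from Gaussians that carry uncertainty in feature space, can be illustrated with a minimal sketch. The abstract does not specify how the feature distributions are parameterized, so everything here is an assumption for illustration: the `VariationalGaussian` class, its field names, the diagonal mean and log-variance parameterization, and the `sample_features` helper are hypothetical and are not taken from the latentSplat code. The sketch uses the standard reparameterization `z = mu + sigma * eps`, and it omits splatting and decoding entirely.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class VariationalGaussian:
    """Hypothetical container: one 3D Gaussian with an uncertain latent feature."""
    position: Tuple[float, float, float]  # 3D center of the Gaussian
    feature_mean: List[float]             # per-channel mean of the latent feature
    feature_log_var: List[float]          # per-channel log-variance (assumed diagonal)


def sample_features(g: VariationalGaussian, rng: random.Random) -> List[float]:
    """Draw one feature vector via the reparameterization z = mu + sigma * eps."""
    sample = []
    for mu, log_var in zip(g.feature_mean, g.feature_log_var):
        sigma = math.exp(0.5 * log_var)   # standard deviation from log-variance
        eps = rng.gauss(0.0, 1.0)         # standard normal noise
        sample.append(mu + sigma * eps)
    return sample


if __name__ == "__main__":
    rng = random.Random(0)
    g = VariationalGaussian(
        position=(0.0, 0.0, 1.0),
        feature_mean=[0.2, -1.0, 0.5],
        # A low log-variance marks a confident channel, a high one an uncertain channel.
        feature_log_var=[-6.0, 0.0, -2.0],
    )
    print(sample_features(g, rng))
```

Under these assumptions, channels with low variance reproduce their mean almost exactly on every draw, while high-variance channels change from sample to sample. This is one simple way a representation could behave like a regression model where it is confident and like a generative model where it is not.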