Skip to content

Latest commit

 

History

History
5 lines (3 loc) · 1.34 KB

2410.14462.md

File metadata and controls

5 lines (3 loc) · 1.34 KB

LUDVIG: Learning-free Uplifting of 2D Visual features to Gaussian Splatting scenes

We address the task of uplifting visual features or semantic masks from 2D vision models to 3D scenes represented by Gaussian Splatting. Whereas common approaches rely on iterative optimization-based procedures, we show that a simple yet effective aggregation technique yields excellent results. Applied to semantic masks from Segment Anything (SAM), our uplifting approach leads to segmentation quality comparable to the state of the art. We then extend this method to generic DINOv2 features, integrating 3D scene geometry through graph diffusion, and achieve competitive segmentation results despite DINOv2 not being trained on millions of annotated masks like SAM.

我们研究了将2D视觉模型的视觉特征或语义掩码提升到由高斯散射表示的3D场景中的任务。与常见的基于迭代优化的方法不同,我们展示了一种简单但有效的聚合技术能够产生出色的结果。应用于来自Segment Anything(SAM)的语义掩码时,我们的提升方法在分割质量上可与当前最先进的方法媲美。随后,我们将该方法扩展到通用的DINOv2特征,通过图扩散集成3D场景几何信息,尽管DINOv2没有像SAM那样在数百万标注掩码上进行训练,但仍然取得了有竞争力的分割结果。