Linfei Li · Lin Zhang* · Zhong Wang · Ying Shen
The simplest way to install all dependencies is to use anaconda and pip, following the steps below:
conda create -n gs3lam python==3.10
conda activate gs3lam
conda install -c "nvidia/label/cuda-11.7.0" cuda-toolkit
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt
# install the Gaussian Semantic Rasterization submodule
pip install submodules/gaussian-semantic-rasterization
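After installation, it can help to confirm that PyTorch was built against CUDA 11.7 and can see a GPU. This quick check is our suggestion, not part of the official setup:

# optional sanity check: print the PyTorch version, its CUDA build, and GPU availability
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"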
DATAROOT is ./data by default. Please change the basedir path in the scene-specific config files if your datasets are stored somewhere else on your machine.
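Alternatively, instead of editing basedir in every config, one option (our suggestion, not required by the code) is to symlink an existing dataset directory into the default location:

# optional: point ./data at an existing dataset directory instead of editing each config
ln -s /path/to/your/datasets ./data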
The original Replica dataset does not contain semantic labels; we obtained semantic labels from vMAP. You can download our generated semantic Replica dataset from here, then place the data in the ./data/Replica folder.
Note: if you use the Replica dataset provided by vMAP directly, please modify the Replica dataloader and the png_depth_scale parameter in the config files.
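If you are unsure which files set this value, a quick search over the configs will locate it (an illustrative command, assuming the default repository layout):

# list every Replica config line that sets png_depth_scale
grep -rn "png_depth_scale" configs/Replica/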
The TUM-RGBD dataset does not have ground-truth semantic labels, so it is not one of our evaluation datasets. However, to evaluate the effectiveness of GS3LAM, we use pseudo-semantic labels generated by DEVA, which you can download from here. Unfortunately, existing semantic segmentation models struggle to maintain inter-frame semantic consistency on long sequences, so we only tested on the freiburg1_desk sequence.
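The underlying RGB-D sequence can be downloaded from the TUM benchmark site, for example as follows; the destination folder here is an assumption and should match the basedir in configs/Tum/tum_fr1.py:

# download and extract freiburg1_desk (destination folder is illustrative; match your config's basedir)
wget https://vision.in.tum.de/rgbd/dataset/freiburg1/rgbd_dataset_freiburg1_desk.tgz
mkdir -p ./data/TUM_RGBD
tar -xzf rgbd_dataset_freiburg1_desk.tgz -C ./data/TUM_RGBD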
Please follow the data downloading procedure on the ScanNet website, and extract color/depth frames from the .sens file using this code.
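For reference, ScanNet's SensReader script is typically invoked as below (a sketch based on the ScanNet repository; verify the exact flags against its README). Note that the label-filt frames come from the separate 2D label-filt archives on the download page, not from the .sens file:

# example SensReader invocation (flags as in the ScanNet repo; verify locally)
python reader.py --filename scene0059_00.sens --output_path scannet/scene0059_00/frames \
    --export_color_images --export_depth_images --export_poses --export_intrinsics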
Directory structure of ScanNet:
DATAROOT
└── scannet
    └── scene0000_00
        └── frames
            ├── color
            │   ├── 0.jpg
            │   ├── 1.jpg
            │   └── ...
            ├── depth
            │   ├── 0.png
            │   ├── 1.png
            │   └── ...
            ├── label-filt
            │   ├── 0.png
            │   ├── 1.png
            │   └── ...
            ├── intrinsic
            └── pose
                ├── 0.txt
                ├── 1.txt
                └── ...
We use the following sequences:
scene0000_00
scene0059_00
scene0106_00
scene0169_00
scene0181_00
scene0207_00
To run GS3LAM on the freiburg1_desk scene, run the following command:
python run.py configs/Tum/tum_fr1.py
To run GS3LAM on the office0 scene, run the following command:
python run.py configs/Replica/office0.py
To run GS3LAM on all Replica scenes, run the following command:
bash scripts/eval_full_replica.sh
To run GS3LAM on the scene0059_00 scene, run the following command:
python run.py configs/Scannet/scene0059_00.py
To run GS3LAM on all ScanNet scenes, run the following command:
bash scripts/eval_full_scannet.bash
- Define the SEED and SCENE_NUM environment variables; they should be consistent with the configuration file used for training:
# SEED is the random seed used during training; it should be consistent with the configuration.
export SEED=1
# SCENE_NUM is the index of the data sequence in the following list.
# Replica: ["room0", "room1", "room2","office0", "office1", "office2", "office3", "office4"]
# Scannet: ["scene0059_00", "scene0106_00", "scene0169_00", "scene0181_00", "scene0207_00", "scene0000_00"]
export SCENE_NUM=0
- Online reconstruction.
# optional mode: [color, depth, centers, sem, sem_color, sem_feature]
python visualizer/online_recon.py --mode color --logdir path/to/the/log
- Offline reconstruction.
# optional mode: [color, depth, centers, sem, sem_color, sem_feature]
python visualizer/offline_recon.py --mode sem_color --logdir path/to/the/log
- Export Mesh
# optional mode: [color, sem]
python visualizer/export_mesh.py --mode color --logdir path/to/the/log
To draw Fig. 2 in the paper, which illustrates the relationship between optimization iterations, rendering quality, and camera trajectories, run:
python visualizer/plot_opt_bias.py --logdir path/to/the/log
We thank the authors of the following repositories for their open-source code:
As our work heavily relies on SplaTAM, we kindly ask that you adhere to the guidelines set forth in SplaTAM's LICENSE.
If you find our paper and code useful for your research, please use the following BibTeX entry.
@inproceedings{li2024gs3lam,
  author    = {Li, Linfei and Zhang, Lin and Wang, Zhong and Shen, Ying},
  title     = {GS3LAM: Gaussian Semantic Splatting SLAM},
  year      = {2024},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  booktitle = {Proceedings of the 32nd ACM International Conference on Multimedia},
  pages     = {3019--3027},
  numpages  = {9},
  location  = {Melbourne VIC, Australia},
  series    = {MM '24}
}