Platform | Build Status |
---|---|
Ubuntu 20.04.3 |
Georgia Tech Structure-from-Motion (GTSfM) is an end-to-end SfM pipeline based on GTSAM. GTSfM was designed from the ground up to natively support parallel computation using Dask.
For more details, please refer to our arXiv preprint.
The majority of our code is governed by an MIT license and is suitable for commercial use. However, certain implementations featured in our repo (e.g., SuperPoint, SuperGlue) are governed by a non-commercial license and may not be used commercially.
GTSfM requires no compilation, as Python wheels are provided for GTSAM. This repository includes external repositories as Git submodules, so don't forget to pull the submodules with:
git submodule update --init --recursive
or clone with:
git clone --recursive https://github.com/borglab/gtsfm.git
To run GTSfM, first, we need to create a conda environment with the required dependencies.
On Linux, with CUDA support, run:
conda env create -f environment_linux.yml
conda activate gtsfm-v1 # you may need "source activate gtsfm-v1" depending upon your bash and conda set-up
On macOS, there is no CUDA support, so run:
conda env create -f environment_mac.yml
conda activate gtsfm-v1
Now, install `gtsfm` as a module:
pip install -e .
Make sure that the following command runs without errors, and you are good to go:
python -c "import gtsfm; import gtsam; print('hello world')"
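The same check can be scripted, for example as part of a setup step. This is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical and not part of GTSfM.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of `names` that cannot be found on the import path."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["gtsfm", "gtsam"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("hello world")
```

Unlike the one-line import, this reports which of the two packages is absent without importing them, so it also works when an import would be slow.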
For a quick hands-on example, check out this Colab notebook
Before running reconstruction, if you intend to use modules with pre-trained weights (e.g., SuperPoint, SuperGlue, or PatchmatchNet), first download the model weights by running:
./download_model_weights.sh
To process a dataset containing only an image directory and EXIF metadata, ensure your dataset follows this structure:
└── {DATASET_NAME}
    ├── images
        ├── image1.jpg
        ├── image2.jpg
        ├── image3.jpg
Then, run the following command:
python gtsfm/runner/run_scene_optimizer_olssonloader.py --config_name {CONFIG_NAME} --dataset_root {DATASET_ROOT} --num_workers {NUM_WORKERS}
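Before launching a run, it can help to confirm that the dataset follows the layout described above. The sketch below is an illustration only: the function name `list_dataset_images` is hypothetical, and it assumes JPEG files, as in the example tree.

```python
from pathlib import Path

# Assumption: JPEG images only, matching the example layout; extend as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def list_dataset_images(dataset_root):
    """Return sorted image paths under {dataset_root}/images, or raise if the layout is wrong."""
    images_dir = Path(dataset_root) / "images"
    if not images_dir.is_dir():
        raise FileNotFoundError(f"Expected an 'images' directory at {images_dir}")
    # Compare suffixes case-insensitively so that .JPG files are accepted too.
    images = sorted(
        p for p in images_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not images:
        raise ValueError(f"No JPEG images found in {images_dir}")
    return images
```

Calling `list_dataset_images("{DATASET_ROOT}")` with your own path either returns the image list or fails early with a message naming the missing piece.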
To explore all available options and configurations, run:
python gtsfm/runner/run_scene_optimizer_olssonloader.py -h
For example, if you want to use the Deep Front-End (recommended) on the "door" dataset, run:
python gtsfm/runner/run_scene_optimizer_olssonloader.py --dataset_root tests/data/set1_lund_door --config_name deep_front_end.yaml --num_workers 1
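When running several scenes or configurations, the command above can be assembled in a small script. This sketch only uses the three flags shown in this README; the helper `build_runner_command` is a hypothetical name, and the launch line is left commented out because it must be run from the repository root with the `gtsfm-v1` environment active.

```python
import subprocess
import sys

# Runner script path as given in this README, relative to the repository root.
RUNNER = "gtsfm/runner/run_scene_optimizer_olssonloader.py"


def build_runner_command(dataset_root, config_name, num_workers=1):
    """Assemble the argument list for the Olsson-loader runner."""
    return [
        sys.executable,
        RUNNER,
        "--dataset_root", str(dataset_root),
        "--config_name", config_name,
        "--num_workers", str(num_workers),
    ]


if __name__ == "__main__":
    cmd = build_runner_command("tests/data/set1_lund_door", "deep_front_end.yaml")
    print(" ".join(cmd))
    # Uncomment to actually launch the reconstruction:
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a dataset path contains spaces.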
You can monitor the distributed computation using the Dask dashboard.
Note: The dashboard will only display activity while tasks are actively running.
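If the dashboard page does not load, a quick check is whether anything is listening on the dashboard port at all. The sketch below assumes Dask's default dashboard port, 8787; a particular GTSfM run may be configured differently, and Dask falls back to another port when 8787 is taken. The function name is hypothetical.

```python
import socket

# Assumption: 8787 is Dask's default dashboard port; your run may use another one.
DEFAULT_DASHBOARD_PORT = 8787


def dashboard_is_listening(host="localhost", port=DEFAULT_DASHBOARD_PORT, timeout=1.0):
    """Return True if something accepts TCP connections at host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "reachable" if dashboard_is_listening() else "not reachable"
    print(f"Dashboard port {DEFAULT_DASHBOARD_PORT} is {state}")
```

A successful connection only shows that the port is open, not that tasks are running, which matches the note above about the dashboard being idle between runs.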
Currently, we require EXIF data embedded into your images. Alternatively, you can provide:
- Ground truth intrinsics in the expected format for an Olsson dataset
- COLMAP-exported text data
To compare GTSfM output with COLMAP, use the following command:
python gtsfm/runner/run_scene_optimizer_colmaploader.py --config_name {CONFIG_NAME} --images_dir {IMAGES_DIR} --colmap_files_dirpath {COLMAP_FILES_DIRPATH} --num_workers {NUM_WORKERS} --max_frame_lookahead {MAX_FRAME_LOOKAHEAD}
where `COLMAP_FILES_DIRPATH` is the directory containing `.txt` files such as `cameras.txt`, `images.txt`, etc.
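To inspect the intrinsics in a COLMAP export before passing it to the loader, `cameras.txt` can be read with a few lines of Python. This is a minimal sketch of the COLMAP text layout (one camera per line: ID, model, width, height, then model parameters), not the parser GTSfM uses; the sample values are made up for illustration.

```python
from typing import List, NamedTuple


class Camera(NamedTuple):
    camera_id: int
    model: str
    width: int
    height: int
    params: List[float]


def parse_cameras_txt(text):
    """Parse the contents of a COLMAP cameras.txt file into Camera records."""
    cameras = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        fields = line.split()
        cameras.append(
            Camera(
                camera_id=int(fields[0]),
                model=fields[1],
                width=int(fields[2]),
                height=int(fields[3]),
                params=[float(v) for v in fields[4:]],
            )
        )
    return cameras


# Made-up sample content, for illustration only.
SAMPLE_CAMERAS = """# Camera list with one line of data per camera:
#   CAMERA_ID, MODEL, WIDTH, HEIGHT, PARAMS[]
1 SIMPLE_RADIAL 1936 1296 2400.0 968.0 648.0 0.01
"""

if __name__ == "__main__":
    for cam in parse_cameras_txt(SAMPLE_CAMERAS):
        print(cam.camera_id, cam.model, cam.width, cam.height, cam.params)
```

The number and meaning of the trailing parameters depend on the camera model, so the sketch keeps them as an uninterpreted list.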
To visualize the reconstructed scene using Open3D, run:
python gtsfm/visualization/view_scene.py
For users who work with the same dataset repeatedly, GTSfM allows caching front-end results for faster inference.
Refer to the detailed guide:
📄 GTSFM Front-End Cacher README
For users who want to run GTSfM on a cluster of multiple machines, follow the setup instructions here:
📄 CLUSTER.md
- The output will be saved in `--output_root`, which defaults to the `results` folder in the repo root.
- Poses and 3D tracks are stored in COLMAP format inside the `ba_output` subdirectory of `--output_root`.
- You can visualize these using the COLMAP GUI.
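The estimated poses can also be read programmatically. The sketch below assumes the `ba_output` directory contains a COLMAP text-format `images.txt` (an assumption; check the files actually written by your run). In that format each image takes two lines: a header with the quaternion, translation, camera ID, and file name, followed by a line of 2D points. The sample content is made up.

```python
def parse_image_poses(text):
    """Map image name -> (qw, qx, qy, qz, tx, ty, tz) from COLMAP images.txt content."""
    # Drop comment lines but keep blank ones: an image without 2D points
    # still has an (empty) second line, and the pairing relies on it.
    lines = [ln for ln in text.splitlines() if not ln.startswith("#")]
    poses = {}
    for header in lines[0::2]:
        fields = header.split()
        if len(fields) < 10:
            continue  # not a valid image header line
        name = " ".join(fields[9:])
        poses[name] = tuple(float(v) for v in fields[1:8])
    return poses


# Made-up sample content, for illustration only.
SAMPLE_IMAGES = """# Image list with two lines of data per image:
#   IMAGE_ID, QW, QX, QY, QZ, TX, TY, TZ, CAMERA_ID, NAME
#   POINTS2D[] as (X, Y, POINT3D_ID)
1 1.0 0.0 0.0 0.0 0.0 0.0 0.0 1 image1.jpg
10.5 20.5 3 30.0 40.0 -1
2 0.7071 0.0 0.7071 0.0 1.0 0.0 2.0 1 image2.jpg

"""

if __name__ == "__main__":
    for name, pose in parse_image_poses(SAMPLE_IMAGES).items():
        print(name, pose)
```

In COLMAP's convention these values describe the world-to-camera transform, so the camera center in world coordinates has to be computed from the rotation and translation rather than read off directly.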
We provide a preprocessing script to convert the camera poses estimated by GTSfM to nerfstudio format:
python scripts/prepare_nerfstudio.py --results_path {RESULTS_DIR} --images_dir {IMAGES_DIR}
The results are stored in the `nerfstudio_input` subdirectory inside `{RESULTS_DIR}`, which can be used directly with nerfstudio if installed:
ns-train nerfacto --data {RESULTS_DIR}/nerfstudio_input
GTSfM is designed in an extremely modular way. Each module can be swapped out with a new one, as long as it implements the API of the module's abstract base class. The code is organized as follows:
- `gtsfm`: source code, organized as:
  - `averaging`
  - `bundle`: bundle adjustment implementations
  - `common`: basic classes used throughout GTSfM, such as `Keypoints`, `Image`, `SfmTrack2d`, etc.
  - `data_association`: 3D point triangulation (DLT) with or without RANSAC, from 2D point tracks
  - `densify`
  - `frontend`: SfM front-end code, including:
    - `detector`: keypoint detector implementations (DoG, etc.)
    - `descriptor`: feature descriptor implementations (SIFT, SuperPoint, etc.)
    - `matcher`: descriptor matching implementations (SuperGlue, etc.)
    - `verifier`: 2D-correspondence verifier implementations (Degensac, OA-Net, etc.)
    - `cacher`: cache implementations for different stages of the front-end
  - `loader`: image data loaders
  - `utils`: utility functions such as serialization routines and pose comparisons
- `tests`: unit tests on every function and module
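The swap-by-interface idea can be illustrated with a toy example. Everything below is hypothetical: `DetectorBase`, both detectors, and `run_front_end` are invented for this sketch and are not GTSfM's actual classes or signatures. It only shows the general pattern of coding a pipeline stage against an abstract base class.

```python
import abc


class DetectorBase(abc.ABC):
    """Hypothetical interface, not GTSfM's actual base class."""

    @abc.abstractmethod
    def detect(self, image):
        """Return a list of (x, y) keypoints for a 2D list of pixel intensities."""


class BrightestPixelDetector(DetectorBase):
    """Toy detector: returns the location of the single brightest pixel."""

    def detect(self, image):
        value, x, y = max(
            (v, x, y) for y, row in enumerate(image) for x, v in enumerate(row)
        )
        return [(x, y)]


class ImageCornersDetector(DetectorBase):
    """Toy detector: returns the four corners of the image rectangle."""

    def detect(self, image):
        height, width = len(image), len(image[0])
        return [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]


def run_front_end(detector: DetectorBase, image):
    """The pipeline depends only on the interface, so detectors are interchangeable."""
    return detector.detect(image)


if __name__ == "__main__":
    image = [[0, 1, 2], [9, 3, 4]]
    for detector in (BrightestPixelDetector(), ImageCornersDetector()):
        print(type(detector).__name__, run_front_end(detector, image))
```

Because `detect` is abstract, instantiating `DetectorBase` directly, or a subclass that forgets to implement it, raises a `TypeError` at construction time instead of failing later inside the pipeline.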
Contributions are always welcome! Please be aware of our contribution guidelines for this project.
If you use GTSfM, please cite our paper:
@misc{baid2023distributed,
title={Distributed Global Structure-from-Motion with a Deep Front-End},
author={Ayush Baid and John Lambert and Travis Driver and Akshay Krishnan and Hayk Stepanyan and Frank Dellaert},
year={2023},
eprint={2311.18801},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Citing the open-source Python implementation:
@misc{GTSFM,
author = {Ayush Baid and Travis Driver and Fan Jiang and Akshay Krishnan and John Lambert
and Ren Liu and Aditya Singh and Neha Upadhyay and Aishwarya Venkataramanan
and Sushmita Warrier and Jon Womack and Jing Wu and Xiaolong Wu and Frank Dellaert},
title = { {GTSFM}: Georgia Tech Structure from Motion},
howpublished={\url{https://github.com/borglab/gtsfm}},
year = {2021}
}
Note: authors are listed in alphabetical order (by last name).