This is an extension of Matterport's Mask R-CNN implementation for Python 3, Keras, and TensorFlow that adds support for RGB-D input data. The model generates bounding boxes and segmentation masks for each instance of an object in the RGB(-D) image. It is based on a Feature Pyramid Network (FPN) and a ResNet101 backbone.
In our experiments we showed that an additional depth input layer can improve the segmentation accuracy of Mask R-CNN by up to 31%.
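To illustrate what an additional depth input means in practice, the following is a minimal sketch of how an RGB image and a depth map could be stacked into a single 4-channel array before being fed to a network. It is not taken from this repository: the function name `stack_rgbd`, the `max_depth` parameter, and the assumption that depth is given in metres are all hypothetical, and the actual preprocessing used here may differ.

```python
import numpy as np


def stack_rgbd(rgb, depth, max_depth=10.0):
    """Combine an RGB image and a depth map into one 4-channel array.

    rgb:       uint8 array of shape (H, W, 3)
    depth:     float array of shape (H, W); metres are an assumed unit
    max_depth: hypothetical clipping range used to scale depth to [0, 255]
    """
    if rgb.shape[:2] != depth.shape:
        raise ValueError("RGB and depth must have the same height and width")

    # Clip and rescale depth so it shares the 0..255 range of the colour channels.
    depth_scaled = np.clip(depth, 0.0, max_depth) / max_depth * 255.0

    # Append depth as a fourth channel: (H, W, 3) + (H, W, 1) -> (H, W, 4).
    return np.concatenate(
        [rgb.astype(np.float32), depth_scaled[..., np.newaxis].astype(np.float32)],
        axis=-1,
    )


# Tiny synthetic example: a black 4x6 image with a constant depth of 5 m.
rgb = np.zeros((4, 6, 3), dtype=np.uint8)
depth = np.full((4, 6), 5.0)
rgbd = stack_rgbd(rgb, depth)
print(rgbd.shape)  # (4, 6, 4)
```

Scaling depth into the same numeric range as the colour channels is one common choice; other normalisations are equally plausible.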
Links: Presentation, Paper (unofficial CoRL 2018 submission)
Training and evaluation scripts for the 2D-3D-S, ADE20K, COCO, NYU Depth V2, SceneNet, and SceneNN datasets can be found under the instance_segmentation directory:
- dataset.py: Hyperparameters config, interface to datasets
- train.py: Training script
- eval.ipynb: IPython notebook for the evaluation of the dataset & training
For the full documentation and further details, please refer to Matterport's original implementation.