This page walks through the steps required to run DeepLab on the ADE20K dataset on a local machine.
We have prepared the script (under the folder `datasets`) to download and convert the ADE20K semantic segmentation dataset to TFRecord.
```bash
# From the tensorflow/models/research/deeplab/datasets directory.
bash download_and_convert_ade20k.sh
```
The converted dataset will be saved at `./deeplab/datasets/ADE20K/tfrecord`.
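As a quick sanity check, the output folder should now contain sharded `.tfrecord` files for each split (the exact shard names depend on the output pattern used by `build_ade20k_data.py`):

```bash
# From tensorflow/models/research/
# List the converted shards produced by the script above.
ls ./deeplab/datasets/ADE20K/tfrecord
```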
```
+ datasets
  - build_data.py
  - build_ade20k_data.py
  - download_and_convert_ade20k.sh
  + ADE20K
    + tfrecord
    + exp
      + train_on_train_set
        + train
        + eval
        + vis
    + ADEChallengeData2016
      + annotations
        + training
        + validation
      + images
        + training
        + validation
```
where the folder `train_on_train_set` stores the train/eval/vis events and results (when training DeepLab on the ADE20K train set).
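If you would like to set up the experiment folders ahead of time, a one-liner like the following mirrors the layout above (paths are relative to `tensorflow/models/research/deeplab/datasets`; adjust them to your checkout):

```bash
# Create the recommended experiment directories for the train/eval/vis jobs.
mkdir -p ADE20K/exp/train_on_train_set/{train,eval,vis}
```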
A local training job using `xception_65` can be run with the following command:
```bash
# From tensorflow/models/research/
python deeplab/train.py \
    --logtostderr \
    --training_number_of_steps=50000 \
    --train_split="train" \
    --model_variant="xception_65" \
    --atrous_rates=6 \
    --atrous_rates=12 \
    --atrous_rates=18 \
    --output_stride=16 \
    --decoder_output_stride=4 \
    --train_crop_size=513 \
    --train_crop_size=513 \
    --train_batch_size=4 \
    --min_resize_value=350 \
    --max_resize_value=500 \
    --resize_factor=16 \
    --fine_tune_batch_norm=False \
    --dataset="ade20k" \
    --initialize_last_layer=False \
    --last_layers_contain_logits_only=True \
    --tf_initial_checkpoint=${PATH_TO_INITIAL_CHECKPOINT} \
    --train_logdir=${PATH_TO_TRAIN_DIR} \
    --dataset_dir=${PATH_TO_DATASET}
```
where `${PATH_TO_INITIAL_CHECKPOINT}` is the path to the initial checkpoint. For example, if you are using the deeplabv3_pascal_train_aug checkpoint, you will set this to `/path/to/deeplabv3_pascal_train_aug/model.ckpt`. `${PATH_TO_TRAIN_DIR}` is the directory to which training checkpoints and events will be written (it is recommended to set it to the `train_on_train_set/train` folder above), and `${PATH_TO_DATASET}` is the directory in which the ADE20K dataset resides (the `tfrecord` folder above).
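For example, with the directory layout from this page the placeholders could be filled in as follows (the checkpoint path is illustrative; substitute the actual location of your downloaded checkpoint):

```bash
# Illustrative values for the placeholders used by train.py above.
PATH_TO_INITIAL_CHECKPOINT="/path/to/deeplabv3_pascal_train_aug/model.ckpt"
PATH_TO_TRAIN_DIR="deeplab/datasets/ADE20K/exp/train_on_train_set/train"
PATH_TO_DATASET="deeplab/datasets/ADE20K/tfrecord"
```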
Note that for `train.py`:
- In order to fine-tune the BN layers, one needs to use a large batch size (> 12) and set `fine_tune_batch_norm=True`. Here we simply use a small batch size during training for the purpose of demonstration. If you have limited GPU memory at hand, please fine-tune from our provided checkpoints, whose batch norm parameters have been trained, and use a smaller learning rate with `fine_tune_batch_norm=False`.
- Users should fine-tune `min_resize_value` and `max_resize_value` to get better results. Note that `resize_factor` has to be equal to `output_stride`.
- Users should change `atrous_rates` from [6, 12, 18] to [12, 24, 36] if setting `output_stride=8` (see the sketch after this list).
- Users could skip the flag `decoder_output_stride` if they do not want to use the decoder structure.
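Putting the resize and atrous-rates notes together, a minimal sketch of the `output_stride=8` variant of the command above would scale the atrous rates and match `resize_factor` to the new output stride; all other flags carry over unchanged (this is a sketch for illustration, not a tuned configuration):

```bash
# From tensorflow/models/research/
# output_stride=8 variant: atrous_rates scaled to [12, 24, 36] and
# resize_factor matched to the output stride, per the notes above.
python deeplab/train.py \
    --logtostderr \
    --training_number_of_steps=50000 \
    --train_split="train" \
    --model_variant="xception_65" \
    --atrous_rates=12 \
    --atrous_rates=24 \
    --atrous_rates=36 \
    --output_stride=8 \
    --decoder_output_stride=4 \
    --train_crop_size=513 \
    --train_crop_size=513 \
    --train_batch_size=4 \
    --min_resize_value=350 \
    --max_resize_value=500 \
    --resize_factor=8 \
    --fine_tune_batch_norm=False \
    --dataset="ade20k" \
    --initialize_last_layer=False \
    --last_layers_contain_logits_only=True \
    --tf_initial_checkpoint=${PATH_TO_INITIAL_CHECKPOINT} \
    --train_logdir=${PATH_TO_TRAIN_DIR} \
    --dataset_dir=${PATH_TO_DATASET}
```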
Currently there are no fine-tuned checkpoints for the ADE20K dataset.
Progress for training and evaluation jobs can be inspected using TensorBoard. If using the recommended directory structure, TensorBoard can be run with the following command:
```bash
tensorboard --logdir=${PATH_TO_LOG_DIRECTORY}
```
where `${PATH_TO_LOG_DIRECTORY}` points to the directory that contains the train directory (e.g., the folder `train_on_train_set` in the above example). Please note it may take TensorBoard a couple of minutes to populate with data.
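With the recommended layout, that would look like the following (path relative to `tensorflow/models/research/`):

```bash
# Surfaces the train/eval/vis runs side by side in TensorBoard.
tensorboard --logdir=deeplab/datasets/ADE20K/exp/train_on_train_set
```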