- I would like to present an image segmentation application on a drone image dataset here. The dataset contains 600 drone images acquired at an altitude of 5 to 30 meters above ground, with masks covering 23 classes. RGB masks are also provided.
- The task is to train an algorithm that accurately predicts the segmentation of each class. I would like to present two different approaches here:
- Part1: I trained Unet, which is probably the best-known architecture, from scratch. Have a look at the original paper here.
- Part2: I worked on three different pretrained architectures, Unet++, Linknet and DeepLabV3+, with and without freezing the models. Details and more models can be found in the Segmentation Models GitHub repository.
- Here are the scores for both parts:
- Part1: intersection-over-union-score: 0.21, pixel_accuracy: 0.71
- Part2: intersection-over-union-score: 0.47, pixel_accuracy: 0.84
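
For reference, the two reported metrics can be computed from integer label maps roughly as follows. This is a minimal NumPy sketch, not the code from the notebooks: the function names, the toy arrays, and the choice to average IoU only over classes that occur in the prediction or the target are all assumptions.

```python
import numpy as np


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class equals the ground truth."""
    pred, target = np.asarray(pred), np.asarray(target)
    return float((pred == target).mean())


def mean_iou(pred, target, num_classes=23):
    """Mean intersection-over-union over classes present in pred or target."""
    pred, target = np.asarray(pred), np.asarray(target)
    ious = []
    for c in range(num_classes):
        pred_c, target_c = pred == c, target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both maps: skip it (assumption)
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else float("nan")


# Tiny 2x3 label maps with three classes (hypothetical values).
target = np.array([[0, 0, 1], [1, 2, 2]])
pred = np.array([[0, 1, 1], [1, 2, 0]])

print(pixel_accuracy(pred, target))           # 4 of 6 pixels match
print(mean_iou(pred, target, num_classes=3))  # per-class IoUs: 1/3, 2/3, 1/2
```

Averaging per class rather than per pixel is what makes mean IoU much harsher than pixel accuracy when rare classes are predicted poorly, which is consistent with the gap between the two scores above.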
- Hyperparameter optimization is performed with the Optuna library. It allows us to quickly identify the hyperparameters that result in the best model. Besides searching the hyperparameter space for optimal values, Optuna also provides several early stopping strategies that prune unpromising trials early to save training time. In each notebook, I describe the training stages in detail.
- Intersection-over-union is used as the cost function during training to better handle low-frequency classes.
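
Hard IoU is not differentiable, so a common way to train on it is a "soft" Jaccard loss computed on class probabilities. The sketch below assumes PyTorch and is one possible formulation, not necessarily the one used in the notebooks; the function name `soft_iou_loss`, the smoothing constant, and the tensor sizes are assumptions.

```python
import torch
import torch.nn.functional as F


def soft_iou_loss(logits, target, eps=1e-6):
    """Return 1 - mean soft IoU over classes.

    logits: (N, C, H, W) raw scores; target: (N, H, W) int64 class ids.
    """
    num_classes = logits.shape[1]
    probs = logits.softmax(dim=1)
    # One-hot encode the target and move the class axis to dim 1.
    one_hot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).to(probs.dtype)

    dims = (0, 2, 3)  # sum over batch and spatial axes, keep the class axis
    intersection = (probs * one_hot).sum(dims)
    union = (probs + one_hot - probs * one_hot).sum(dims)
    iou = (intersection + eps) / (union + eps)
    # Every class contributes equally, regardless of its pixel count.
    return 1.0 - iou.mean()


# Random batch: 2 images, 23 classes, 8x8 pixels (hypothetical sizes).
logits = torch.randn(2, 23, 8, 8, requires_grad=True)
target = torch.randint(0, 23, (2, 8, 8))
loss = soft_iou_loss(logits, target)
loss.backward()
print(loss.item())
```

Because the per-class IoUs are averaged with equal weight, a rare class influences the loss as much as a dominant one, which is the motivation given above.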
- Data augmentation is performed via the Albumentations library. You may be used to torchvision, but Albumentations is faster and more versatile.
- In order to speed up training, the images were resized once before training rather than on the fly. I realized that this step is overlooked in many applications, yet resizing is a big bottleneck that significantly increases training time.
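
One way to do the resizing ahead of time is to write resized copies to disk once and train from those. The sketch below assumes Pillow; the helper name `resize_pair`, the output layout, and the target size are hypothetical. Masks are resized with nearest-neighbor interpolation so that class ids are never blended into invalid values.

```python
from pathlib import Path

from PIL import Image


def resize_pair(image_path, mask_path, out_dir, size=(768, 512)):
    """Resize one image/mask pair and save both. size is (width, height)."""
    out_dir = Path(out_dir)
    (out_dir / "images").mkdir(parents=True, exist_ok=True)
    (out_dir / "masks").mkdir(parents=True, exist_ok=True)

    # Smooth interpolation is fine for the photo.
    image = Image.open(image_path).convert("RGB").resize(size, Image.BILINEAR)
    # Nearest-neighbor keeps mask values restricted to existing class ids.
    mask = Image.open(mask_path).resize(size, Image.NEAREST)

    image_out = out_dir / "images" / Path(image_path).name
    mask_out = out_dir / "masks" / Path(mask_path).name
    image.save(image_out)
    mask.save(mask_out)
    return image_out, mask_out
```

Running this once over the dataset means the data loader only has to read small files during training instead of decoding and resizing full-resolution images every epoch.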
- I greatly benefited from the work of Aladdin Persson, Abhishek Thakur and Amirhossein Heydarian.
- Please let me know if you have any feedback!