Autonomous cars require robust perception systems. One way to achieve this is semantic segmentation of road elements with convolutional neural networks. This project uses the ERFNet architecture and the BDD100K dataset to produce real-time semantic segmentation for the following labels:
- Pedestrians
- Road Curb
- White lane
- Double yellow lane
- Yellow lane
- Small vehicles
- Medium vehicles
- Big vehicles
- Drivable lane
- Alternative lane
The lane and obstacle detection algorithms were implemented in Python 2.7. This implementation adds a weight matrix to balance the classes during training. Training can also be resumed from the last epoch that was trained.
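The text does not say how the class weights are computed. As a minimal sketch, the block below assumes the weighting scheme proposed in the ENet paper, `w = 1 / ln(c + p)`, where `p` is the fraction of training pixels belonging to a class and `c = 1.02`. The function name `enet_class_weights` and the pixel counts are hypothetical and only illustrate how rare classes end up with larger weights.

```python
import math


def enet_class_weights(pixel_counts, c=1.02):
    """Return one weight per class from per-class pixel counts.

    Assumption: ENet-style weighting, w = 1 / ln(c + p), where p is the
    share of all labelled pixels that belong to the class.
    """
    total = float(sum(pixel_counts))
    weights = []
    for count in pixel_counts:
        prob = count / total
        weights.append(1.0 / math.log(c + prob))
    return weights


# Hypothetical pixel counts for a frequent, a moderate, and a rare class.
example_counts = [900, 90, 10]
example_weights = enet_class_weights(example_counts)
```

With these counts the rare class receives the largest weight, which is the balancing effect described above. The resulting list would typically be passed to the loss function as per-class weights.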
When the training process ends, the following files are generated:
- automatedlog.txt: keeps the learning rate and precision values for each epoch
- best.txt: contains the number of the epoch with the best learning rate
- Results in each epoch: for each epoch, two text files are generated that indicate the learning rate for each class in training and validation
- Best model of the network: the model that produces the best results is saved together with its parameters
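The resume behaviour and the best-epoch bookkeeping described above could be organised roughly as follows. This is only a sketch under stated assumptions: the JSON state file, the function names, and the use of a single validation score per epoch to decide which epoch is "best" are all hypothetical and do not reflect the project's actual file formats. The list of scores stands in for real training and validation.

```python
import json
import os
import tempfile


def load_state(path):
    """Load the saved training state, or a fresh one if none exists."""
    if not os.path.exists(path):
        return {"epoch": 0, "best_epoch": 0, "best_score": float("-inf")}
    with open(path) as f:
        return json.load(f)


def save_state(path, state):
    """Persist the training state so a later run can resume from it."""
    with open(path, "w") as f:
        json.dump(state, f)


def train(path, scores_by_epoch, num_epochs):
    """Run epochs from the last saved epoch + 1 up to num_epochs."""
    state = load_state(path)
    for epoch in range(state["epoch"] + 1, num_epochs + 1):
        # Stand-in for one epoch of training followed by validation.
        score = scores_by_epoch[epoch - 1]
        if score > state["best_score"]:
            state["best_epoch"] = epoch
            state["best_score"] = score
        state["epoch"] = epoch
        save_state(path, state)
    return state


# Hypothetical validation scores for four epochs.
state_path = os.path.join(tempfile.mkdtemp(), "state.json")
scores = [0.40, 0.55, 0.52, 0.61]
first_run = train(state_path, scores, 2)    # stops after epoch 2
resumed_run = train(state_path, scores, 4)  # continues with epochs 3 and 4
```

Saving the state after every epoch means an interrupted run loses at most the epoch in progress.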
Here are some example images from the training set, after setting the required labels:
Finally, the best results were the following:
Class | IoU (%) |
---|---|
Pedestrians | 76.60 |
Road Curb | 44.31 |
White lane | 32.53 |
Double yellow lane | 63.38 |
Yellow lane | 60.61 |
Small vehicles | 37.70 |
Medium vehicles | 74.38 |
Big vehicles | 93.49 |
Drivable lane | 88.24 |
Alternative lane | 76.27 |
The IoU (intersection over union) value relates true positives, false positives, and false negatives, as described by the following equation:

$$\mathrm{IoU} = \frac{TP}{TP + FP + FN}$$
where:
- TP: True positives
- FP: False positives
- FN: False negatives
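As an illustration of the definition above, the following sketch computes the per-class IoU, TP / (TP + FP + FN), from two flat lists of class ids (predicted and ground-truth labels for the same pixels). The function name `iou_per_class` and the toy labels are assumptions made for this example; they are not taken from the project's code.

```python
def iou_per_class(pred, target, num_classes):
    """Return a list with the IoU of each class id in [0, num_classes)."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            tp[p] += 1   # pixel correctly assigned to class p
        else:
            fp[p] += 1   # pixel wrongly assigned to class p
            fn[t] += 1   # pixel of class t that was missed
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        # A class absent from both prediction and ground truth has no IoU.
        ious.append(float(tp[c]) / denom if denom else float("nan"))
    return ious


# Toy example with six pixels and three classes.
pred_labels = [0, 0, 1, 1, 2, 2]
true_labels = [0, 1, 1, 1, 2, 0]
example_ious = iou_per_class(pred_labels, true_labels, 3)
```

For this toy input, class 0 has TP = 1, FP = 1, FN = 1, giving an IoU of 1/3; class 1 has TP = 2, FP = 0, FN = 1, giving 2/3; class 2 has TP = 1, FP = 1, FN = 0, giving 1/2.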
Real-time testing was implemented with the ROS framework (Kinetic release) and rosbag files. The real-time semantic segmentation looks like the following: