Images of dense urban areas of Tokyo, Japan, taken from a high-altitude viewpoint at a steep (but not vertical) angle. The areas are listed below:
For every image, the heads of pedestrians (including bikers) are annotated. The annotations are stored as a list of (x, y) coordinates in a .mat file.
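As a minimal sketch, the head annotations could be read in Python roughly as follows. The variable name stored inside the .mat file is not given in this description, so the `key` default below is hypothetical, as are the helper names `as_points` and `load_head_points`. Inspect the file's keys before relying on it. The sketch also assumes the coordinates are stored as a plain N x 2 numeric array.

```python
def as_points(array_like):
    """Convert an N x 2 array-like into a list of (x, y) float tuples."""
    points = []
    for row in array_like:
        if len(row) != 2:
            raise ValueError(f"expected rows of length 2, got {len(row)}")
        x, y = row
        points.append((float(x), float(y)))
    return points


def load_head_points(path, key="annotation"):
    """Load head coordinates from a .mat file.

    `key` is a hypothetical variable name; the real name is not documented
    here. SciPy is imported lazily so `as_points` works without it.
    """
    from scipy.io import loadmat

    data = loadmat(path)
    if key not in data:
        # loadmat adds metadata entries such as "__header__"; hide them.
        available = [k for k in data if not k.startswith("__")]
        raise KeyError(f"{key!r} not found; available variables: {available}")
    return as_points(data[key])


# Usage with in-memory data: the crowd count is the number of annotated heads.
example = as_points([[10, 20], [30.5, 40]])
head_count = len(example)
```

If the lookup fails, the raised `KeyError` lists the variables the file actually contains, which is usually enough to find the right name.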
The complete training set and nearly half of the validation set include segmentation annotations for seven classes:
- Building - 0
- Structure - 1
- Plant - 2
- Sidewalk - 3
- Vehicle - 4
- Road - 5
- Background - 6
The annotation format is a single-channel .png file with seven possible pixel values. The pixel value for each class is the number listed next to it above.
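The class list above can be turned into a small lookup for inspecting a mask. The following is a sketch only: the class IDs come from the list above, while the function names `load_mask` and `class_histogram` are hypothetical, and the loader assumes the masks are 8-bit images that Pillow opens in mode "L" or "P".

```python
from collections import Counter

# Class IDs as listed in the dataset description.
CLASS_NAMES = {
    0: "Building",
    1: "Structure",
    2: "Plant",
    3: "Sidewalk",
    4: "Vehicle",
    5: "Road",
    6: "Background",
}


def load_mask(path):
    """Read a single-channel .png mask into rows of integer pixel values.

    Pillow is imported lazily so `class_histogram` works without it.
    """
    from PIL import Image

    with Image.open(path) as img:
        if img.mode not in ("L", "P"):
            raise ValueError(f"expected an 8-bit single-channel image, got {img.mode!r}")
        width, height = img.size
        data = img.tobytes()  # one byte per pixel in modes "L" and "P"
    return [list(data[r * width:(r + 1) * width]) for r in range(height)]


def class_histogram(rows):
    """Count pixels per class name; reject values outside the seven classes."""
    counts = Counter(value for row in rows for value in row)
    unknown = set(counts) - set(CLASS_NAMES)
    if unknown:
        raise ValueError(f"unexpected pixel values: {sorted(unknown)}")
    return {name: counts.get(cid, 0) for cid, name in CLASS_NAMES.items()}


# Usage with a tiny in-memory mask (2 rows x 3 columns).
hist = class_histogram([[0, 0, 5], [5, 6, 4]])
```

Rejecting unknown values early helps catch a mask that was accidentally saved as RGB or rescaled, since either would introduce values outside 0 to 6.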
If you find our dataset useful, please cite our paper:
Csönde, G.; Sekimoto, Y.; Kashiyama, T. Crowd Counting with Semantic Scene Segmentation in Helicopter Footage. Sensors 2020, 20, 4855.
Images in this dataset are available under the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0). The license and a link to the legal document can be found next to every image on the service, in the image information panel, which also shows the CC BY-SA 4.0 mark.