type | title
---|---
article | SUN-Spot Dataset
We introduce SUN-Spot, a new dataset for localizing objects using spatial referring expressions (REs). SUN-Spot is the only RE dataset that uses RGB-D images, and it contains a higher average number of spatial prepositions and more cluttered scenes than previous RE datasets. Using a simple baseline, we show that including a depth channel in RE models can improve performance on both generation and comprehension.
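To illustrate what "including a depth channel" means mechanically, here is a minimal PyTorch sketch. It is not the paper's baseline model; the encoder, layer sizes, and image resolution are placeholder choices. The only point it demonstrates is widening a model's input from 3 RGB channels to 4 RGB-D channels.

```python
import torch
import torch.nn as nn

# Illustrative only: a tiny 4-channel (RGB + depth) convolutional encoder.
# The architecture is a placeholder, not the paper's baseline.
encoder = nn.Sequential(
    nn.Conv2d(4, 32, kernel_size=3, padding=1),  # 4 input channels instead of 3
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

rgb = torch.rand(1, 3, 224, 224)       # RGB image
depth = torch.rand(1, 1, 224, 224)     # aligned depth map from the RGB-D sensor
rgbd = torch.cat([rgb, depth], dim=1)  # stack into a 4-channel RGB-D tensor
features = encoder(rgbd)               # (1, 32) feature vector
```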
C. Mauceri, M. Palmer, and C. Heckman, “SUN-Spot: An RGB-D Dataset With Spatial Referring Expressions,” in International Conference on Computer Vision Workshop on Closing the Loop Between Vision and Language, 2019.
{% raw %}
@inproceedings{Mauceri2019,
  author    = {Mauceri, Cecilia and Palmer, Martha and Heckman, Christoffer},
  booktitle = {International Conference on Computer Vision Workshop on Closing the Loop Between Vision and Language},
  title     = {{SUN-Spot: An RGB-D Dataset With Spatial Referring Expressions}},
  year      = {2019}
}
{% endraw %}
- SUNRGBD Images
- SUN-Spot Annotations (see the loading sketch after this list)
  - `refs(boulder).p` - The referring expressions
  - `instances.json` - The SUNRGBD annotations in COCO format
  - `vocab.txt` - The vocabulary
- Code Repository
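As a rough guide to working with the annotation files, here is a minimal loading sketch. It assumes only what the list above states: `refs(boulder).p` is a Python pickle and `instances.json` follows the standard COCO layout (`images`, `annotations`, `categories` keys). The record fields inside the pickle and the one-token-per-line vocabulary format are assumptions.

```python
import json
import pickle

# Paths assume the annotation files sit in the working directory.
with open("refs(boulder).p", "rb") as f:
    refs = pickle.load(f)  # referring-expression records (internal structure assumed)

with open("instances.json", "r") as f:
    instances = json.load(f)  # COCO-format dict

# Standard COCO keys: "images", "annotations", "categories".
print(f"{len(refs)} referring expressions")
print(f"{len(instances['images'])} images, "
      f"{len(instances['annotations'])} object annotations")

with open("vocab.txt", "r") as f:
    vocab = [line.strip() for line in f]  # one token per line (assumed)
print(f"{len(vocab)} vocabulary entries")
```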
To provide a taste of the images and annotations in SUN-Spot, here are 10 randomly selected objects from the dataset with all of their referring expressions.
{::options parse_block_html="true" /}
{% for object in site.data.sunspot_visualization %}