Accompanying Data

This folder contains the following data:

  • waldo_and_wenda.csv – Waldo and Wenda benchmark for HHI understanding
  • imsitu-hhi.txt – IDs for imSitu-HHI subset of the imSitu dataset
  • phhi.csv – pHHI (pseudo-labels indicating HHI) for the Who's Waldo dataset

See the sections below for instructions on using these, as well as download instructions for synthetic caption data.

Waldo and Wenda

The file waldo_and_wenda.csv contains metadata and ground-truth annotations for the 1,000-item Waldo and Wenda HHI understanding benchmark. The source column indicates which dataset each image comes from: Who's Waldo (WW), Conceptual Captions (CC), or Microsoft COCO (COCO).

WW images can be obtained by requesting access to the WW dataset (see its homepage for details). CC and COCO images are available via the listed URLs (CC source). We do not reproduce image files here; see each dataset for its respective licensing details and see below for the licensing of our additions.
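As a quick sanity check on the benchmark file, the items per source can be tallied with the standard library. This is a minimal sketch, not part of the released code: it assumes the CSV has a header row with a column literally named source and is UTF-8 encoded, and the helper name count_by_source is hypothetical.

```python
import csv
from collections import Counter
from typing import TextIO


def count_by_source(handle: TextIO) -> Counter:
    """Count rows per value of the `source` column.

    Assumption: the first row is a header that includes a `source` column.
    """
    reader = csv.DictReader(handle)
    return Counter(row["source"] for row in reader)


# Example usage (assumes the file is in the current directory):
#
#     with open("waldo_and_wenda.csv", newline="", encoding="utf-8") as f:
#         for source, n in count_by_source(f).most_common():
#             print(source, n)
```

The function takes an open text handle rather than a path, so it can also be run on an in-memory sample.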

The caption column provides ground-truth captions from the source datasets. Note that WW captions have named person entities replaced with an underscore, and COCO samples use the first reference from the original dataset as the listed caption.

The id column contains a unique identifier for each item. For those from WW and COCO, these are the original identifiers from those datasets. For items from CC, these are the first five digits of the MD5 hash of the corresponding image URL.
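The CC identifier scheme described above could be reproduced roughly as follows. This sketch assumes that the URL is hashed as a UTF-8 string exactly as it is listed in the CSV, and that "digits" refers to the leading characters of the hexadecimal digest; neither detail is confirmed here, and the function name cc_item_id is hypothetical.

```python
import hashlib


def cc_item_id(image_url: str) -> str:
    """Return the first five hex characters of the MD5 hash of a URL.

    Assumptions: the URL is encoded as UTF-8 with no normalization, and the
    identifier is a prefix of the hexadecimal digest.
    """
    digest = hashlib.md5(image_url.encode("utf-8")).hexdigest()
    return digest[:5]
```

If the computed value does not match the id listed for a CC item, the likely cause is a difference in how the URL was encoded or normalized before hashing.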

imSitu-HHI

The file imsitu-hhi.txt lists the items from the imSitu dataset that comprise the imSitu-HHI subset as described in our paper.
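The ID list can be loaded into a set for filtering imSitu items. This is a sketch under the assumption that the file holds one ID per line; the exact ID format is not specified here, and read_id_list is a hypothetical helper.

```python
from typing import Iterable, Set


def read_id_list(lines: Iterable[str]) -> Set[str]:
    """Collect non-empty, whitespace-stripped lines into a set of IDs.

    Assumption: each line of the input holds exactly one ID.
    """
    return {line.strip() for line in lines if line.strip()}


# Example usage (assumes the file is in the current directory):
#
#     with open("imsitu-hhi.txt", encoding="utf-8") as f:
#         hhi_ids = read_id_list(f)
```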

pHHI

The file phhi.csv contains HHI pseudo-labels for relevant items in the Who's Waldo dataset. These have been preprocessed as described in our paper, including steps to avoid overlap with the test items in Waldo and Wenda.

Alternatively, you may generate these yourself using the code in the pseudo-labeling subrepo.

For image data, please request access to Who's Waldo as described above.

Synthetic Caption Data

You may download the synthetic caption data synthetic_captions.csv.gz (used for training the summarization model) at this link.
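Once downloaded, the file can be read without decompressing it to disk first. The sketch below makes no assumptions about column names and simply yields each row as a list of strings; it does assume UTF-8 encoding, and iter_gzipped_csv is a hypothetical helper.

```python
import csv
import gzip
from typing import Iterator, List


def iter_gzipped_csv(path: str) -> Iterator[List[str]]:
    """Yield rows of a gzip-compressed CSV file, header row included.

    Assumption: the file is UTF-8 encoded.
    """
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        yield from csv.reader(handle)


# Example usage (assumes the download is in the current directory):
#
#     rows = iter_gzipped_csv("synthetic_captions.csv.gz")
#     header = next(rows)
#     print(header)
```

Reading row by row keeps memory use low, which helps if the file is large.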

License

Data from the Who's Waldo, Conceptual Captions, Microsoft COCO, and imSitu datasets are licensed according to the licensing terms of each respective dataset. We license our data contributions (ground-truth and pseudo-label annotations) under the non-commercial CC BY-NC-SA 4.0 license.