Accompanying Data

This folder contains the following data:

waldo_and_wenda.csv – Waldo and Wenda benchmark for HHI understanding
imsitu-hhi.txt – IDs for imSitu-HHI subset of the imSitu dataset
phhi.csv – pHHI (pseudo-labels indicating HHI) for the Who's Waldo dataset

See the sections below for instructions on using these, as well as download instructions for synthetic caption data.

Waldo and Wenda

The file waldo_and_wenda.csv contains metadata and ground-truth annotations for the 1,000-item Waldo and Wenda HHI understanding benchmark. The source column indicates which dataset each image comes from:

ww – Who's Waldo (300 items)
cc – Conceptual Captions (400 items, from val set)
coco – Microsoft COCO (300 items, from val2014 set)

WW images can be obtained by requesting access to the WW dataset (see its homepage for details). CC and COCO images are available via the listed URLs (CC source). We do not reproduce image files here; see each dataset for its respective licensing details and see below for the licensing of our additions.

The caption column provides ground-truth captions from the source datasets. Note that WW captions have named person entities replaced with an underscore, and COCO samples use the first reference from the original dataset as the listed caption.
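As a minimal sketch of working with the source and caption columns, the following uses Python's standard csv module to read the file and tally items per source. It assumes the CSV has a header row with the column names used above; the helper names read_rows and count_by_source are illustrative and not part of this repository.

```python
import csv
from collections import Counter


def read_rows(lines):
    """Parse CSV lines (an open file or a list of strings) into dicts.

    Assumes the first line is a header row naming the columns.
    """
    return list(csv.DictReader(lines))


def count_by_source(rows):
    """Tally rows by the value of their 'source' column (ww, cc, or coco)."""
    return Counter(row["source"] for row in rows)


# Usage, assuming the file sits in the current directory:
# with open("waldo_and_wenda.csv", newline="", encoding="utf-8") as f:
#     rows = read_rows(f)
# print(count_by_source(rows))
```

If the file matches the description above, the tally should show 300 ww, 400 cc, and 300 coco items.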

The id column contains a unique identifier for each item. For those from WW and COCO, these are the original identifiers from those datasets. For items from CC, these are the first five digits of the MD5 hash of the corresponding image URL.
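The CC identifier scheme can be sketched as follows. This is an illustration under assumptions the text above does not confirm: that the URL is hashed as UTF-8 bytes, and that "digits" refers to the leading characters of the hexadecimal digest. The function name cc_item_id is hypothetical.

```python
import hashlib


def cc_item_id(image_url, length=5):
    """Return the first `length` hex characters of the URL's MD5 digest.

    Assumptions (not confirmed by the README): the URL is encoded as UTF-8
    before hashing, and the ID is taken from the hexadecimal digest.
    """
    digest = hashlib.md5(image_url.encode("utf-8")).hexdigest()
    return digest[:length]
```

If the IDs produced this way do not match the id column for known CC URLs, one of the assumptions above is likely wrong.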

imSitu-HHI

The file imsitu-hhi.txt lists the items from the imSitu dataset that comprise the imSitu-HHI subset as described in our paper.
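A minimal sketch for loading the ID list, assuming the file holds one identifier per line (the exact layout is not stated here, so check the file first). The helper name read_id_list is illustrative.

```python
def read_id_list(lines):
    """Collect non-empty, whitespace-stripped lines into a set of IDs.

    Assumes one identifier per line; blank lines are ignored.
    """
    return {line.strip() for line in lines if line.strip()}


# Usage, assuming the file sits in the current directory:
# with open("imsitu-hhi.txt", encoding="utf-8") as f:
#     hhi_ids = read_id_list(f)
# An imSitu item then belongs to the subset if its ID is in hhi_ids.
```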

pHHI

The file phhi.csv contains HHI pseudo-labels for relevant items in the Who's Waldo dataset. These have been preprocessed as described in our paper, in part to avoid overlap with the test items in Waldo and Wenda.

Alternatively, you may generate these yourself using the code in the pseudo-labeling subrepo.

For image data, please request access to Who's Waldo as described above.

Synthetic Caption Data

You may download the synthetic caption data synthetic_captions.csv.gz (used for training the summarization model) at this link.
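Once downloaded, the file can be read without decompressing it to disk first. The sketch below uses only the Python standard library and assumes the archive is a gzip-compressed CSV with a header row; the column names are not documented here, so inspect the first row to see them. The function name iter_gzipped_csv is illustrative.

```python
import csv
import gzip


def iter_gzipped_csv(path):
    """Yield rows (as dicts) from a gzip-compressed CSV with a header row."""
    # Text mode ("rt") lets the csv module read decoded lines directly.
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as f:
        yield from csv.DictReader(f)


# Usage, assuming the download was saved in the current directory:
# for row in iter_gzipped_csv("synthetic_captions.csv.gz"):
#     print(row)  # inspect the available columns
#     break
```

Iterating row by row keeps memory use low if the file is large.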

License

Data from the Who's Waldo, Conceptual Captions, Microsoft COCO, and imSitu datasets are licensed according to the licensing terms of each respective dataset. We license our data contributions (ground-truth and pseudo-label annotations) under the non-commercial CC BY-NC-SA 4.0 license.
