Though we have provided the processed datasets in an easy-to-use format and recommend to use them directly, here we also roughly record the data downloading and clearning process to make it more transparent and reproducible. The Juptyer notebooks here provide some data visualization as well as the processing script for OmniACT. The script records the processing steps for other GUI grounding datasets.
Here are the steps to download the raw data for GUI grounding. The processing scripts are roughly recorded in scripts/process_grounding_data.py
.
mkdir raw_data
cd raw_data
mkdir mind2web
cd mind2web
wget "https://box.nju.edu.cn/f/33e203d170ab48b0b922/?dl=1" -O mind2web_images.zip
wget "https://box.nju.edu.cn/f/e30b861fa7604668821b/?dl=1" -O mind2web_annots.zip
unzip mind2web_images.zip -x "__MACOSX/*"
unzip mind2web_annots.zip -x "__MACOSX/*"
cd ..
mkdir screenspot
cd screenspot
wget "https://box.nju.edu.cn/d/5b8892c1901c4dbeb715/files/?p=%2Fscreenspot_desktop.json&dl=1" -O screenspot_desktop.json
wget "https://box.nju.edu.cn/d/5b8892c1901c4dbeb715/files/?p=%2Fscreenspot_imgs.tar&dl=1" -O screenspot_imgs.tar
wget "https://box.nju.edu.cn/d/5b8892c1901c4dbeb715/files/?p=%2Fscreenspot_mobile.json&dl=1" -O screenspot_mobile.json
wget "https://box.nju.edu.cn/d/5b8892c1901c4dbeb715/files/?p=%2Fscreenspot_web.json&dl=1" -O screenspot_web.json
tar --exclude="__MACOSX" -xvf screenspot_imgs.tar
cd ..
mkdir aitw
cd aitw
wget "https://box.nju.edu.cn/f/96ba5115bae24eaaa44e/?dl=1" -O aitw_images.zip
wget "https://box.nju.edu.cn/f/1245c74fc09b4565a235/?dl=1" -O aitw_annots.zip
unzip aitw_images.zip -x "__MACOSX/*"
unzip aitw_annots.zip -x "__MACOSX/*"
cd ..
Download motif_complete.tar.gz from here
Download the cleaned json files from here
Download from huggingface. Unzip the data.zip
. We only use the screenshots in data/data/{desktop|web}
folder and the annotations in data/metadata/{desktop|web}/boxes
folder.