VED stage
For this stage, we used Detectron, Facebook AI Research's software system that implements state-of-the-art object detection algorithms. It is written in Python and powered by the Caffe2 deep learning framework. We used the Faster R-CNN with FPN model available at https://github.com/facebookresearch/Detectron to detect the different elements in the plot images.
Follow these instructions to train the Faster R-CNN with FPN model on the PlotQA dataset:
- Install Caffe2 and Detectron. You might have to use this script to install Caffe2 on an AWS GPU instance.
- Download the PlotQA directory, which contains the images and the COCO-style annotations, from here and extract it into the directory ~/Detectron/detectron/datasets/data/.
- Replace the ~/Detectron/detectron/datasets/dataset_catalog.py file with the file which can be downloaded from here (see the sketch at the end of this stage for what the added entries look like).
- Download the e2e_faster_rcnn_R-50-FPN_1x.yaml config file from here. This file has the hyperparameter settings that we used for training.
- Run the following command to start training followed by testing:
python tools/train_net.py --cfg [PATH_TO_THE_CONFIG]/e2e_faster_rcnn_R-50-FPN_1x.yaml OUTPUT_DIR [PATH_TO_OUTPUT_DIR]
The model predictions will be saved in the file [OUTPUT_DIR]/test/coco_val/generalized_rcnn/bbox_coco_val_results.json.
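If you want to inspect the raw predictions before converting them in the next stage, the results file follows the standard COCO detection-results layout: a JSON list of records with image_id, category_id, bbox in [x, y, width, height] form, and score. A minimal sketch, assuming that layout and a hypothetical output path:

import json

# Hypothetical path; substitute your own OUTPUT_DIR.
results_path = "output/test/coco_val/generalized_rcnn/bbox_coco_val_results.json"

with open(results_path) as f:
    detections = json.load(f)  # list of COCO-style detection records

print("Total detections:", len(detections))
for det in detections[:5]:
    x, y, w, h = det["bbox"]           # COCO boxes are [x, y, width, height]
    print(det["image_id"], det["category_id"], round(det["score"], 3),
          (x, y, x + w, y + h))        # converted to (xmin, ymin, xmax, ymax)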
You can download the saved weights from here.
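For reference, the dataset_catalog.py replacement mentioned above simply registers the PlotQA images and COCO-style annotation files with Detectron. A hypothetical entry, assuming the standard Detectron catalog layout and illustrative dataset names and paths (the downloadable replacement file already contains the correct entries):

# Inside ~/Detectron/detectron/datasets/dataset_catalog.py (hypothetical entry)
DATASETS = {
    # ... existing entries ...
    'coco_val': {  # dataset name assumed from the test/coco_val output path
        IM_DIR: _DATA_DIR + '/PlotQA/val/png',                    # plot images (assumed layout)
        ANN_FN: _DATA_DIR + '/PlotQA/val/annotations_coco.json',  # COCO-style annotations (assumed layout)
    },
}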
Requirements
Before moving to the next stage, you might need to install some packages such as click, tqdm, pyocr, and matplotlib. You can refer to the .sh files in this directory for the exact commands that we used to install a particular package.
OCR and SIE stage
- Download the Python script generate_detections_for_fasterrcnn.py from here and run the following command to convert the model predictions obtained in the VED stage into the format required by the successive stages.
python2 generate_detections_for_fasterrcnn.py [OUTPUT_DIR]/test/coco_val/generalized_rcnn bbox_coco_val_results.json detections
This will store the modified model predictions at [OUTPUT_DIR]/test/coco_val/generalized_rcnn/detections. The detections are of the format CLASS_LABEL CLASS_CONFIDENCE XMIN YMIN XMAX YMAX, where (XMIN, YMIN) and (XMAX, YMAX) refer to the top-left and bottom-right coordinates of the predicted bounding box, respectively.
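As a quick sanity check, each detection file can be parsed line by line into structured records. A minimal sketch, assuming whitespace-separated fields in the order shown above and a hypothetical file name:

# Hypothetical detections file; substitute a file produced in the previous step.
detections_file = "output/test/coco_val/generalized_rcnn/detections/example.txt"

boxes = []
with open(detections_file) as f:
    for line in f:
        fields = line.split()
        label = fields[0]                                # CLASS_LABEL
        confidence = float(fields[1])                    # CLASS_CONFIDENCE
        xmin, ymin, xmax, ymax = map(float, fields[2:6])
        boxes.append((label, confidence, xmin, ymin, xmax, ymax))

# Keep only confident predictions, e.g. with confidence above 0.5.
confident = [b for b in boxes if b[1] > 0.5]
print(len(confident), "confident detections")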
- Download the code directory from here. Run the following command to extract the plot information into a semi-structured table:
python ocr_and_sie.py [PATH_TO_PNG_DIR] [PATH_TO_DETECTIONS] [OUTPUT_DIR]
This command will store the tables in .csv format in the [OUTPUT_DIR].
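To check the extracted tables, they can be opened like any other CSV file. A minimal sketch, assuming a hypothetical file name inside [OUTPUT_DIR] (the exact columns depend on the plot type):

import pandas as pd

# Hypothetical table produced by ocr_and_sie.py; substitute a real file from [OUTPUT_DIR].
table = pd.read_csv("output_tables/example_plot.csv")

print(table.shape)   # rows x columns of the semi-structured table
print(table.head())  # first few rows extracted from the plot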
Table QA stage
Install SEMPRE using the following links:
After making sure that it works for the WikiTables dataset, place this directory inside [PATH_TO_SEMPRE]/sempre/lib/data/.
Replace the [PATH_TO_SEMPRE]/run Ruby script with [PATH_TO_SEMPRE]/run_plotqa. The places where the training and testing files need to be modified are marked with the comment "PlotQA". Download the run_plotqa script from here.
You can download the saved weights from here and place them in the directory [PATH_TO_SEMPRE]/state/execs/.
Run this command for testing:
./run_plotqa @cldir=0 @mode=tables @data=test @feat=all @train=1 @memsize=high -Parser.beamSize 50 -maxExamples train:100 -Builder.inParamsPath state/execs/5.exec/params
If you wish to create your own .examples file, run the notebook QA_to_Lisp.ipynb after constructing the tables in .csv format, and replace the corresponding filename in the run_plotqa script. You can download the notebook from here.
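For reference, SEMPRE's tables mode reads examples as LispTree entries that pair an utterance with a table and a target value. A minimal sketch of how such lines could be generated in Python, assuming the WikiTableQuestions-style .examples convention and hypothetical file names (the QA_to_Lisp.ipynb notebook is the authoritative converter):

# Hypothetical question/answer pairs grounded in the generated tables.
qa_pairs = [
    # (example id, question, path to the table csv relative to lib/data, answer)
    ("plotqa-0", "What is the highest value in the plot?", "plotqa-tables/csv/0.csv", "42"),
]

with open("plotqa_test.examples", "w") as f:
    for ex_id, question, table_csv, answer in qa_pairs:
        f.write(
            '(example (id {}) (utterance "{}") '
            '(context (graph tables.TableKnowledgeGraph {})) '
            '(targetValue (list (description "{}"))))\n'.format(
                ex_id, question, table_csv, answer
            )
        )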
If you wish to train the model from scratch, use the training command as mentioned in this repository.