This repository contains the code for the EMNLP-IJCNLP 2019 paper: "Enhancing AMR-to-Text Generation with Dual Graph Representations".
This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.
This project is implemented using the OpenNMT-py framework and the PyTorch Geometric library. Please refer to their websites for further details on installation and dependencies.
- python 3
- PyTorch 1.1.0
- PyTorch Geometric 1.3.0
- nltk
- parsimonious
In our experiments, we use the following datasets: LDC2015E86 and LDC2017T10.
First, convert the dataset into the format required for the model.
For the LDC2015E86 dataset, run:
```
./preprocess_LDC2015E86.sh <dataset_folder> <glove_emb_file>
```
For the LDC2017T10 dataset, run:
```
./preprocess_LDC2017T10.sh <dataset_folder> <glove_emb_file>
```
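As a rough illustration of the kind of input the model consumes: the `<nodes_file>`, `<node1_file>`, and `<node2_file>` arguments of `decode.sh` below suggest a node-label file plus two line-aligned edge-endpoint files (as in PyTorch Geometric's `edge_index`), with relation labels represented as nodes of their own since no edge-label file is passed. The sketch below follows that assumption; the repository's actual preprocessing format may differ.

```python
# Hypothetical sketch: serialize a tiny AMR-like graph into three
# line-aligned files (node labels, edge sources, edge targets),
# mirroring the <nodes_file>/<node1_file>/<node2_file> arguments of
# decode.sh. Relation labels become nodes of their own
# (a Levi-graph-style transformation); this is an assumption, not
# the repository's documented format.
triples = [("describe-01", ":ARG0", "person"),
           ("describe-01", ":ARG1", "genius")]

nodes, edges, ids = [], [], {}

def add_node(label):
    nodes.append(label)
    return len(nodes) - 1

for src, rel, tgt in triples:
    for concept in (src, tgt):
        if concept not in ids:
            ids[concept] = add_node(concept)
    rel_id = add_node(rel)          # one fresh node per relation mention
    edges.append((ids[src], rel_id))
    edges.append((rel_id, ids[tgt]))

with open("amr-nodes.txt", "w") as f:
    f.write(" ".join(nodes) + "\n")
with open("amr-node1.txt", "w") as f:
    f.write(" ".join(str(s) for s, _ in edges) + "\n")
with open("amr-node2.txt", "w") as f:
    f.write(" ".join(str(t) for _, t in edges) + "\n")
```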
For training the model on the LDC2015E86 dataset, execute:
```
./train_LDC2015E86.sh <gpu_id> <gnn_type> <gnn_layers> <start_decay_steps> <decay_steps>
```
For the LDC2017T10 dataset, execute:
```
./train_LDC2017T10.sh <gpu_id> <gnn_type> <gnn_layers> <start_decay_steps> <decay_steps>
```
Options for `<gnn_type>` are `ggnn`, `gat`, or `gin`. `<gnn_layers>` is the number of graph layers. Refer to OpenNMT-py for `<start_decay_steps>` and `<decay_steps>`.
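The three `<gnn_type>` options name standard graph encoders available in PyTorch Geometric. The snippet below is a minimal sketch of how each option can be instantiated there; it is an illustration, not the repository's actual encoder code (which lives inside the OpenNMT-py model):

```python
import torch
from torch_geometric.nn import GatedGraphConv, GATConv, GINConv

def make_gnn(gnn_type, hidden_size, gnn_layers):
    """Minimal sketch mapping the <gnn_type> option to PyTorch
    Geometric layers (an assumption about how the options line up
    with the library, not code from this repository)."""
    if gnn_type == "ggnn":
        # A single GatedGraphConv already unrolls num_layers GRU steps.
        return torch.nn.ModuleList(
            [GatedGraphConv(hidden_size, num_layers=gnn_layers)])
    if gnn_type == "gat":
        return torch.nn.ModuleList(
            [GATConv(hidden_size, hidden_size)
             for _ in range(gnn_layers)])
    if gnn_type == "gin":
        def mlp():
            return torch.nn.Sequential(
                torch.nn.Linear(hidden_size, hidden_size),
                torch.nn.ReLU(),
                torch.nn.Linear(hidden_size, hidden_size))
        return torch.nn.ModuleList(
            [GINConv(mlp()) for _ in range(gnn_layers)])
    raise ValueError("unknown gnn_type: %s" % gnn_type)
```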
We lower the learning rate during training after a given number of training steps, as in Konstas et al. (2017).
Examples:
```
./train_LDC2015E86.sh 0 gin 2 6720 4200
./train_LDC2017T10.sh 0 ggnn 5 14640 10980
```
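As a sketch of what the last two arguments do: OpenNMT-py multiplies the learning rate by a constant factor (`learning_rate_decay`, 0.5 by default) once `start_decay_steps` is reached, and again every `decay_steps` steps thereafter. The function below illustrates that schedule; it is not code from this repository, and the decay factor is OpenNMT-py's default rather than a value fixed by these scripts.

```python
def decayed_lr(step, base_lr, decay_factor=0.5,
               start_decay_steps=6720, decay_steps=4200):
    # Illustration of OpenNMT-py's step-based decay: the learning
    # rate is multiplied by decay_factor at start_decay_steps and
    # then again every decay_steps training steps.
    if step < start_decay_steps:
        return base_lr
    n_decays = 1 + (step - start_decay_steps) // decay_steps
    return base_lr * decay_factor ** n_decays

# With the first example above, decay happens at step 6720,
# then at 10920, 15120, and so on.
```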
- GIN-DualGraph trained on LDC2015E86 training set - BLEU on test set: 24.60 (download)
- GAT-DualGraph trained on LDC2015E86 training set - BLEU on test set: 24.98 (download)
- GGNN-DualGraph trained on LDC2015E86 training set - BLEU on test set: 25.01 (download)
- GIN-DualGraph trained on LDC2017T10 training set - BLEU on test set: 28.05 (download)
- GAT-DualGraph trained on LDC2017T10 training set - BLEU on test set: 27.26 (download)
- GGNN-DualGraph trained on LDC2017T10 training set - BLEU on test set: 28.26 (download)
The output generated by the GGNN-DualGraph model trained on LDC2017T10 can be found here.
To decode the test set, run:
```
./decode.sh <gpu_id> <model> <nodes_file> <node1_file> <node2_file> <output>
```
Example:
```
./decode.sh 0 model_ggnn_ldc2015e86.pt test-amr-nodes.txt test-amr-node1.txt test-amr-node2.txt output-ggnn-test-ldc2015e86.txt
```
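Since `nltk` is among the requirements, one quick way to sanity-check a decoded output file against tokenized references is NLTK's corpus BLEU. The reference file name below is hypothetical, and reported scores are typically computed with a standard script such as `multi-bleu.perl`, so treat this only as an approximation:

```python
from nltk.translate.bleu_score import corpus_bleu

# Each line holds one tokenized sentence; "test-references.txt"
# is a hypothetical file name, not one produced by this repository.
with open("output-ggnn-test-ldc2015e86.txt") as f:
    hypotheses = [line.split() for line in f]
with open("test-references.txt") as f:
    references = [[line.split()] for line in f]  # one reference each

print("BLEU: %.2f" % (100 * corpus_bleu(references, hypotheses)))
```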
For more details regarding hyperparameters, please refer to OpenNMT-py and PyTorch Geometric.
Contact person: Leonardo Ribeiro, [email protected]
```
@inproceedings{ribeiro-etal-2019-dualgraph,
    title = "Enhancing {AMR}-to-Text Generation with Dual Graph Representations",
    author = "Ribeiro, Leonardo F. R. and
      Gardent, Claire and
      Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-1314",
    pages = "3174--3185",
}
```