AMR Parsing as Sequence-to-Graph Transduction

Code for the AMR Parser in our ACL 2019 paper "AMR Parsing as Sequence-to-Graph Transduction".

If you find our code is useful, please cite:

@inproceedings{zhang-etal-2018-stog,
    title = "{AMR Parsing as Sequence-to-Graph Transduction}",
    author = "Zhang, Sheng and
      Ma, Xutai and
      Duh, Kevin and
      Van Durme, Benjamin",
    booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics"
}

1. Environment Setup

The code has been tested on Python 3.6 and PyTorch 0.4.1. All other dependencies are listed in requirements.txt.

Via conda:

conda create -n stog python=3.6
source activate stog
pip install -r requirements.txt

2. Data Preparation

Download Artifacts:

./scripts/download_artifacts.sh

Assuming that you're working on AMR 2.0 (LDC2017T10), unzip the corpus to data/AMR/LDC2017T10, and make sure it has the following structure:

(stog)$ tree data/AMR/LDC2017T10 -L 2
data/AMR/LDC2017T10
├── data
│   ├── alignments
│   ├── amrs
│   └── frames
├── docs
│   ├── AMR-alignment-format.txt
│   ├── amr-guidelines-v1.2.pdf
│   ├── file.tbl
│   ├── frameset.dtd
│   ├── PropBank-unification-notes.txt
│   └── README.txt
└── index.html

Prepare training/dev/test data:

./scripts/prepare_data.sh -v 2 -p data/AMR/LDC2017T10

3. Feature Annotation

We use Stanford CoreNLP (version 3.9.2) for lemmatizing, POS tagging, etc.

First, start a CoreNLP server following the API documentation.

Then, annotate AMR sentences:

./scripts/annotate_features.sh data/AMR/amr_2.0

4. Data Preprocessing

./scripts/preprocess_2.0.sh

5. Training

Make sure that you have at least two GeForce GTX TITAN X GPUs to train the full model.

python -u -m stog.commands.train params/stog_amr_2.0.yaml

6. Prediction

python -u -m stog.commands.predict \
    --archive-file ckpt-amr-2.0 \
    --weights-file ckpt-amr-2.0/best.th \
    --input-file data/AMR/amr_2.0/test.txt.features.preproc \
    --batch-size 32 \
    --use-dataset-reader \
    --cuda-device 0 \
    --output-file test.pred.txt \
    --silent \
    --beam-size 5 \
    --predictor STOG

7. Data Postprocessing

./scripts/postprocess_2.0.sh test.pred.txt

8. Evaluation

Note that the evaluation tool works on python2, so please make sure python2 is visible in your $PATH.

./scripts/compute_smatch.sh test.pred.txt data/AMR/amr_2.0/test.txt

Pre-trained Models

Here are pre-trained models: ckpt-amr-2.0.tar.gz and ckpt-amr-1.0.tar.gz. To use them for prediction, simply download & unzip them, and then run Step 6-8.

In case that you only need the pre-trained model prediction (i.e., test.pred.txt), you can find it in the download.

Acknowledgements

We adopted some modules or code snippets from AllenNLP, OpenNMT-py and NeuroNLP2. Thanks to these open-source projects!

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
params		params
scripts		scripts
stog		stog
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AMR Parsing as Sequence-to-Graph Transduction

1. Environment Setup

2. Data Preparation

3. Feature Annotation

4. Data Preprocessing

5. Training

6. Prediction

7. Data Postprocessing

8. Evaluation

Pre-trained Models

Acknowledgements

License

About

Releases

Packages

Languages

License

sheng-z/stog

Folders and files

Latest commit

History

Repository files navigation

AMR Parsing as Sequence-to-Graph Transduction

1. Environment Setup

2. Data Preparation

3. Feature Annotation

4. Data Preprocessing

5. Training

6. Prediction

7. Data Postprocessing

8. Evaluation

Pre-trained Models

Acknowledgements

License

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages