The model requires:
- Python 3
- MXNet 1.3.0
- Sockeye 1.18.56
- CUDA
To run Sockeye on a GPU, make sure your version of Apache MXNet (Incubating) contains the GPU bindings. Depending on your CUDA version, you can install them by running:
> pip install -r requirements/requirements.gpu-cu${CUDA_VERSION}.txt
> pip install .
where ${CUDA_VERSION} can be 75 (7.5), 80 (8.0), 90 (9.0), 91 (9.1), or 92 (9.2).
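The mapping from a dotted CUDA version to the requirements file name can be sketched as below. This is a minimal illustration, not part of the repository: the helper name `gpu_requirements_file` and the `SUPPORTED_CUDA` table are hypothetical, and the supported versions are simply those listed above.

```python
# Hypothetical helper: maps a dotted CUDA version to the requirements file
# used in the pip command above. Supported versions are copied from this README.
SUPPORTED_CUDA = {"7.5": "75", "8.0": "80", "9.0": "90", "9.1": "91", "9.2": "92"}


def gpu_requirements_file(cuda_version: str) -> str:
    """Return the requirements file path for a CUDA version such as '9.2'."""
    key = cuda_version.strip()
    if key not in SUPPORTED_CUDA:
        supported = ", ".join(sorted(SUPPORTED_CUDA))
        raise ValueError(
            f"Unsupported CUDA version {key!r}; expected one of: {supported}"
        )
    # The suffix is the version with the dot removed, e.g. 9.2 -> 92.
    return f"requirements/requirements.gpu-cu{SUPPORTED_CUDA[key]}.txt"
```

For example, `gpu_requirements_file("9.2")` yields the path to pass to `pip install -r`, while a version outside the list raises a `ValueError` rather than pointing at a file that may not exist.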
The preprocessed ENT-DESC dataset is saved in ./sockeye/data. For more details regarding the data preparation step, please refer to ENT-DESC.
Before training, the raw dataset must be converted into multi-graphs. For details, please refer to the paper.
To train the DCGCN model, run:
./train.sh
Model checkpoints and logs will be saved to ./sockeye/model.
Once training has finished, use the trained model to decode the test set by running:
./decode.sh
This will use the last checkpoint by default. Use --checkpoints to specify a model checkpoint file.
For BLEU score evaluation, run:
python3 -m sockeye.evaluate -r sockeye/data/ENT-DESC\ dataset/test_surface.pp.txt -i sockeye/data/ENT-DESC\ dataset/test.snt.out
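Before running the evaluation, it can help to confirm that the decoder output and the reference file contain the same number of lines. The sketch below assumes that evaluation pairs hypotheses with references line by line; the function names are hypothetical and not part of Sockeye or this repository.

```python
from pathlib import Path
from typing import Iterable


def line_count(lines: Iterable[str]) -> int:
    """Count the items in an iterable of lines without loading them all."""
    return sum(1 for _ in lines)


def same_length(reference: Iterable[str], hypothesis: Iterable[str]) -> bool:
    """Return True if both iterables contain the same number of lines."""
    return line_count(reference) == line_count(hypothesis)


def files_aligned(ref_path: str, hyp_path: str) -> bool:
    """Check that two text files have the same number of lines."""
    with Path(ref_path).open(encoding="utf-8") as ref, \
            Path(hyp_path).open(encoding="utf-8") as hyp:
        return same_length(ref, hyp)
```

With the paths from the command above, this would be called as `files_aligned("sockeye/data/ENT-DESC dataset/test_surface.pp.txt", "sockeye/data/ENT-DESC dataset/test.snt.out")`. A `False` result suggests that decoding stopped early or produced extra lines.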
@inproceedings{cheng2020ent,
title={ENT-DESC: Entity Description Generation by Exploring Knowledge Graph},
author={Cheng, Liying and Wu, Dekun and Bing, Lidong and Zhang, Yan and Jie, Zhanming and Lu, Wei and Si, Luo},
booktitle={Proceedings of EMNLP},
year={2020}
}