This is an open-source implementation of our approach to creating synthetic treebanks for cross-lingual dependency parsing. It combines:
- machine translation
- annotation projection
- maximum spanning tree algorithm
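The way the last two steps fit together can be sketched in a few lines. This is a minimal illustration, not the code from this repository: it assumes that each source language contributes one projected tree over the target tokens, that an arc's weight is simply the number of sources voting for it, and that the tree is decoded with the Chu-Liu/Edmonds algorithm. The names `vote_scores`, `find_cycle` and `max_spanning_tree` are hypothetical, and the sketch does not enforce a single root.

```python
from collections import Counter


def vote_scores(projections, n_tokens):
    """Build scores[h][d] = number of projected trees containing arc h -> d."""
    votes = Counter()
    for tree in projections:              # each tree maps dependent -> head
        for dep, head in tree.items():
            votes[(head, dep)] += 1
    size = n_tokens + 1                   # node 0 is the artificial root
    return [[votes[(h, d)] for d in range(size)] for h in range(size)]


def find_cycle(heads):
    """Return one cycle in the head assignment as a list of nodes, or None."""
    state = [0] * len(heads)              # 0 unseen, 1 on current path, 2 done
    state[0] = 2
    for start in range(1, len(heads)):
        path, v = [], start
        while state[v] == 0:
            state[v] = 1
            path.append(v)
            v = heads[v]
        if state[v] == 1:                 # walked back onto the current path
            return path[path.index(v):]
        for u in path:
            state[u] = 2
    return None


def max_spanning_tree(scores):
    """Chu-Liu/Edmonds decoding. Returns heads, with heads[0] == -1."""
    n = len(scores)
    # Greedy step: every token takes its best-scoring head.
    heads = [-1] + [max((h for h in range(n) if h != d),
                        key=lambda h: scores[h][d]) for d in range(1, n)]
    cycle = find_cycle(heads)
    if cycle is None:
        return heads
    # Contract the cycle into one node c and solve the smaller problem.
    rest = [v for v in range(n) if v not in cycle]   # rest[0] is the root
    c = len(rest)
    new = [[float("-inf")] * (c + 1) for _ in range(c + 1)]
    enter, leave = {}, {}
    for i, u in enumerate(rest):
        for j, v in enumerate(rest):
            if i != j:
                new[i][j] = scores[u][v]
        # Best arc from u into the cycle, net of the cycle arc it replaces.
        best = max(cycle, key=lambda v: scores[u][v] - scores[heads[v]][v])
        new[i][c] = scores[u][best] - scores[heads[best]][best]
        enter[i] = best
        if u != 0:                        # best arc from the cycle out to u
            leave[i] = max(cycle, key=lambda h: scores[h][u])
            new[c][i] = scores[leave[i]][u]
    sub = max_spanning_tree(new)
    # Expand: keep the cycle arcs except the one broken by the entering arc.
    result = heads[:]
    for j, u in enumerate(rest):
        if u != 0:
            result[u] = leave[j] if sub[j] == c else rest[sub[j]]
    result[enter[sub[c]]] = rest[sub[c]]
    return result


# Three hypothetical projections of a three-token target sentence.
projections = [
    {1: 2, 2: 0, 3: 2},
    {1: 2, 2: 0, 3: 1},
    {1: 0, 2: 1, 3: 2},
]
print(max_spanning_tree(vote_scores(projections, 3)))  # [-1, 2, 0, 2]
```

In this sketch each token ends up with the head that most sources agree on, unless those choices form a cycle, in which case the decoder trades one cycle arc for the cheapest alternative that restores a tree.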
Francis Tyers, Mariya Sheyanova, Aleksandra Martynova, Pavel Stepachev and Konstantin Vinogorodskiy. Multi-source synthetic treebank creation for improved cross-lingual dependency parsing. In Proceedings of the Second Workshop on Universal Dependencies (UDW 2018), EMNLP 2018.
Part of the software created during our research is reused in a UD Annotatrix feature. The feature aims to make treebank annotation easier by annotating sentences automatically.
At the moment, the feature is available in our custom extension of UD Annotatrix.
In the future, it will be merged into the main repository.
Q: How do I use the UD Annotatrix feature?
A: Please read our guide.
Q: How do I deploy it?
A: Please read our deploy guide.
Q: Where can I find more information about the UD Annotatrix tool?
A: Please check the main repository.
If you use this software for academic research, please cite the paper:
@InProceedings{W18-6017,
author = "Tyers, Francis
and Sheyanova, Mariya
and Martynova, Aleksandra
and Stepachev, Pavel
and Vinogorodskiy, Konstantin",
title = "Multi-source synthetic treebank creation for improved cross-lingual dependency parsing",
booktitle = "Proceedings of the Second Workshop on Universal Dependencies (UDW 2018)",
year = "2018",
publisher = "Association for Computational Linguistics",
pages = "144--150",
location = "Brussels, Belgium",
url = "http://aclweb.org/anthology/W18-6017"
}
- Olga Lyashevskaya
- Francis Tyers
- Kostya Vinogorodskiy
- Sasha Martynova
- Pasha Stepachev
- Masha Sheyanova
The article was prepared within the framework of the Academic Fund Programme at the National Research University Higher School of Economics (HSE) in 2016-2018 (grant No. 17-05-0043) and by the Russian Academic Excellence Project «5-100».