Vrije Universiteit Amsterdam Computational Lexicology and Terminology Lab Department of Language and Communication Faculty of Humanities
To run the feature extraction notebooks in the CAMB, CAMB_A and Final_system folders, you will need to download Stanford CoreNLP here, navigate to the stanford-corenlp-4.5.4 folder, and start the CoreNLP server with the following command: java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
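Before opening the notebooks, it can help to confirm that the server started above is reachable. The following is a minimal sketch, not code taken from this repository: the helper names `build_corenlp_url` and `annotate` are hypothetical, and the host, annotator list, and client timeout are assumptions. It uses only the Python standard library and the CoreNLP server's HTTP interface, which accepts the text as the POST body and the settings as a JSON `properties` query parameter.

```python
import json
import urllib.parse
import urllib.request


def build_corenlp_url(host="http://localhost", port=9000,
                      annotators="tokenize,ssplit,pos"):
    """Build the request URL for a CoreNLP server (hypothetical helper).

    The port matches the -port 9000 flag used when starting the server;
    the host and the annotator list are assumptions for illustration.
    """
    props = {"annotators": annotators, "outputFormat": "json"}
    query = urllib.parse.urlencode({"properties": json.dumps(props)})
    return f"{host}:{port}/?{query}"


def annotate(text, url=None, timeout=15):
    """POST raw text to the server and return the parsed JSON response.

    `timeout` is the client-side timeout in seconds (an assumed value);
    the server-side limit is set separately by its -timeout flag.
    """
    url = url or build_corenlp_url()
    request = urllib.request.Request(url, data=text.encode("utf-8"))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, calling `annotate("The committee reached a unanimous verdict.")` should return a dictionary whose `"sentences"` entries each hold a list of `"tokens"`. A connection error at this point usually means the server is not running or is listening on a different port.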
The dataset used to train these models was collected by Yimam et al. (2018) and is available here.
This repository consists of a series of notebooks investigating feature-based approaches to complex word identification in English.
Available here (https://www.overleaf.com/read/wmvwtmpbkvqs)