EditDistanceEDA
EditDistanceEDA is an Entailment Decision Algorithm (EDA) based on the textual entailment system EDITS, developed and maintained at FBK.
Running EditDistanceEDA should not require additional installation or building steps apart from setting up the EOP. The remainder of this document describes the possible configurations for EditDistanceEDA.
We provide several configuration files located under /core/src/main/resources/configuration-file/. The structure and values in these configuration files are explained below.
Section | Property | Value | Requirement |
---|---|---|---|
PlatformConfiguration | activatedEDA | The common setting for selecting the EDA. The default value here is eu.excitementproject.eop.core.EditDistanceEDA. | N/A |
PlatformConfiguration | language | For the moment, EditDistanceEDA supports English (EN), German (DE), and Italian (IT). In principle, the EDA is language-independent. | N/A |
PlatformConfiguration | activatedLAP | The linguistic analysis pipeline needed to produce input for the EDA. | N/A |
eu.excitementproject.eop.core.<br /> EditDistanceEDA | modelFile | The location where the trained model is stored. The default location is under core/src/main/resources/model/. We use a convention that gives informative names to the models -- they include the name of the EDA used to produce them, the language, and additional information regarding the settings used. | For training, the model file should NOT exist. |
eu.excitementproject.eop.core.<br /> EditDistanceEDA | trainDir | The directory containing the training data, as produced by the LAP (in xmi format). | The directory should exist. |
eu.excitementproject.eop.core.<br /> EditDistanceEDA | testDir | The directory containing the test data, as produced by the LAP (in xmi format). | The directory should exist. |
eu.excitementproject.eop.core.<br /> EditDistanceEDA | components | The components used by the EditDistanceEDA for distance computations, separated by commas. The components may themselves require additional parameters, which are specified in a separate section for each component. Each such section is named after the component, exactly as it is given in the value of this property. | N/A |
eu.excitementproject.eop.core.<br /> component.distance.<br /> FixedWeightTokenEditDistance | instances | The token-based edit distance component using fixed weights. The value of instances names a subsection, which contains the parameters needed to use this component. | To use this component, the LAP should provide token and lemma annotations. Currently only TreeTagger provides these for all three languages, and TextPro provides them for Italian. |
basic / wordnet | stopWordRemoval | Either true or false. Tells the distance computation component whether to filter out stop words. | |
wordnet | path | The path to the particular WordNet resource used. The English WordNet is freely distributed and is included in the release. The Italian WordNet is also free but must be requested from FBK. Details are provided in the Doc for the Italian knowledge resources. GermaNet is proprietary. Details about the resource and how to obtain it are provided in the Doc for the German knowledge resources. | |
To adjust the distance computation for a specific language, the user should use the wordnet value for the instances property of the distance computation component, and give the path to the desired resource in the corresponding subsection of the configuration file, as described above.
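
To make the table concrete, here is a minimal sketch of how the sections, properties, and the wordnet subsection might fit together in one configuration file. The element names (configuration, section, property, subsection), the LAP class name, the model file name, and all directory paths are assumptions made for illustration. Check them against the configuration files shipped under /core/src/main/resources/configuration-file/ before relying on them.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only: element names and all paths are assumptions. -->
<configuration>

  <section name="PlatformConfiguration">
    <property name="activatedEDA">eu.excitementproject.eop.core.EditDistanceEDA</property>
    <property name="language">EN</property>
    <!-- Assumed LAP class name; use the LAP that matches your language. -->
    <property name="activatedLAP">eu.excitementproject.eop.lap.dkpro.TreeTaggerEN</property>
  </section>

  <section name="eu.excitementproject.eop.core.EditDistanceEDA">
    <!-- Hypothetical model name; for training, this file should not exist yet. -->
    <property name="modelFile">core/src/main/resources/model/EditDistanceEDA_EN_example</property>
    <!-- Hypothetical directories holding the xmi files produced by the LAP. -->
    <property name="trainDir">/tmp/EN/train/</property>
    <property name="testDir">/tmp/EN/test/</property>
    <property name="components">eu.excitementproject.eop.core.component.distance.FixedWeightTokenEditDistance</property>
  </section>

  <!-- Section named after the component listed in "components" above. -->
  <section name="eu.excitementproject.eop.core.component.distance.FixedWeightTokenEditDistance">
    <!-- The value of "instances" selects the subsection to use: basic or wordnet. -->
    <property name="instances">wordnet</property>
    <subsection name="wordnet">
      <property name="stopWordRemoval">true</property>
      <!-- Hypothetical path to the WordNet resource for the chosen language. -->
      <property name="path">/path/to/wordnet/dict/</property>
    </subsection>
  </section>

</configuration>
```

In this sketch, switching the instances value to basic and replacing the wordnet subsection with a basic subsection containing only stopWordRemoval would give the resource-free variant described in the table.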