vnastase edited this page Jun 21, 2013 · 23 revisions

EditDistanceEDA is an Entailment Decision Algorithm (EDA) based on the textual entailment system EDITS, developed and maintained at FBK.

Running EditDistanceEDA should not require additional installation or building steps apart from setting up the EOP. The remainder of this document describes the possible configurations for EditDistanceEDA.

Configuration File

We provide several configuration files located under /core/src/main/resources/configuration-file/. The structure and values in these configuration files are explained below.

Common settings

| Section | Property | Value | Requirement |
| --- | --- | --- | --- |
| PlatformConfiguration | activatedEDA | The common setting for selecting the EDA. The default value here is eu.excitementproject.eop.core.EditDistanceEDA. | N/A |
| PlatformConfiguration | language | For the moment, EditDistanceEDA supports English (EN), German (DE), and Italian (IT). In principle, the EDA is language-independent. | N/A |
| PlatformConfiguration | activatedLAP | The linguistic analysis pipeline needed to produce input for the EDA. | N/A |
| eu.excitementproject.eop.core.EditDistanceEDA | modelFile | The location where the trained model is stored. The default location is under core/src/main/resources/model/. We use a convention that gives informative names to the models: they include the name of the EDA used to produce them, the language, and additional information regarding the settings used. | For training, the model file should NOT exist. |
| eu.excitementproject.eop.core.EditDistanceEDA | trainDir | The directory containing the training data, as produced by the LAP (in xmi format). | The directory should exist. |
| eu.excitementproject.eop.core.EditDistanceEDA | testDir | The directory containing the test data, as produced by the LAP (in xmi format). | The directory should exist. |
| eu.excitementproject.eop.core.EditDistanceEDA | components | The components used by the EditDistanceEDA for distance computations, separated by commas. The components may themselves require additional parameters, which are specified in component-specific sections. Each such section is identified by the component name given as the value of this property. | N/A |
| eu.excitementproject.eop.core.component.distance.FixedWeightTokenEditDistance | instances | The token-based edit distance component using fixed weights. The value names a subsection (basic or wordnet) that contains the parameters needed to use this component. | To use this component, the LAP should provide token and lemma annotations (currently only TreeTagger provides this for all three languages, and TextPro for Italian). |
| basic / wordnet | stopWordRemoval | Can be true or false, and tells the distance computation component whether to filter stop words. | |
| wordnet | path | The path to the particular WordNet resource used. The English WordNet is freely distributed and is included in the release. The Italian WordNet is also free but must be obtained on request from FBK; details are provided in the Doc for the Italian knowledge resources. GermaNet is proprietary; details about the resource and how to obtain it are provided in the Doc for the German knowledge resources. | |
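
To make the settings above easier to picture, here is a minimal sketch of how they might fit together in one configuration file. It is an illustration under stated assumptions, not a copy of a shipped file: the element and attribute names (configuration, section, subsection, property, name), the model file name, the data directories, and the LAP placeholder are all assumed for the example. The files under /core/src/main/resources/configuration-file/ remain the authoritative reference.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only. Element and attribute names, file names,
     and directories are assumptions and should be checked against the
     configuration files provided with the platform. -->
<configuration>
  <section name="PlatformConfiguration">
    <property name="activatedEDA">eu.excitementproject.eop.core.EditDistanceEDA</property>
    <property name="language">EN</property>
    <!-- Placeholder: replace with the class name of the LAP you use -->
    <property name="activatedLAP">REPLACE_WITH_LAP_CLASS</property>
  </section>

  <section name="eu.excitementproject.eop.core.EditDistanceEDA">
    <!-- Hypothetical model name; for training, this file must not exist yet -->
    <property name="modelFile">core/src/main/resources/model/EditDistanceEDA_EN_example</property>
    <!-- Hypothetical directories with LAP output in xmi format; both must exist -->
    <property name="trainDir">/tmp/EN/train/</property>
    <property name="testDir">/tmp/EN/test/</property>
    <!-- Comma-separated list of distance components; one is used here -->
    <property name="components">eu.excitementproject.eop.core.component.distance.FixedWeightTokenEditDistance</property>
  </section>

  <!-- Component-specific section, named after the component listed above -->
  <section name="eu.excitementproject.eop.core.component.distance.FixedWeightTokenEditDistance">
    <!-- The value selects the subsection that holds the parameters -->
    <property name="instances">basic</property>
    <subsection name="basic">
      <property name="stopWordRemoval">true</property>
    </subsection>
  </section>
</configuration>
```

In this sketch the basic instance needs no external resource, so it is a reasonable starting point before switching to a WordNet-based setup.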

Specific language settings

To adjust the distance computation for a specific language, set the instances property of the distance computation component to wordnet, and give the path to the desired resource in the corresponding subsection of the configuration file, as described above.
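
As a hedged illustration of this language-specific setup, the component section might look roughly like the fragment below. The element and attribute names are assumed for the example, and the path is a placeholder that must point to the WordNet resource for the chosen language.

```xml
<!-- Illustrative fragment only. Element and attribute names are assumptions,
     and the path value is a placeholder. -->
<section name="eu.excitementproject.eop.core.component.distance.FixedWeightTokenEditDistance">
  <!-- Select the wordnet subsection instead of basic -->
  <property name="instances">wordnet</property>
  <subsection name="wordnet">
    <property name="stopWordRemoval">true</property>
    <!-- Placeholder: location of the WordNet resource for the target language -->
    <property name="path">/path/to/wordnet/resource</property>
  </subsection>
</section>
```

Only the instances value and the path change between languages in this sketch. The language property in the PlatformConfiguration section should match the resource that the path points to.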