EditDistanceEDA
EditDistanceEDA is an Entailment Decision Algorithm (EDA) based on the textual entailment system EDITS, developed and maintained at FBK.
Running EditDistanceEDA should not require additional installation or building steps apart from setting up the EOP. The remainder of this document describes the possible configurations for EditDistanceEDA.
We provide several configuration files located under /core/src/main/resources/configuration-file/. The structure and values in these configuration files are explained below.
Section | Property | Value | Requirement |
---|---|---|---|
PlatformConfiguration | activatedEDA | The common setting for selecting the EDA. The default value here is eu.excitementproject.eop.core.EditDistanceEDA. | N/A |
PlatformConfiguration | language | For the moment, EditDistanceEDA supports English (EN), German (DE), and Italian (IT). In principle, the EDA is language-independent. | N/A |
PlatformConfiguration | activatedLAP | The linguistic analysis pipeline needed to produce input for the EDA. | N/A |
eu.excitementproject.eop.core.<br /> EditDistanceEDA | modelFile | The location where the trained model is stored. The default location is under core/src/main/resources/model/. We use a convention that gives informative names to the models -- they include the name of the EDA used to produce them, the language, and additional information regarding the settings used. | For training, the model file should NOT exist. |
eu.excitementproject.eop.core.<br /> EditDistanceEDA | trainDir | The directory containing the training data, as produced by the LAP (in xmi format). | The directory should exist. |
eu.excitementproject.eop.core.<br /> EditDistanceEDA | testDir | The directory containing the test data, as produced by the LAP (in xmi format). | The directory should exist. |
eu.excitementproject.eop.core.<br /> EditDistanceEDA | components | The components used by the EditDistanceEDA for distance computations, separated by commas. The components may themselves require additional parameters, which are specified in component-specific sections. Each such section is identified by the component name given as the value of this XML tag. | N/A |
eu.excitementproject.eop.core.<br /> component.distance.<br /> FixedWeightTokenEditDistance | instances | The token-based edit distance component using fixed weights. The value names a subsection that contains the parameters needed to use this component. | To use this component, the LAP should provide token and lemma annotations (currently only TreeTagger provides these for all three languages, and TextPro for Italian). |
basic / wordnet | stopWordRemoval | Can be true or false; tells the distance computation component whether to filter out stop words. | |
wordnet | path | The path to the particular WordNet resource used. The English WordNet is freely distributed and is included in the release. The Italian WordNet is also free but must be requested from FBK; details are provided in the Doc for the Italian knowledge resources. GermaNet is proprietary; details about the resource and how to obtain it are provided in the Doc for the German knowledge resources. | |
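To make the role of the fixed-weight token edit distance and of `stopWordRemoval` more concrete, here is a minimal Python sketch of a weighted Levenshtein distance computed over tokens rather than characters. It is an illustration only, not the EDITS or EOP implementation: the weights, the stop-word list, and the function names are all assumptions, and it compares lower-cased surface tokens where the real component can also use the lemma annotations provided by the LAP.

```python
from typing import Iterable, List, Set

# Hypothetical fixed weights; the real component's weights may differ.
INSERT_COST = 1.0
DELETE_COST = 1.0
SUBSTITUTE_COST = 1.0

# Tiny illustrative stop-word list, not the one used by the EDA.
STOP_WORDS: Set[str] = {"a", "an", "the", "of", "is"}


def filter_tokens(tokens: Iterable[str], stop_word_removal: bool) -> List[str]:
    """Lower-case tokens and optionally drop stop words."""
    lowered = [t.lower() for t in tokens]
    if stop_word_removal:
        return [t for t in lowered if t not in STOP_WORDS]
    return lowered


def token_edit_distance(text: List[str], hypothesis: List[str],
                        stop_word_removal: bool = False) -> float:
    """Weighted Levenshtein distance over tokens (not characters)."""
    src = filter_tokens(text, stop_word_removal)
    tgt = filter_tokens(hypothesis, stop_word_removal)
    # prev[j] is the cost of turning the src prefix seen so far into tgt[:j].
    prev = [j * INSERT_COST for j in range(len(tgt) + 1)]
    for i, s in enumerate(src, start=1):
        curr = [i * DELETE_COST]
        for j, t in enumerate(tgt, start=1):
            sub = prev[j - 1] + (0.0 if s == t else SUBSTITUTE_COST)
            curr.append(min(prev[j] + DELETE_COST,      # delete s
                            curr[j - 1] + INSERT_COST,  # insert t
                            sub))                       # match or substitute
        prev = curr
    return prev[-1]
```

With these assumed weights, comparing the token lists for "The cat is on the mat" and "A cat sat on the mat" gives a distance of 2.0 without stop-word removal (two substitutions) and 1.0 with it (one insertion), which shows how the setting can change the score the EDA sees.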
Note that the English lexical resources, WordNet and VerbOcean, must be properly installed before running the corresponding configurations below.
Section | Property | Value | Requirement |
---|---|---|---|
BagOfLexesScoring | WordnetLexicalResource | Enables the use of WordNet. The value lists the relations to use, separated by commas. The default value is the set of entailment-related relations, i.e., HYPERNYM,SYNONYM,PART_HOLONYM. Further settings are given in a separate section. | N/A |
WordnetLexicalResource | wordNetFilesPath (deprecated) | The path to the location of WordNet. The default value is ./src/main/resources/ontologies/EnglishWordNet-dict/. | N/A |
WordnetLexicalResource | isCollapsed | Whether to query WordNet with all the selected relations together or separately. The default value is true. | N/A |
WordnetLexicalResource | useFirstSenseOnlyLeft | Whether to query WordNet with only the first sense on the left-hand side of the relation. The default value is false. | N/A |
WordnetLexicalResource | useFirstSenseOnlyRight | Whether to query WordNet with only the first sense on the right-hand side of the relation. The default value is false. | N/A |
BagOfLexesScoring | VerbOceanLexicalResource | Enables the use of VerbOcean. The value lists the relations to use, separated by commas. The default value is the set of entailment-related relations, i.e., StrongerThan,CanResultIn,Similar. Further settings are given in a separate section. | N/A |
VerbOceanLexicalResource | verbOceanFilePath (deprecated) | The path to the location of VerbOcean. The default value is ./src/main/resources/VerbOcean/verbocean.unrefined.2004-05-20.txt. | N/A |
VerbOceanLexicalResource | isCollapsed | Whether to query VerbOcean with all the selected relations together or separately. The default value is true. | N/A |
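The difference between querying the relations "together" and "separately" can be illustrated with a small Python sketch. This shows one plausible reading of `isCollapsed`, not the platform's actual code: the toy resource, its entries, and the `parse_relations` and `query` helpers are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Toy stand-in for a lexical resource: (left, right) pairs per relation.
# Both the entries and the lookup interface are hypothetical.
TOY_RESOURCE: Dict[str, Set[Tuple[str, str]]] = {
    "HYPERNYM": {("dog", "animal")},
    "SYNONYM": {("car", "automobile")},
    "PART_HOLONYM": {("wheel", "car")},
}


def parse_relations(value: str) -> List[str]:
    """Split a comma-separated property value into relation names."""
    return [r.strip() for r in value.split(",") if r.strip()]


def query(left: str, right: str, relations: List[str],
          is_collapsed: bool) -> Dict[str, bool]:
    """Return one combined answer if collapsed, else one answer per relation."""
    hits = {r: (left, right) in TOY_RESOURCE.get(r, set()) for r in relations}
    if is_collapsed:
        # All selected relations are treated as a single combined query.
        return {"+".join(relations): any(hits.values())}
    return hits
```

Under this reading, a collapsed query yields a single signal for the whole resource, while a non-collapsed query yields one signal per relation, which would give a downstream scorer finer-grained information at the cost of more features.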
Note that the German lexical resources, GermaNet, DistSim, and DerivBase, must be properly installed before running the corresponding configurations below.
Section | Property | Value | Requirement |
---|---|---|---|
BagOfLexesScoring | withPOS | Whether the bag-of-lexes scoring component will include POS in the queries to the lexical resources. The default value is false. | N/A |
BagOfLexesScoring | GermanDistSim | Enables the use of the German distributional similarity resource. Further settings are given in a separate section. | N/A |
BagOfLexesScoring | GermaNetWrapper | Enables the use of GermaNet. The value lists the relations to use, separated by commas. The default value is the set of entailment-related relations, i.e., Causes,Entails,Has_Hypernym,Has_Synonym. Further settings are given in a separate section. | GermaNet should be properly installed and the path should be correctly specified. |
BagOfLexesScoring | DerivBaseResource | Enables the use of the German derivational resource. Further settings are given in a separate section. | It is only triggered when withPOS is turned on. |
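Because DerivBaseResource is only triggered when withPOS is turned on, it can be useful to check a configuration for this kind of dependency before running it. The following Python sketch assumes the BagOfLexesScoring properties have already been loaded into a plain dictionary of strings; the `check_bag_of_lexes` helper and its warning messages are hypothetical and not part of the platform.

```python
from typing import Dict, List


def check_bag_of_lexes(section: Dict[str, str]) -> List[str]:
    """Return warnings for a dict of BagOfLexesScoring properties."""
    warnings: List[str] = []
    # withPOS defaults to false, as stated in the table above.
    with_pos = section.get("withPOS", "false").strip().lower() == "true"
    if "DerivBaseResource" in section and not with_pos:
        warnings.append("DerivBaseResource is configured but withPOS is false, "
                        "so it will not be triggered.")
    if "GermaNetWrapper" in section:
        relations = [r.strip() for r in section["GermaNetWrapper"].split(",")]
        if not all(relations):
            warnings.append("GermaNetWrapper has an empty relation name.")
    return warnings
```

An empty list means no problem was detected by these two checks; it does not verify that GermaNet itself is installed or that its path is correct.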