Text-Understanding

Predicate Argument Annotations for Natural Language text.

About

This software annotates the predicate-argument structures of natural-language text, using information from the text's syntactic analysis.

For example, given the sentence "I have a computer that can process natural languages.", the software extracts the following:

have
(SUBJECT) I
(OBJECT) computer

process
(SUBJECT) computer
(OBJECT) natural languages
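
To make the shape of this output concrete, here is a minimal, self-contained Java sketch of one way such structures could be represented and printed. The class names `PredicateArgumentSketch`, `PredicateStructure` and `Argument` are hypothetical and exist only for illustration; they are not the library's own types, and the values are entered by hand rather than produced by PASTA.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: hypothetical types, NOT part of the text-understanding API.
public class PredicateArgumentSketch {

    /** One argument of a predicate: a role label and the text it covers. */
    static final class Argument {
        final String role;
        final String text;

        Argument(String role, String text) {
            this.role = role;
            this.text = text;
        }
    }

    /** A predicate (here, a verb) together with its arguments. */
    static final class PredicateStructure {
        final String verb;
        final List<Argument> arguments;

        PredicateStructure(String verb, List<Argument> arguments) {
            this.verb = verb;
            this.arguments = Collections.unmodifiableList(arguments);
        }

        /** Renders the structure in the layout used in the example above. */
        String render() {
            StringBuilder sb = new StringBuilder(verb);
            for (Argument argument : arguments) {
                sb.append('\n').append('(').append(argument.role).append(") ").append(argument.text);
            }
            return sb.toString();
        }
    }

    /** Hand-built structures for "I have a computer that can process natural languages." */
    static List<PredicateStructure> exampleSentence() {
        return Arrays.asList(
            new PredicateStructure("have", Arrays.asList(
                new Argument("SUBJECT", "I"),
                new Argument("OBJECT", "computer"))),
            new PredicateStructure("process", Arrays.asList(
                new Argument("SUBJECT", "computer"),
                new Argument("OBJECT", "natural languages"))));
    }

    public static void main(String[] args) {
        // Print each predicate followed by a blank line, as in the listing above.
        for (PredicateStructure structure : exampleSentence()) {
            System.out.println(structure.render());
            System.out.println();
        }
    }
}
```

Note that "computer" appears twice: as the object of "have" and as the subject of "process". The sketch simply stores the text in both places; how PASTA itself represents a shared argument is not shown here.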

Predicate-argument structures can be thought of as a shallower approach than Semantic Role Labeling, yet more semantically expressive than syntactic parse-trees.

The annotation tool is named PASTA, which stands for Predicate Argument Structure Annotator.

Usage

Embedding

Assuming a Maven project, add the following dependency to the POM file:

<dependency>
    <groupId>com.github.asher-stern</groupId>
    <artifactId>text-understanding</artifactId>
    <version>1.0.2</version>
</dependency>

Demo

The project includes a demo program: com.as.text_understanding.pasta.DemoPasta. This is a command-line program which takes no arguments. It reads sentences interactively from the standard input, and prints their predicate-argument structures.

API

PASTA can be used in two ways: as a stand-alone tool, or as a UIMA annotator.

Stand-alone

As a stand-alone tool, PASTA requires a constituency parse-tree as input. The parse-tree should be given as a com.as.text_understanding.representation.tree.TreeNode and converted to a TreeTravelNode using the static method TreeTravelNode.createFromTree(). The resulting TreeTravelNode is the input argument of com.as.text_understanding.pasta.Pasta.

An easy way to obtain such a parse-tree is to use the UIMA DKPro library just for the syntactic analysis, and convert its output to TreeNode using TreeBuilderFromDkpro. This is easy since all the required libraries are included in the project's POM, and no models or parameters are needed (they are all provided seamlessly by DKPro). An example is provided in com.as.text_understanding.pasta.DemoPasta.

UIMA annotator

As a UIMA annotator, PASTA is provided by com.as.text_understanding.uima_annotators.pasta.FromDkproPastaAnnotator. It assumes the document is already annotated with certain DKPro annotations, as specified in FromDkproPastaAnnotator's TypeCapability annotation. To obtain these annotations, the precondition annotators specified in DemoPasta.PRECONDITION_ANNOTATORS suffice. (Note that these precondition annotators can be replaced by other DKPro annotators, for example Stanford-Parser.)

As for the results, please note that in most cases the span of a Predicate annotation also covers some of its arguments. The actual predicate can be retrieved from the verb field of Predicate.

An example program which runs PASTA as a UIMA annotator, from within UIMA's Document-Analyzer, is provided at com.as.text_understanding.uima_annotators.pasta.DemoPastaAnnotator.
