amirhmoin edited this page Jun 25, 2013 · 13 revisions

The Revision Control System (Source Code Repository)

The EOP source code is hosted on GitHub and managed with the Git revision control system. To download the source code, clone the repository with a Git client using the command below:

git clone https://github.com/hltfbk/Excitement-Open-Platform.git

The master branch contains the latest development code.
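Because master keeps moving, an existing clone needs to be refreshed from time to time. The following is a minimal sketch, assuming a standard Git command-line client and the default remote name `origin` created by `git clone`; the helper name `eop_update` is hypothetical and not part of EOP.

```shell
# Hypothetical helper (not part of EOP): bring an existing clone up to date
# with the latest development code on the master branch.
eop_update() {
    # Directory created by `git clone`; defaults to the repository name.
    repo_dir="${1:-Excitement-Open-Platform}"

    # Switch the working copy to master; stop if that is not possible
    # (for example, because of uncommitted local changes that conflict).
    git -C "$repo_dir" checkout master || return 1

    # Fetch from the remote and fast-forward only. If local commits have
    # diverged from the remote, this fails instead of creating a merge commit.
    git -C "$repo_dir" pull --ff-only origin master
}
```

Usage, from the directory that contains the clone: `eop_update Excitement-Open-Platform`. The `--ff-only` option is a cautious choice here: it never rewrites or merges local work silently.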

To commit to the repository, request write access by contacting the EOP developers via this mailing list: [email protected]. Then follow the policy described below.

How to Import the Code to the Eclipse IDE and Use it for Development, Building and Running

@TODO: by WP4

Code Administration Policy

This section describes the code administration policy of the EXCITEMENT open platform: what is accepted as a contribution, where contributions should be placed, and whom to contact about adding or improving code.

Layout of EOP project sub-modules

The Excitement open platform consists of several sub-projects (Maven sub-modules). The EOP 1.0 sub-modules, for example, are:

  • EOP
    • Common : The Common module contains the type systems and data structures needed by all sub-modules. It also includes all interface definitions (EOP component and EDA APIs). If a data type (or a type definition, such as a UIMA type) has to be shared by all EOP sub-modules, it must be defined in Common.
    • Core : The Core module contains EXCITEMENT components (such as distance components, scoring components, lexical resources and syntactic resources). This is the natural place for a reusable component implementation that supports one of the defined EXCITEMENT component interfaces.
    • Lap : The LAP (linguistic analysis pipeline) module contains the linguistic annotation pipelines and related code. This is the natural place for pipeline code that supports the LAPAccess interface.
    • Gui : This module holds demo code and GUI-related code.
    • transformation : This module contains code for transforming parse trees using knowledge resources. At the moment it is used only by BIUTEE, but its transformation tools are generic enough to be reused by other code in the future.
    • biutee : This module holds the code of the EDA "BIUTEE". (Each EDA gets its own sub-module if it needs more than 5 non-component classes.)

You can expect additional Maven sub-modules as the project grows. In any case, you will always see the "core sub-modules": Core, LAP and Common (CLC).
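Since the sub-projects are Maven sub-modules, a single module can in principle be built on its own from the repository root. The sketch below assumes that the root `pom.xml` lists the sub-modules and that a module directory is named `core`; both are assumptions, so check the actual directory names in your clone. The helper name `eop_build_module` is hypothetical and not part of EOP.

```shell
# Hypothetical helper (not part of EOP): build one Maven sub-module together
# with the modules it depends on. Run it from the root of the cloned repository.
eop_build_module() {
    # Module directory to build, e.g. "core" (the directory name is an assumption).
    module="$1"

    # -pl selects the module to build; -am ("also make") first builds the
    # modules it depends on, such as a shared common module.
    mvn -pl "$module" -am clean install
}
```

Usage: `eop_build_module core`. Running `mvn clean install` without `-pl` builds every sub-module listed in the root project.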

What kind of code can be part of EXCITEMENT open platform?

The Excitement Open Platform has a clear policy on which code may be included in the EOP project space. The EOP project includes code that implements the EXCITEMENT open specification, that is, code used directly or indirectly by components that implement the specification. This means that generic NLP tools that are not used by, or related to, any component should not be included in the platform.

In practice, consult one of the Administrators about adding a module or component to EOP.

CLC (Core, LAP and Common): the core sub-modules

CLC is the heart of EOP. Classes in CLC can be used by any code within the Excitement open platform.

  • Common: interfaces and common data types
  • Core: Excitement components
  • LAP: LAP (linguistic analysis pipeline) code

For code to be included in CLC, it has to be general. Generality comes in two kinds: naturally general and practically general. "Naturally general" means that the platform needs the code and will not work without it (e.g. operating system utilities, common data structures). "Practically general" means that the code is actually used by some components (components + EDAs) and can be used by other components, too. For example, the policy does not permit code that "might be" useful in the future to be included in CLC, because such code is not practically general.

This might sound complicated :-) In practice, it means that you should consult one of the Administrators about the appropriateness of your "new component" before you start developing something for the CLC modules. Ultimately, the administrators have the final say on whether something is included in CLC.

Administrators

Administrators are the people who have the right to merge code contributions back into the EOP upstream. Currently, there are five EOP administrators. Here is the list, with each administrator's main role for the first release:

  • Asher Stern (Bar Ilan University, BIUTEE EDA and related codes)
  • Roberto Zanoli (FBK, EDITS EDA and EOP development environment (Mvn, GitHub, etc))
  • Rui Wang (DFKI, TIE EDA and German TE)
  • Tae-Gil Noh (Heidelberg University, German resources, LAP and UIMA type system)
  • Amir H. Moin (DFKI, distribution release management)

If you are a developer at one of these institutions, the administrator from your institution is your administrator. If you are an external developer (not belonging to any of the four institutions above), please contact the administrator whose topics best match your contribution.

Note that administrators are responsible for the following two issues.

  • Appropriateness of the contribution: for example, whether the new code is appropriately placed in CLC.
  • Quality of the contribution: whether the code follows the coding standard and component policy of EXCITEMENT, as defined in the specification.

Developers are encouraged to contact the administrator before starting the actual development of a new code module. That way, the appropriateness of the contribution can be checked against its intention and goal before any development takes place.

Improving existing code: Whom should I contact?

  • If you are improving something in CLC: coordinate this with one of the administrators. If the issue is not simple, the administrator will discuss it with the other administrators to reach a concrete decision.
  • If you are improving something that is not in CLC but in a module provided by the EOP code base: you have to discuss this with the module owner. Each module has one or more owners, normally noted in the JavaDoc. Module owners can decline your proposal. If the module owner is unclear, simply contact one of the administrators.

In both cases, it is a sound idea to contact an administrator as soon as possible and state your intention and goal. This makes it much easier for your contribution to be included in EOP.

Issue of License

  • Note that the Excitement Open Platform has a "dual license" policy. Its main license is GPL v3, but all EOP contributions (code stored in the EXCITEMENT GitHub repository) can also be released under the LGPL. This dual license policy supports industrial use cases where EOP has to be used together with proprietary code modules.
  • Each and every developer must accept this dual license policy. See [AcceptingLicensePolicy] for information on how to express your acceptance. This must be done before any new contribution is merged back into EOP.
  • If your code uses third-party code, please make sure that it does not break license compatibility. (For example, linking a proprietary library with the main release (GPL v3) can be done on your own computer, but such code cannot be redistributed in the EOP main upstream.) See [Platform license] for more detail.

Supported Entailment Decision Algorithms (EDAs):

Lexical Resources:

How to add a new Linguistic Analysis Pipeline (LAP)?

HowToAddANewLAP outlines the basic knowledge needed for this. It first introduces the UIMA CAS, the data representation adopted for preprocessing (linguistic analysis) results. It then describes how to add a new LAP module to the platform by introducing the LAPAccess interface and the LAP_ImplBase class. @TODO: By Tae-Gil Noh, (ETA)

How to add a new Entailment Decision Algorithm (EDA)?

@TODO: How to add new algorithms to the existing ones