Code Structure
EOP is developed as a Maven multi-module project. All modules share the standard Maven directory layout, which makes it easy to find files in a project once you are used to Maven.
A multi-module project is a particular type of Maven project composed of several other projects, known as modules. When you run a command on the parent project, Maven executes it on all of its child projects. Maven also works out the correct execution order from the dependencies between modules and detects circular dependencies. The main modules in EOP are:
-
eop
- pom.xml is the XML file that contains information about the project and the configuration details Maven uses to build the EOP project.
- common module contains the type systems and data structures needed by all submodules. It also includes all interface definitions (EOP component and EDA APIs). If a data type (or a type definition, such as a UIMA type) has to be shared by all EOP submodules, it must be defined in common.
- core module contains EXCITEMENT components (such as distance components, scoring components, lexical resources and syntactic resources). This is the natural place for a reusable component implementation that supports one of the defined EXCITEMENT component interfaces.
- lap (linguistic analysis pipeline) module contains the linguistic annotation pipelines and related code. This is the natural place for LAP pipeline code that supports the LAPAccess interface.
- util module holds common utility classes, such as Runner, which runs an EDA together with the data preprocessing it needs, and EDAScorer, which evaluates the accuracy of EDAs.
- transformation module contains code for transforming parse trees using knowledge resources. At the moment it is only used by BIUTEE, but its transformation tools are generic enough to be reused by other code in the future.
- biutee module holds the code of the EDA "BIUTEE". (Each EDA gets its own submodule if it needs more than 5 non-component classes.)
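As a rough illustration of how a module list like the one above is usually declared, here is a minimal sketch of a parent (aggregator) pom.xml. The module names come from the list above; the groupId and version are placeholders chosen for the example and are not taken from the EOP repository.

```xml
<!-- Illustrative sketch only: groupId and version are hypothetical placeholders. -->
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>org.example.eop</groupId>
  <artifactId>eop</artifactId>
  <version>0.0.1-SNAPSHOT</version>

  <!-- "pom" packaging marks this project as a parent/aggregator that builds no JAR itself -->
  <packaging>pom</packaging>

  <!-- Each entry is a sub-directory containing its own pom.xml -->
  <modules>
    <module>common</module>
    <module>core</module>
    <module>lap</module>
    <module>util</module>
    <module>transformation</module>
    <module>biutee</module>
  </modules>
</project>
```

With a layout like this, running a command such as `mvn install` in the parent directory builds every listed module. The order of the `<module>` entries does not have to match the build order, because Maven sorts the modules by their declared dependencies.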
You can expect additional Maven submodules as the project grows. In any case, you will always see the "core submodules": core, lap, and common (CLC). CLC is the heart of EOP, and classes in CLC can be used by any code within EOP.
- Common: interfaces and common data types
- Core: Excitement components
- LAP: linguistic analysis pipeline (LAP) code
For code to be included in CLC, it has to be general. Generality comes in two kinds: naturally general and practically general. "Naturally general" means that the platform needs the code and will not work without it (e.g. operating system utilities, common data structures). "Practically general" means that the code is actually used by some components (components + EDAs) and can be used by other components, too. For example, the policy does not permit code that "might be" useful in the future to be included in CLC, because such code is not practically general.
By default, we keep each EDA as an independent module that depends on CLC. When the number of classes for a specific EDA is too small (fewer than 5), we can put it in core.
This might sound complicated. In practice, it means that you should consult one of the administrators about the appropriateness of your "new component" before you start developing something for the CLC modules. Essentially, the administrators have the final say on whether something is included in CLC.
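To make the default arrangement concrete, where an EDA lives in its own module and depends on CLC, the following is a minimal sketch of what such a module's pom.xml could look like. It is an assumption-laden example rather than the actual biutee build file: the parent coordinates are placeholders, and the exact set of dependencies of any real EDA module may differ.

```xml
<!-- Illustrative sketch only: parent groupId and version are hypothetical placeholders. -->
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- Inherit groupId, version and shared settings from the parent project -->
  <parent>
    <groupId>org.example.eop</groupId>
    <artifactId>eop</artifactId>
    <version>0.0.1-SNAPSHOT</version>
  </parent>

  <artifactId>biutee</artifactId>

  <!-- The EDA module depends on the CLC modules; CLC modules never depend on it -->
  <dependencies>
    <dependency>
      <groupId>${project.groupId}</groupId>
      <artifactId>common</artifactId>
      <version>${project.version}</version>
    </dependency>
    <dependency>
      <groupId>${project.groupId}</groupId>
      <artifactId>core</artifactId>
      <version>${project.version}</version>
    </dependency>
    <dependency>
      <groupId>${project.groupId}</groupId>
      <artifactId>lap</artifactId>
      <version>${project.version}</version>
    </dependency>
  </dependencies>
</project>
```

Keeping the dependency direction one-way (EDA module to CLC) is what lets Maven build CLC first and avoids the circular dependencies it would otherwise report.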
One of the most important advantages of using Maven is that there is a standard location for all of the files of a project. Maven projects therefore have a similar directory layout, which makes it easy to find files in a project once you are used to it. Another advantage is that the tools integrated with Maven know that structure: for example, the Java compiler knows that the source files are in src/main/java and that the resulting files have to be put into target/classes. Each of the Maven modules in EOP (e.g. lap, core) is a single Maven project and shares the same Maven directory structure with the others:
-
src/ directory contains all of the code and resource files that you will use for developing the project.
-
src/main/ directory contains all of the code and resources needed for building the artifact.
- src/main/java/ directory contains all of the bundle source code for building the artifact.
- src/main/resources/ directory contains the resources needed for building the artifact. Configuration files, data files, or Java properties to include in the bundle should be placed under this directory. These files will be copied into the root of the JAR file that is generated by the Maven build process.
-
src/test/ directory contains all of the code and resources for running unit tests against the compiled artifact.
- src/test/java/ directory contains the unit test source code.
- src/test/resources/ directory contains the resources used only during the testing phase. These files will not be copied into the generated JAR file.
-
target/ directory contains the result of the build (e.g. JAR files).
Each module, like the parent project, also contains a Project Object Model, or POM (i.e. pom.xml), which is the fundamental unit of work in Maven. It is an XML file that contains information about the project and the configuration details used by Maven to build the project.