Commit
Showing 9 changed files with 326 additions and 3 deletions.
@@ -0,0 +1,17 @@
# Required
version: 2

# Build documentation in the docs/ directory with Sphinx
sphinx:
  builder: html
  configuration: docs/source/conf.py
  fail_on_warning: false

formats:

# Optionally set the version of Python and requirements required to build your docs
python:
  version: 3.8
  install:
    - requirements: docs/requirements.txt
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
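
For readers less familiar with Make, the catch-all rule above forwards any target name (``html``, ``clean``, ...) to Sphinx's make mode. The following is a minimal Python sketch of the command line that rule assembles. The helper name ``sphinx_make_mode_command`` is hypothetical and not part of this commit; its defaults simply mirror the Makefile variables.

```python
import shlex


def sphinx_make_mode_command(target, sphinxbuild="sphinx-build",
                             sourcedir="source", builddir="build",
                             sphinxopts="", extra=""):
    """Return the argv list corresponding to `make <target>`.

    Mirrors: $(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
    (hypothetical helper, for illustration only).
    """
    argv = [sphinxbuild, "-M", target, sourcedir, builddir]
    # SPHINXOPTS and O are unquoted in the Makefile, so the shell splits
    # them on whitespace; shlex.split approximates that here.
    argv += shlex.split(sphinxopts) + shlex.split(extra)
    return argv


print(sphinx_make_mode_command("html"))
```

Passing the returned list to ``subprocess.run`` would start the build; that step is left out so the sketch runs without Sphinx installed.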
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
	set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

if "%1" == "" goto help

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
	echo.
	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
	echo.installed, then set the SPHINXBUILD environment variable to point
	echo.to the full path of the 'sphinx-build' executable. Alternatively you
	echo.may add the Sphinx directory to PATH.
	echo.
	echo.If you don't have Sphinx installed, grab it from
	echo.https://www.sphinx-doc.org/
	exit /b 1
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
@@ -0,0 +1 @@
sphinx>=4.3.2
@@ -0,0 +1,52 @@
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))


# -- Project information -----------------------------------------------------

project = 'DocEE'
copyright = '2022, Tong Zhu'
author = 'Tong Zhu'


# -- General configuration ---------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []


# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'alabaster'

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
@@ -0,0 +1,70 @@
.. toctree::
   :maxdepth: 2
   :caption: Contents:


Evaluation
==========

:Authors:
    Tong Zhu

:Last Update: Jan. 4th, 2021


You may be wondering what the terms mean in
``Exps/<task_name>/Output/dee_eval.(dev|test).(pred|gold)_span.<model_name>.<epoch>.json``.
Here is the explanation.

Doc Type
########

Document types are defined by combining the number of event types with the number of event instances per type.

o2o
    There is only one event type with one instance.

o2m
    There is only one event type with multiple instances.

m2m
    There are multiple event types.
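
The three document types above depend only on the event types of the instances in a document. A minimal sketch of that rule follows, assuming each document is summarised as a list with one event type name per instance. The function name ``doc_type`` and the ``Sell`` event type are hypothetical and are not taken from the DocEE code base.

```python
def doc_type(event_types):
    """Classify a document as 'o2o', 'o2m' or 'm2m'.

    `event_types` holds one entry per event instance, e.g. ["Buy", "Buy"].
    Hypothetical helper, for illustration only.
    """
    if not event_types:
        raise ValueError("a document needs at least one event instance")
    if len(set(event_types)) > 1:
        return "m2m"  # multiple event types
    # Exactly one event type: decide by the number of instances.
    return "o2o" if len(event_types) == 1 else "o2m"


print(doc_type(["Buy"]))          # o2o
print(doc_type(["Buy", "Buy"]))   # o2m
print(doc_type(["Buy", "Sell"]))  # m2m
```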

Metrics
#######

classification
    The event type classification measurements.

entity
    The Named Entity Recognition (NER) part of the measurements.

overall
    The final metric with role-level evaluation as introduced in Doc2EDAG [#Doc2EDAG]_.

instance
    The instance-level measurements.
    An instance is counted as a True Positive (TP) iff all of its argument roles have been filled with the correct arguments.

trigger
    For PTPCG, ``trigger`` means the evaluation of pseudo triggers.

adj_mat
    For PTPCG, ``adj_mat`` means the evaluation of the adjacency matrix for each document.

connection
    For PTPCG, ``connection`` means the evaluation of connections between pseudo triggers and ordinary arguments.

rawCombination
    In PTPCG, ``rawCombination`` holds the combination evaluation results
    after the Bron-Kerbosch (BK) extraction, without further instance generation and argument filtering.

combination
    ``combination`` holds the combination evaluation results
    after the final instance generation process.
    Some arguments in ``rawCombination`` may be filtered out.
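
The ``instance`` criterion above is strict: a single wrong or missing argument turns the whole instance into a miss. The sketch below illustrates this under stated assumptions. The dictionary layout, the helper names ``is_instance_tp`` and ``prf1``, and the comparison against a single gold instance are all hypothetical simplifications. How DocEE pairs predicted instances with gold instances is not described here, so this is not the toolkit's implementation.

```python
def is_instance_tp(pred, gold):
    """True iff the event type matches and every gold role is filled correctly."""
    return pred["event_type"] == gold["event_type"] and all(
        pred["arguments"].get(role) == arg
        for role, arg in gold["arguments"].items()
    )


def prf1(tp, num_pred, num_gold):
    """Standard precision, recall and F1 from raw counts."""
    precision = tp / num_pred if num_pred else 0.0
    recall = tp / num_gold if num_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# None stands for an absent (N/A) argument.
gold = {"event_type": "Buy",
        "arguments": {"Buyer": "Tom", "Object": "2 pounds of flour", "Cashier": None}}
exact = {"event_type": "Buy",
         "arguments": {"Buyer": "Tom", "Object": "2 pounds of flour", "Cashier": None}}
partial = {"event_type": "Buy",
           "arguments": {"Buyer": "Tom", "Object": "flour", "Cashier": None}}

preds = [exact, partial]
tp = sum(is_instance_tp(p, gold) for p in preds)
print(tp)  # 1: only the exact prediction counts
print(prf1(tp, num_pred=len(preds), num_gold=1))
```

Role-level evaluation, as used for ``overall``, would instead give ``partial`` credit for the roles it fills correctly.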

References
##########

.. [#Doc2EDAG] Shun Zheng, Wei Cao, Wei Xu, and Jiang Bian. 2019. Doc2EDAG: An end-to-end document-level framework for Chinese financial event extraction. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 337–346.
@@ -0,0 +1,47 @@
.. DocEE documentation master file, created by
   sphinx-quickstart on Tue Jan 4 10:02:33 2022.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to DocEE's documentation!
=================================

.. image:: https://github.com/Spico197/DocEE/workflows/DocEE/badge.svg?branch=main
    :target: https://github.com/Spico197/DocEE/actions/workflows/build.yml
    :alt: Building Status
.. image:: https://codecov.io/gh/Spico197/DocEE/branch/main/graph/badge.svg?token=4BQQN039YZ
    :target: https://codecov.io/gh/Spico197/DocEE
    :alt: Code Coverage Status
.. image:: https://readthedocs.org/projects/doc-ee/badge/?version=latest
    :target: https://doc-ee.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status

Hi there 👋. Thanks for visiting `this repo <https://github.com/Spico197/DocEE>`_.

This project aims to build a universal toolkit for extracting events
automatically from documents 📄 (long texts).

🔥 We have an online demo: http://hlt.suda.edu.cn/docee (available 9:00-17:00 UTC+8).

Currently, this repo contains the ``PTPCG``, ``Doc2EDAG`` and ``GIT`` models,
all of which are designed for document-level event extraction without triggers.
The documentation is under construction.
More models are planned.

Issues, PRs and documentation contributions are warmly welcome and help make this project better.


.. toctree::
   :maxdepth: 2
   :caption: Contents:

   terminology
   evaluation


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
@@ -0,0 +1,78 @@
.. toctree::
   :maxdepth: 2
   :caption: Contents:


Terminology
===========


:Authors:
    Tong Zhu

:Last Update: Jan. 4th, 2021


Contents
########

Event Instance
    An event instance is a single event in a table format.
    The table includes the event type and several argument roles together with their corresponding arguments.
    An example is shown below:

    Tom *bought* 2 pounds of flour at Pinshihui for $5 per pound last night.

    +----------+-------------------+
    | Event Type: Buy              |
    +==========+===================+
    | Buyer    | Tom               |
    +----------+-------------------+
    | Object   | 2 pounds of flour |
    +----------+-------------------+
    | Price    | $5 per pound      |
    +----------+-------------------+
    | Time     | last night        |
    +----------+-------------------+
    | Location | Pinshihui         |
    +----------+-------------------+
    | Cashier  | N/A               |
    +----------+-------------------+

Trigger
    Following the ACE05 annotation guidelines [#ace05]_, an event trigger is
    the word that most clearly expresses an event's occurrence.
    For instance, the trigger word of the example above is *bought*.

Argument Role
    Argument roles are the types of event participants.
    For instance, *Buyer*, *Object*, *Price*, *Time*, *Location* and *Cashier* are argument roles.
    These roles are pre-defined together with event types.
    Each event type corresponds to a specific event template table.

Argument
    Arguments are the participants that fill the corresponding roles.
    Arguments can be absent if the context does not provide the information.
    For example, we don't know who the cashier was when Tom bought flour last night,
    so here the argument of the *Cashier* role is N/A.

Combination
    An argument combination is a ``set`` of arguments, so the arguments have no internal order.
    For example, the combination of the above example is ``{last night, Tom, $5 per pound, 2 pounds of flour, Pinshihui}``.
    N/A is not included in combinations.

Entity & Mention
    Entities are basic elements of objects.
    For example, ``Tom`` is a ``PERSON`` entity.
    One entity may have multiple mentions, and a mention could be
    an occurrence in the raw text, or a pronoun referring to the same entity.

Span
    A span indicates a position as ``[sentence idx, start char idx, end char idx + 1]``.
    For instance, ``[0, 1, 3]`` refers to ``bought 2`` if we apply space tokenisation, so that the indices count tokens rather than characters.
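
The terms above map naturally onto plain Python data structures. The sketch below reuses the running example. The dictionary layout and the helper name ``span_text`` are assumptions made for illustration and are not the data format used by DocEE. ``None`` stands for N/A, and space tokenisation is used as in the span example.

```python
sentence = "Tom bought 2 pounds of flour at Pinshihui for $5 per pound last night."

# Event instance: an event type plus a role -> argument mapping.
instance = {
    "event_type": "Buy",
    "arguments": {
        "Buyer": "Tom",
        "Object": "2 pounds of flour",
        "Price": "$5 per pound",
        "Time": "last night",
        "Location": "Pinshihui",
        "Cashier": None,  # N/A: the text does not mention a cashier
    },
}

# Combination: the unordered set of arguments, with N/A left out.
combination = frozenset(
    arg for arg in instance["arguments"].values() if arg is not None
)


def span_text(sentences, span):
    """Resolve [sentence idx, start idx, end idx + 1] under space tokenisation."""
    sent_idx, start, end = span
    tokens = sentences[sent_idx].split(" ")
    return " ".join(tokens[start:end])


print(sorted(combination))
print(span_text([sentence], [0, 1, 3]))  # bought 2
```

Because the end index is exclusive, a span converts directly into a Python slice.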


References
##########

.. [#ace05] https://www.ldc.upenn.edu/sites/www.ldc.upenn.edu/files/english-events-guidelines-v5.4.3.pdf