update docs
Spico197 committed Jan 4, 2022
1 parent 4b9898a commit ee31650
Showing 9 changed files with 326 additions and 3 deletions.
17 changes: 17 additions & 0 deletions .readthedocs.yml
@@ -0,0 +1,17 @@
# Required
version: 2

# Build documentation in the docs/ directory with Sphinx
sphinx:
  builder: html
  configuration: docs/source/conf.py
  fail_on_warning: false

formats:
  - pdf

# Optionally set the version of Python and requirements required to build your docs
python:
  version: 3.8
  install:
    - requirements: docs/requirements.txt
9 changes: 6 additions & 3 deletions README.md
@@ -1,8 +1,8 @@
# ❤️ A Toolkit for Document-level Event Extraction with & without Triggers

![Build](https://github.com/Spico197/DocEE/workflows/DocEE/badge.svg?branch=main)
![Coverage](https://img.shields.io/codecov/c/github/Spico197/DocEE)

[![Build](https://github.com/Spico197/DocEE/workflows/DocEE/badge.svg?branch=main)](https://github.com/Spico197/DocEE/actions/workflows/build.yml)
[![codecov](https://codecov.io/gh/Spico197/DocEE/branch/main/graph/badge.svg?token=4BQQN039YZ)](https://codecov.io/gh/Spico197/DocEE)
[![Documentation Status](https://readthedocs.org/projects/doc-ee/badge/?version=latest)](https://doc-ee.readthedocs.io/en/latest/?badge=latest)

<!-- [⚙️Installation](#️installation) | [🚀Quick Start](#quick-start) | [💾Data Preprocessing](#data-preprocessing) | [📋Reproduction](#reproduction)| [⚽Find Pseudo Triggers](#find-pseudo-triggers) | [📚Instructions](#instructions) | [🙋FAQ](#faq) | [📜Citation](#citation) | [🔑Licence](#licence) | [🤘Furthermore](#furthermore) -->

@@ -198,6 +198,9 @@ $ python trigger.py <max number of pseudo triggers>
- A: Such an inference interface is provided in `dee/tasks/dee_task.py/DEETask.predict_one()` (**Convenient online serving interface**).
- Q: What are `o2o`, `o2m` and `m2m`?
- A: They are abbreviations for `one-type one-instance per doc`, `one-type with multiple instances per doc` and `multiple types per doc`.
- Q: I see lots of terms in `Exps/<task_name>/Output/dee_eval.(dev|test).(pred|gold)_span.<model_name>.<epoch>.json`. What do they mean?
- A: Please refer to the `Evaluation` section of the documentation, or to [#7](https://github.com/Spico197/DocEE/issues/7).


## 📜Citation

20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
35 changes: 35 additions & 0 deletions docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
	set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

if "%1" == "" goto help

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
	echo.
	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
	echo.installed, then set the SPHINXBUILD environment variable to point
	echo.to the full path of the 'sphinx-build' executable. Alternatively you
	echo.may add the Sphinx directory to PATH.
	echo.
	echo.If you don't have Sphinx installed, grab it from
	echo.https://www.sphinx-doc.org/
	exit /b 1
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
1 change: 1 addition & 0 deletions docs/requirements.txt
@@ -0,0 +1 @@
sphinx>=4.3.2
52 changes: 52 additions & 0 deletions docs/source/conf.py
@@ -0,0 +1,52 @@
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))


# -- Project information -----------------------------------------------------

project = 'DocEE'
copyright = '2022, Tong Zhu'
author = 'Tong Zhu'


# -- General configuration ---------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []


# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'alabaster'

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
70 changes: 70 additions & 0 deletions docs/source/evaluation.rst
@@ -0,0 +1,70 @@
.. toctree::
   :maxdepth: 2
   :caption: Contents:


Evaluation
==========

:Authors:
    Tong Zhu

:Last Update: Jan. 4th, 2022


You may be wondering what the terms in
``Exps/<task_name>/Output/dee_eval.(dev|test).(pred|gold)_span.<model_name>.<epoch>.json`` mean.
They are explained in the sections that follow.
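
The file-name template above can be split into its fields with a regular expression. The following is only a minimal sketch: ``parse_eval_filename`` and ``EVAL_FILE_RE`` are hypothetical helpers that are not part of the toolkit, and the sketch assumes that ``<epoch>`` is a non-negative integer. The task name, model name and epoch in the usage line are made-up values.

```python
import re
from pathlib import PurePosixPath

# Hypothetical pattern mirroring the file-name template quoted above.
# Assumption: <epoch> is a non-negative integer; <model_name> may be any text.
EVAL_FILE_RE = re.compile(
    r"^dee_eval\.(?P<split>dev|test)\.(?P<span_type>pred|gold)_span"
    r"\.(?P<model_name>.+)\.(?P<epoch>\d+)\.json$"
)


def parse_eval_filename(path):
    """Split an evaluation file name into the fields of the template."""
    match = EVAL_FILE_RE.match(PurePosixPath(path).name)
    if match is None:
        raise ValueError(f"not an evaluation output file: {path}")
    fields = match.groupdict()
    fields["epoch"] = int(fields["epoch"])
    return fields


# "my_task", "PTPCG" and epoch 12 are illustrative values only.
print(parse_eval_filename("Exps/my_task/Output/dee_eval.test.pred_span.PTPCG.12.json"))
```

Here ``split`` says whether the file belongs to the dev or test set, and ``span_type`` says whether predicted or gold spans were used.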

Doc Type
########

Document types are determined by the number of event types and the number of event instances per type.

o2o
    There is only one event type with one instance.

o2m
    There is only one event type with multiple instances.

m2m
    There are multiple event types.
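
The three document types can be derived from the event types of a document's instances. This is a minimal sketch, not the toolkit's own code: ``doc_type`` is a hypothetical helper, and the event type names in the usage lines are invented for illustration.

```python
from collections import Counter


def doc_type(event_types):
    """Return 'o2o', 'o2m' or 'm2m' for one document.

    `event_types` holds one entry per event instance, e.g. ["Buy", "Buy"].
    """
    counts = Counter(event_types)
    if not counts:
        raise ValueError("the document has no event instance")
    if len(counts) > 1:
        # More than one distinct event type.
        return "m2m"
    # Exactly one event type: distinguish by its number of instances.
    (n_instances,) = counts.values()
    return "o2o" if n_instances == 1 else "o2m"


print(doc_type(["Buy"]))                 # one type, one instance
print(doc_type(["Buy", "Buy"]))          # one type, multiple instances
print(doc_type(["Buy", "Sell", "Buy"]))  # multiple types
```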

Metrics
#######

classification
    The event type classification measurements.

entity
    The Named Entity Recognition (NER) part of the measurements.

overall
    The final metric with role-level evaluation as introduced in Doc2EDAG [#Doc2EDAG]_.

instance
    The instance-level measurements.
    An instance is counted as a True Positive (TP) iff all of its argument roles are filled with the correct arguments.

trigger
    For PTPCG, ``trigger`` means the evaluation of pseudo triggers.

adj_mat
    For PTPCG, ``adj_mat`` means the evaluation of the adjacency matrix of each document.

connection
    For PTPCG, ``connection`` means the evaluation of connections between pseudo triggers and ordinary arguments.

rawCombination
    In PTPCG, ``rawCombination`` is the combination evaluation result
    after the Bron-Kerbosch (BK) extraction, without further instance generation and argument filtering.

combination
    ``combination`` is the combination evaluation result
    after the final instance generation process.
    Some arguments in ``rawCombination`` may be filtered out.
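
The ``instance`` criterion can be illustrated with a small sketch. It assumes that an instance is a pair of an event type and a role-to-argument mapping, and that each gold instance may be matched at most once. ``instance_level_prf`` is a hypothetical helper; the toolkit's actual matching procedure may differ.

```python
def instance_level_prf(pred_instances, gold_instances):
    """Compute instance-level precision, recall and F1 for one document.

    Each instance is an (event_type, {role: argument_or_None}) pair.
    A prediction is a true positive only if its event type and the
    arguments of all roles equal those of a not-yet-matched gold instance.
    """
    unmatched_gold = list(gold_instances)
    tp = 0
    for pred in pred_instances:
        if pred in unmatched_gold:
            unmatched_gold.remove(pred)
            tp += 1
    fp = len(pred_instances) - tp
    fn = len(unmatched_gold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Illustrative data: the second prediction has one wrong argument.
gold = [
    ("Buy", {"Buyer": "Tom", "Price": "$5 per pound"}),
    ("Buy", {"Buyer": "Ann", "Price": None}),
]
pred = [
    ("Buy", {"Buyer": "Tom", "Price": "$5 per pound"}),
    ("Buy", {"Buyer": "Ann", "Price": "$5 per pound"}),
]
print(instance_level_prf(pred, gold))
```

A single wrong argument makes the whole instance a miss, which is why instance-level scores are stricter than role-level ones.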

References
##########

.. [#Doc2EDAG] Shun Zheng, Wei Cao, Wei Xu, and Jiang Bian. 2019. Doc2EDAG: An end-to-end document-level framework for Chinese financial event extraction. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 337–346.
47 changes: 47 additions & 0 deletions docs/source/index.rst
@@ -0,0 +1,47 @@
.. DocEE documentation master file, created by
   sphinx-quickstart on Tue Jan 4 10:02:33 2022.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to DocEE's documentation!
=================================

.. image:: https://github.com/Spico197/DocEE/workflows/DocEE/badge.svg?branch=main
    :target: https://github.com/Spico197/DocEE/actions/workflows/build.yml
    :alt: Building Status
.. image:: https://codecov.io/gh/Spico197/DocEE/branch/main/graph/badge.svg?token=4BQQN039YZ
    :target: https://codecov.io/gh/Spico197/DocEE
    :alt: Code Coverage Status
.. image:: https://readthedocs.org/projects/doc-ee/badge/?version=latest
    :target: https://doc-ee.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status

Hi there 👋. Thanks for visiting `this repo <https://github.com/Spico197/DocEE>`_.

This project aims at building a universal toolkit for extracting events
automatically from documents 📄 (long texts).

🔥 We have an online demo: http://hlt.suda.edu.cn/docee (available 9:00-17:00, UTC+8).

Currently, this repo contains the ``PTPCG``, ``Doc2EDAG`` and ``GIT`` models,
all of which are designed for document-level event extraction without triggers.
The documentation is under construction.
More models are planned to be added.

Issues, PRs and documentation contributions are warmly welcome and help make this project better.


.. toctree::
   :maxdepth: 2
   :caption: Contents:

   terminology
   evaluation


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
78 changes: 78 additions & 0 deletions docs/source/terminology.rst
@@ -0,0 +1,78 @@
.. toctree::
   :maxdepth: 2
   :caption: Contents:


Terminology
===========


:Authors:
    Tong Zhu

:Last Update: Jan. 4th, 2022


Contents
########

Event Instance
    An event instance is a single event in a table format.
    The table includes the event type and several argument roles together with their corresponding arguments.
    An example is shown below:

    Tom *bought* 2 pounds of flour at Pinshihui for $5 per pound last night.

    +----------+-------------------+
    | Event Type: Buy              |
    +==========+===================+
    | Buyer    | Tom               |
    +----------+-------------------+
    | Object   | 2 pounds of flour |
    +----------+-------------------+
    | Price    | $5 per pound      |
    +----------+-------------------+
    | Time     | last night        |
    +----------+-------------------+
    | Location | Pinshihui         |
    +----------+-------------------+
    | Cashier  | N/A               |
    +----------+-------------------+

Trigger
    Following the annotation guidelines of ACE05 [#ace05]_, an event trigger is
    the word that most clearly expresses the event's occurrence.
    For instance, the trigger word of the example above is *bought*.

Argument Role
    Argument roles are the types of event participants.
    For instance, *Buyer*, *Object*, *Price*, *Time*, *Location* and *Cashier* are argument roles.
    These roles are pre-defined together with event types.
    Each event type corresponds to a specific event template table.

Argument
    Arguments are the participants that fill the corresponding roles.
    Arguments can be absent if the context does not provide the information.
    For example, we don't know who the cashier was when Tom bought flour last night,
    so the argument of the *Cashier* role is N/A.

Combination
    An argument combination is a ``set`` of arguments, so the arguments have no internal order.
    For example, the combination of the above example is ``{last night, Tom, $5 per pound, 2 pounds of flour, Pinshihui}``.
    N/A is not included in combinations.

Entity & Mention
    Entities are basic elements of objects.
    For example, ``Tom`` is a ``PERSON`` entity.
    One entity may have multiple mentions, and a mention can be
    an occurrence in the raw text or a pronoun referring to the same entity.

Span
    A span gives a position as ``[sentence idx, start char idx, end char idx + 1]``, so the end index is exclusive.
    For instance, ``[0, 1, 3]`` refers to ``bought 2`` if we apply space tokenisation, in which case the indices count tokens rather than characters.
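
The terms above can be tied together in a short sketch built on the flour example. The dictionary layout and the ``span_text`` helper are assumptions made for illustration and are not the toolkit's data format; ``None`` stands in for N/A, and space tokenisation is used as in the span example.

```python
sentence = "Tom bought 2 pounds of flour at Pinshihui for $5 per pound last night."

# An event instance: the event type plus one argument (or None) per role.
event_instance = {
    "event_type": "Buy",
    "arguments": {
        "Buyer": "Tom",
        "Object": "2 pounds of flour",
        "Price": "$5 per pound",
        "Time": "last night",
        "Location": "Pinshihui",
        "Cashier": None,  # N/A: the text does not mention a cashier
    },
}

# A combination: the unordered set of arguments, with N/A left out.
combination = {
    arg for arg in event_instance["arguments"].values() if arg is not None
}


def span_text(sentences, span):
    """Return the text covered by [sentence idx, start idx, end idx + 1].

    Indices count space-separated tokens here; the end index is exclusive.
    """
    sent_idx, start, end = span
    tokens = sentences[sent_idx].split(" ")
    return " ".join(tokens[start:end])


print(sorted(combination))
print(span_text([sentence], [0, 1, 3]))
```

Storing the combination as a set makes two combinations compare equal regardless of the order in which their arguments were found.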


References
##########

.. [#ace05] https://www.ldc.upenn.edu/sites/www.ldc.upenn.edu/files/english-events-guidelines-v5.4.3.pdf
