
Matcher

Test token patterns for spaCy's rule-based Matcher against a text. The service returns the matches found in the text, along with the individual tokens and whether each one is part of a match. For a usage example, see the Rule-based Matcher Explorer demo.

Installation

pip install -r requirements.txt
python app.py

API

GET `/models`

Get the available models as an object that maps each model name to its human-readable name.

Example response

{
    "en_core_web_sm": "English - en_core_web_sm (v2.0.0)"
}

POST `/match`

Match a pattern and return the matches and tokens.

Example request

{
    "text": "A match is a tool for starting a fire. Typically, modern matches are made of small wooden sticks or stiff paper. ",
    "model": "en_core_web_sm",
    "pattern": [
        {
            "POS": "ADJ",
            "OP": "?"
        },
        {
            "LEMMA": "match",
            "POS": "NOUN"
        },
        {
            "LEMMA": "be"
        }
    ]
}

Name	Type	Description
`text`	string	The text to match on.
`model`	string	The statistical model used to process the text, for example `en_core_web_sm`.
`pattern`	list	The token pattern to match. Each object in the list describes one token and is keyed by token attributes.

Example response

{
    "matches": [
        {
            "start": 2,
            "end": 10,
            "label": "MATCH"
        },
        {
            "start": 50,
            "end": 68,
            "label": "MATCH"
        }
    ],
    "tokens": [
        {
            "start": 0,
            "end": 1,
            "label": "TOKEN"
        },
        {
            "start": 2,
            "end": 7,
            "label": "MATCH"
        },
        {
            "start": 8,
            "end": 10,
            "label": "MATCH"
        },
        {
            "start": 11,
            "end": 12,
            "label": "TOKEN"
        },
        {
            "start": 13,
            "end": 17,
            "label": "TOKEN"
        },
        {
            "start": 18,
            "end": 21,
            "label": "TOKEN"
        },
        {
            "start": 22,
            "end": 30,
            "label": "TOKEN"
        },
        {
            "start": 31,
            "end": 32,
            "label": "TOKEN"
        },
        {
            "start": 33,
            "end": 37,
            "label": "TOKEN"
        },
        {
            "start": 37,
            "end": 38,
            "label": "TOKEN"
        },
        {
            "start": 39,
            "end": 48,
            "label": "TOKEN"
        },
        {
            "start": 48,
            "end": 49,
            "label": "TOKEN"
        },
        {
            "start": 50,
            "end": 56,
            "label": "MATCH"
        },
        {
            "start": 57,
            "end": 64,
            "label": "MATCH"
        },
        {
            "start": 65,
            "end": 68,
            "label": "MATCH"
        },
        {
            "start": 69,
            "end": 73,
            "label": "TOKEN"
        },
        {
            "start": 74,
            "end": 76,
            "label": "TOKEN"
        },
        {
            "start": 77,
            "end": 82,
            "label": "TOKEN"
        },
        {
            "start": 83,
            "end": 89,
            "label": "TOKEN"
        },
        {
            "start": 90,
            "end": 96,
            "label": "TOKEN"
        },
        {
            "start": 97,
            "end": 99,
            "label": "TOKEN"
        },
        {
            "start": 100,
            "end": 105,
            "label": "TOKEN"
        },
        {
            "start": 106,
            "end": 111,
            "label": "TOKEN"
        },
        {
            "start": 111,
            "end": 112,
            "label": "TOKEN"
        }
    ]
}

Name	Type	Description
`matches`	list	The matches in the text.
`tokens`	list	The individual tokens in the text and whether they're part of a match.
`start`	number	Character offset the match or token starts on.
`end`	number	Character offset the match or token ends at (exclusive).
`label`	string	`"MATCH"` for a match or a token inside a match, `"TOKEN"` for a token outside any match.
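
Since `start` and `end` are character offsets into the submitted text, the matched strings can be recovered by slicing. The following is a minimal sketch that uses the text and the `matches` offsets from the example request and response above; the helper name `spans_to_text` is hypothetical and not part of this service.

```python
# Text and offsets copied from the example request and response above.
TEXT = (
    "A match is a tool for starting a fire. Typically, modern matches "
    "are made of small wooden sticks or stiff paper. "
)

MATCHES = [
    {"start": 2, "end": 10, "label": "MATCH"},
    {"start": 50, "end": 68, "label": "MATCH"},
]


def spans_to_text(text, spans):
    """Return the substring covered by each span (hypothetical helper).

    Each span is a dict with "start" and "end" character offsets,
    where "end" is exclusive, as in the example response.
    """
    return [text[span["start"]:span["end"]] for span in spans]


print(spans_to_text(TEXT, MATCHES))  # ['match is', 'modern matches are']
```

The same helper works on the `tokens` list, because tokens use the same `start` and `end` fields.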

Usage Example (JavaScript)

function getMatches(text, model, pattern) {
    const options = {
        method: 'POST',
        headers: { 'Accept': 'application/json', 'Content-Type': 'application/json' },
        credentials: 'same-origin',
        body: JSON.stringify({ text, model, pattern })
    };
    // Return the promise so callers can chain on the parsed result.
    return fetch('/match', options)
        .then(res => res.json())
        .then(({ tokens, matches }) => {
            console.log(tokens, matches);
            return { tokens, matches };
        });
}
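
For callers outside the browser, the same request can be sketched in Python with only the standard library. This is an illustration, not part of the project: the `BASE_URL` value and the function names `build_match_request` and `get_matches` are assumptions, so adjust the host and port to wherever `app.py` is actually served.

```python
import json
import urllib.request

# Assumption: the app is reachable at this address; change host and port as needed.
BASE_URL = "http://localhost:8080"


def build_match_request(text, model, pattern, base_url=BASE_URL):
    """Build a POST request for /match with a JSON body (hypothetical helper)."""
    body = json.dumps({"text": text, "model": model, "pattern": pattern})
    return urllib.request.Request(
        base_url + "/match",
        data=body.encode("utf-8"),
        headers={"Accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def get_matches(text, model, pattern, base_url=BASE_URL):
    """Send the request and return (tokens, matches) from the JSON response."""
    request = build_match_request(text, model, pattern, base_url)
    with urllib.request.urlopen(request) as response:
        result = json.loads(response.read().decode("utf-8"))
    return result["tokens"], result["matches"]
```

With the server running, `get_matches(text, "en_core_web_sm", pattern)` would return the `tokens` and `matches` lists described above. Building the request in a separate function keeps the payload easy to inspect without a network call.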
