config fastner

Jianlin Shi edited this page Aug 23, 2017 · 1 revision

Token based rules

Each rule consists of a list of elements that specify the tokens to be matched. Token-based rules support four wildcards:

\w+: represents any token, including a word or a punctuation mark
\> followed by a number X: represents any number greater than X
\< followed by a number X: represents any number smaller than X
\d+: represents any number

Examples:

\w+ cough: this rule matches "dry cough", "productive cough", etc.
temp \> 38 \< 40: this rule matches "temp 38.1", "temp 39.5", etc.
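
To make the wildcard semantics concrete, here is a minimal Python sketch of a token matcher. It is not the FastNER implementation: the function names (compile_rule, make_range_check, find_matches) are hypothetical, tokens are simply split on whitespace, literal elements are compared case-sensitively, and consecutive \> and \< elements are assumed to constrain the same numeric token, as the "temp" example suggests.

```python
import re

# Assumption: a "number" is an optionally signed integer or decimal.
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def make_range_check(bounds):
    """Build a predicate that checks a numeric token against all bounds."""
    def check(token):
        if NUMBER.fullmatch(token) is None:
            return False
        value = float(token)
        return all(
            value > limit if op == r"\>" else value < limit
            for op, limit in bounds
        )
    return check


def compile_rule(rule):
    """Turn a rule string into a list of per-token predicates."""
    parts = rule.split()
    predicates = []
    i = 0
    while i < len(parts):
        part = parts[i]
        if part in (r"\>", r"\<"):
            # Assumption: consecutive comparisons apply to one token.
            bounds = []
            while i < len(parts) and parts[i] in (r"\>", r"\<"):
                bounds.append((parts[i], float(parts[i + 1])))
                i += 2
            predicates.append(make_range_check(bounds))
            continue
        if part == r"\w+":
            predicates.append(lambda token: True)
        elif part == r"\d+":
            predicates.append(
                lambda token: NUMBER.fullmatch(token) is not None
            )
        else:
            # Literal element: exact, case-sensitive comparison.
            predicates.append(lambda token, literal=part: token == literal)
        i += 1
    return predicates


def find_matches(rule, text):
    """Return every run of tokens in text that satisfies the rule."""
    predicates = compile_rule(rule)
    tokens = text.split()
    n = len(predicates)
    return [
        " ".join(tokens[i:i + n])
        for i in range(len(tokens) - n + 1)
        if all(p(t) for p, t in zip(predicates, tokens[i:i + n]))
    ]


print(find_matches(r"\w+ cough", "patient has dry cough today"))
print(find_matches(r"temp \> 38 \< 40", "temp 38.1 then temp 40.2"))
```

With these assumptions, the first call reports "dry cough" and the second reports only "temp 38.1", because 40.2 fails the \< 40 bound.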

Additionally, \( and \) can be used to capture a subset of a rule. For example:

dry \( cough \): this rule matches the word "cough" only when it occurs in the phrase "dry cough"
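
The effect of \( and \) can be sketched in Python as follows. This is an illustration only, not FastNER code: match_with_group is a hypothetical helper, it handles literal tokens only (no wildcards), and it splits both the rule and the text on whitespace.

```python
def match_with_group(rule, text):
    """Return the captured sub-span for each place the whole rule matches."""
    elements = []        # rule tokens with the \( and \) markers removed
    start = end = None   # element indices that delimit the captured subset
    for part in rule.split():
        if part == r"\(":
            start = len(elements)
        elif part == r"\)":
            end = len(elements)
        else:
            elements.append(part)
    # Without markers, the whole rule is returned.
    if start is None:
        start = 0
    if end is None:
        end = len(elements)

    tokens = text.split()
    n = len(elements)
    hits = []
    for i in range(len(tokens) - n + 1):
        # The full rule must match; only the marked subset is reported.
        if tokens[i:i + n] == elements:
            hits.append(" ".join(tokens[i + start:i + end]))
    return hits


print(match_with_group(r"dry \( cough \)", "a dry cough and a wet cough"))
```

Here the surrounding word "dry" acts as required context: "wet cough" produces no hit, and the hit for "dry cough" contains only "cough".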

Character based rules

FastCNER supports character-based rules, which support the following wildcards:

( Beginning of a capturing group
) End of a capturing group

A backslash (\) followed by one of the following characters:

p A punctuation
+ An addition symbol (to distinguish it from the "+" that follows a wildcard)
( A left parenthesis
) A right parenthesis
d A digit
C A capital letter
c A lowercase letter
s A whitespace
a A non-whitespace character
u An uncommon character: not a letter, a number, a punctuation mark, or a whitespace
n A return (line break)
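
One way to get a feel for these wildcards is to translate a character-based rule into an ordinary Python regular expression. The sketch below is an approximation under stated assumptions, not how FastCNER works internally: to_regex and CLASSES are hypothetical names, the classes are limited to ASCII, \+ is read as a literal plus sign, a bare "+" after a wildcard is treated as a "one or more" quantifier, and any other character is matched literally.

```python
import re
import string

# ASCII punctuation, escaped so it is safe inside a character class.
PUNCT = re.escape(string.punctuation)

# Assumed regex equivalent for each backslash wildcard.
CLASSES = {
    "p": "[" + PUNCT + "]",               # punctuation
    "+": r"\+",                           # literal plus sign (assumption)
    "(": r"\(",                           # literal left parenthesis
    ")": r"\)",                           # literal right parenthesis
    "d": "[0-9]",                         # digit
    "C": "[A-Z]",                         # capital letter
    "c": "[a-z]",                         # lowercase letter
    "s": r"\s",                           # whitespace
    "a": r"\S",                           # non-whitespace character
    "u": r"[^A-Za-z0-9\s" + PUNCT + "]",  # uncommon character
    "n": r"\n",                           # return
}


def to_regex(rule):
    """Translate a character-based rule into a Python regex string."""
    out = []
    i = 0
    while i < len(rule):
        ch = rule[i]
        if ch == "\\" and i + 1 < len(rule):
            key = rule[i + 1]
            if key not in CLASSES:
                raise ValueError(f"unknown wildcard: \\{key}")
            out.append(CLASSES[key])
            i += 2
        elif ch in "()+":
            # Bare parentheses capture; a bare "+" is a quantifier.
            out.append(ch)
            i += 1
        else:
            out.append(re.escape(ch))
            i += 1
    return "".join(out)


dose = re.search(to_regex(r"(\d+)\smg"), "take 500 mg daily")
print(dose.group(1))
print(re.findall(to_regex(r"\C\c+"), "Dr Smith saw Anna"))
```

In this sketch the first rule captures the digits before "mg", and the second finds capitalized words. Because the classes are ASCII-only, accented letters would count as "uncommon" characters, which may differ from the real tool.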