We release 46 finetuned models on the Transformers model hub. All models are XLM-R, finetuned on the named entity recognition task with T-NER. Please see our paper for the evaluation results, including out-of-domain accuracy (paper link).
Model names are organized as `asahi417/tner-xlm-roberta-{model_type}-{dataset}`, where `model_type` is either `base` or `large` and `dataset` corresponds to the alias of the dataset. In addition to the individual models, we train on a merged English dataset, built by concatenating all the English NER datasets and denoted as `all-english`. We also release models finetuned on lowercased datasets, named `asahi417/tner-xlm-roberta-{model_type}-uncased-{dataset}`.
For example:

- `asahi417/tner-xlm-roberta-large-ontonotes5`: XLM-R large model finetuned on the Ontonotes5 dataset
- `asahi417/tner-xlm-roberta-base-uncased-conll2003`: XLM-R base model finetuned on the lowercased CoNLL2003 dataset
- `asahi417/tner-xlm-roberta-large-all-english`: XLM-R large model finetuned on all English datasets
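The naming scheme above can be sketched as a small helper; `tner_model_name` is a hypothetical function written for illustration, not part of the T-NER package:

```python
def tner_model_name(model_type: str, dataset: str, uncased: bool = False) -> str:
    """Build a model hub name from the components described above."""
    assert model_type in ("base", "large")
    parts = ["asahi417/tner-xlm-roberta", model_type]
    if uncased:
        parts.append("uncased")
    parts.append(dataset)  # dataset alias, e.g. "ontonotes5" or "all-english"
    return "-".join(parts)

print(tner_model_name("large", "ontonotes5"))
# asahi417/tner-xlm-roberta-large-ontonotes5
```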
The list of all public models can be checked here.
The training parameters used in T-NER to finetune each model are stored at `https://huggingface.co/{model-name}/blob/main/parameter.json`. E.g., the training parameters of `asahi417/tner-xlm-roberta-large-all-english` are here.
To use a model with T-NER:

```python
import tner

# Replace "model-name" with one of the model names above,
# e.g. "asahi417/tner-xlm-roberta-large-ontonotes5".
model = tner.TransformersNER("model-name")
```
Or load a model directly with the `transformers` library:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("model-name")
model = AutoModelForTokenClassification.from_pretrained("model-name")
```
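When loaded through `transformers`, the model predicts one label id per subword token, and the model's config carries an `id2label` mapping to turn those ids into tag strings. A minimal sketch of that decoding step, using a hypothetical mapping (the actual label set depends on which dataset the model was finetuned on):

```python
# Hypothetical id2label mapping for illustration only; the real one is
# available as model.config.id2label after loading the model.
id2label = {0: "O", 1: "B-ORG", 2: "I-ORG", 3: "B-LOC", 4: "I-LOC"}

def decode_tags(pred_ids):
    """Map predicted label ids (one per subword token) to IOB tag strings."""
    return [id2label[i] for i in pred_ids]

print(decode_tags([1, 2, 0, 3]))
# ['B-ORG', 'I-ORG', 'O', 'B-LOC']
```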