
# Released Models

We release 46 finetuned models on the Transformers model hub. All models are XLM-R models finetuned on the named entity recognition task with T-NER. Please see our paper for the evaluation results, including out-of-domain accuracy (paper link).

## Model Name

Model names follow the pattern `asahi417/tner-xlm-roberta-{model_type}-{dataset}`, where `model_type` is either `base` or `large` and `dataset` is the alias of the dataset. In addition to the individual models, we also train on a merged English dataset, created by concatenating all the English NER datasets, which is denoted `all-english`. We also release models finetuned on lowercased datasets, named `asahi417/tner-xlm-roberta-{model_type}-uncased-{dataset}`.

For example:

- `asahi417/tner-xlm-roberta-large-ontonotes5`: XLM-R large model finetuned on the OntoNotes 5 dataset
- `asahi417/tner-xlm-roberta-base-uncased-conll2003`: XLM-R base model finetuned on the lowercased CoNLL 2003 dataset
- `asahi417/tner-xlm-roberta-large-all-english`: XLM-R large model finetuned on all English datasets
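As a sketch of the naming convention, the full model name can be assembled from its parts. The helper below is purely illustrative and not part of T-NER:

```python
# Illustrative helper (not part of T-NER): builds a released model name
# following the convention asahi417/tner-xlm-roberta-{model_type}-[uncased-]{dataset}.
def model_name(model_type: str, dataset: str, uncased: bool = False) -> str:
    assert model_type in {"base", "large"}
    middle = f"{model_type}-uncased" if uncased else model_type
    return f"asahi417/tner-xlm-roberta-{middle}-{dataset}"

print(model_name("large", "ontonotes5"))
# asahi417/tner-xlm-roberta-large-ontonotes5
print(model_name("base", "conll2003", uncased=True))
# asahi417/tner-xlm-roberta-base-uncased-conll2003
```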

The list of all public models can be checked here. The training parameters used by T-NER to finetune each model are stored at `https://huggingface.co/{model-name}/blob/main/parameter.json`. For example, the training parameters of `asahi417/tner-xlm-roberta-large-all-english` are here.

## Usage

To use with T-NER:

```python
import tner

# Replace "model-name" with a released model,
# e.g. "asahi417/tner-xlm-roberta-large-ontonotes5".
model = tner.TransformersNER("model-name")
```

To use with Transformers:

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer

# Replace "model-name" with a released model,
# e.g. "asahi417/tner-xlm-roberta-large-ontonotes5".
tokenizer = AutoTokenizer.from_pretrained("model-name")
model = AutoModelForTokenClassification.from_pretrained("model-name")
```
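When used directly with Transformers, the token-classification head emits one IOB2 tag per token (e.g. `B-PER`, `I-PER`, `O`), which then needs to be grouped into entity spans. A minimal sketch of that grouping step, using an illustrative helper that is not part of either library:

```python
# Illustrative helper: merge per-token IOB2 tags into (entity_type, text) pairs.
def group_entities(tokens, tags):
    entities, current = [], None  # current holds [entity_type, [tokens...]]
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append((current[0], " ".join(current[1])))
            current = [tag[2:], [token]]
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(token)
        else:  # "O", or an I- tag without a matching B-
            if current:
                entities.append((current[0], " ".join(current[1])))
            current = None
    if current:
        entities.append((current[0], " ".join(current[1])))
    return entities

tokens = ["Jacob", "lives", "in", "New", "York"]
tags = ["B-PER", "O", "O", "B-LOC", "I-LOC"]
print(group_entities(tokens, tags))
# [('PER', 'Jacob'), ('LOC', 'New York')]
```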