This repository contains starter code for training and evaluating machine learning models for Vietnamese Named Entity Recognition. It is part of the underthesea project. The code provides an end-to-end working example of reading datasets, training models, and evaluating their performance, and it can be extended to train your own custom models.
This code is written in Python. The dependencies are:
Operating Systems: Linux (Ubuntu, CentOS), Mac
Python 3.6
Anaconda
Python Packages
underthesea==1.1.7
languageflow==1.1.7
Clone project using git
$ git clone https://github.com/undertheseanlp/ner.git
Create environment and install requirements
$ cd ner
$ conda create -n ner python=3.5
$ source activate ner
$ pip install -r requirements.txt
$ cd ner
$ source activate ner
$ python ner.py -fin tmp/input.txt -fout tmp/output.txt
Prepare a new dataset
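The expected file layout is not documented here. As a minimal sketch, assuming a CoNLL-style layout (one token per line followed by a tab-separated BIO tag, with a blank line between sentences), a new dataset could be written and read back as shown below. The helper names `write_conll` and `read_conll`, the two-column layout, and the sample sentence are all hypothetical; check the files under `data/vlsp2018/corpus/` for the actual columns before relying on this.

```python
import io


def write_conll(sentences, stream):
    """Write sentences as 'token<TAB>tag' lines with a blank line after each sentence.

    ASSUMPTION: the two-column layout is illustrative, not the repository's documented format.
    """
    for sentence in sentences:
        for token, tag in sentence:
            stream.write("{}\t{}\n".format(token, tag))
        stream.write("\n")


def read_conll(stream):
    """Read sentences back as lists of (token, tag) pairs."""
    sentences, current = [], []
    for line in stream:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        token, tag = line.split("\t")
        current.append((token, tag))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample sentence, tagged per syllable with BIO labels.
sample = [[
    ("Donald", "B-PER"), ("Trump", "I-PER"), ("tới", "O"),
    ("Việt", "B-LOC"), ("Nam", "I-LOC"),
]]

buffer = io.StringIO()
write_conll(sample, buffer)
buffer.seek(0)
round_trip = read_conll(buffer)
```

To write to disk instead of an in-memory buffer, pass a file opened with `open(path, "w", encoding="utf-8")` so that Vietnamese characters are stored correctly.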
Train and test
$ cd ner
$ source activate ner
$ python train.py --train data/vlsp2018/corpus/train.txt
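How `train.py` scores a model is not shown here. NER systems are commonly scored at the entity level, so the sketch below, which is an illustration and not the repository's evaluation code, extracts entity spans from BIO tag sequences and computes precision, recall, and F1 on exact span matches for a single sentence. The function names `extract_entities` and `entity_f1` and the sample tags are hypothetical.

```python
def extract_entities(tags):
    """Return the set of (start, end, type) spans in a BIO tag sequence (end is exclusive)."""
    entities = set()
    start, label = None, None
    # The trailing "O" sentinel closes an entity that runs to the end of the sentence.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            entities.add((start, i, label))
        if tag.startswith("B-") or tag.startswith("I-"):
            # A stray I- tag with no matching open entity is treated as a new entity.
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return entities


def entity_f1(gold_tags, predicted_tags):
    """Return (precision, recall, f1) over exact entity span matches."""
    gold = extract_entities(gold_tags)
    predicted = extract_entities(predicted_tags)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical gold and predicted tags for one five-token sentence.
gold = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]
predicted = ["B-PER", "I-PER", "O", "B-LOC", "O"]
scores = entity_f1(gold, predicted)
```

In this example the person span matches exactly, while the predicted location span is one token too short and so counts as an error for both precision and recall. To score a whole corpus, collect spans per sentence (for example, by adding the sentence index to each span) before counting matches, so that entities never merge across sentence boundaries.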
To be updated
To be updated
Last update: 07/2018