A person can easily infer the meaning of a word from the sentence it appears in, but the task is hard for a machine, especially in Mongolian, a language with little natural language processing tooling and research. The goal of the competition is therefore to build a model that predicts the intended meaning of words in a given sentence.
Translated by Google
- 1st place solution notebook used as a local module
  - Custom dataset code (dataset.py)
  - Custom model code (models.py)
  - Custom preprocessing code (preprocessing.py)
- HuggingFace Hub API integration (train.py)
  - Faster and standardized save & load loop
  - Git-like version control of models at runtime (Git-LFS)
- Kaggle API integration (train.py)
  - Submit the results to Kaggle at runtime
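As a rough illustration of submitting results to Kaggle at runtime, the sketch below shells out to the official `kaggle` command-line tool. It is not taken from train.py: the helper names `build_submit_command` and `submit` are hypothetical, and the competition slug `muis-challenge` is assumed from the competition URL listed in the references.

```python
import subprocess


def build_submit_command(csv_path, message, competition="muis-challenge"):
    """Return the argv list for `kaggle competitions submit`.

    The default competition slug is an assumption based on the
    competition URL; pass a different one if needed.
    """
    return [
        "kaggle", "competitions", "submit",
        "-c", competition,    # competition slug
        "-f", str(csv_path),  # file to upload
        "-m", message,        # submission description
    ]


def submit(csv_path, message, competition="muis-challenge"):
    """Run the submission; raises CalledProcessError if the CLI fails.

    Requires the `kaggle` package to be installed and the credential
    file to be configured.
    """
    subprocess.run(build_submit_command(csv_path, message, competition), check=True)


# Example (not executed here):
# submit("submission.csv", "bert-large-uncased, 5-fold average")
```

Building the command separately from running it keeps the argument list easy to inspect or log before anything is uploaded.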
- Python 3.6+
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Configure the HuggingFace model hub credential.
huggingface-cli login
Configure the Kaggle credential.
cp /path/to/downloaded/kaggle.json ~/.kaggle/
Start training
bash train.sh
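The Kaggle credential step above can also be done from Python. The following is a minimal sketch, not part of this repository: the function name `install_kaggle_credentials` and its `home` parameter are hypothetical. It creates the `.kaggle` directory if it is missing and restricts the file to the owner, since the Kaggle client warns when the key file is readable by other users.

```python
import shutil
import stat
from pathlib import Path


def install_kaggle_credentials(source, home=None):
    """Copy a downloaded kaggle.json into <home>/.kaggle/kaggle.json.

    `home` defaults to the current user's home directory; it is a
    parameter only so the function can be tried in a scratch directory.
    """
    home = Path(home) if home is not None else Path.home()
    target_dir = home / ".kaggle"
    # The plain `cp` command fails if ~/.kaggle does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "kaggle.json"
    shutil.copyfile(str(source), str(target))
    # Owner read/write only (equivalent to chmod 600 on POSIX systems).
    target.chmod(stat.S_IRUSR | stat.S_IWUSR)
    return target


# Example (not executed here):
# install_kaggle_credentials("/path/to/downloaded/kaggle.json")
```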
Model | Type | Parameters(M) | CV* | Public LB | Private LB |
---|---|---|---|---|---|
albert-base | uncased | 11 | 0.951 | 0.95290 | 0.95759 |
gpt-2 | uncased | 117 | 0.943 | 0.93892 | 0.94350 |
roberta-large | uncased | 335 | 0.958 | 0.95924 | 0.96323 |
bert-base | uncased | 110 | 0.970 | 0.96699 | 0.97098 |
bert-large | cased | 335 | 0.971 | 0.96805 | 0.97310 |
bert-large | uncased | 335 | 0.971 | 0.97016 | 0.97251 |
*CV - 5-fold Cross Validation
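To make the CV column more concrete, here is a dependency-free sketch of stratified 5-fold splitting. The references point to scikit-learn's StratifiedKFold, which is presumably what the solution relies on; the function below, `stratified_kfold`, is a hypothetical stand-in written only to show the idea, and its seed and round-robin assignment are assumptions.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, n_splits=5, seed=42):
    """Yield (train_indices, valid_indices) for each of n_splits folds.

    Each class's examples are shuffled and dealt round-robin across the
    folds, so every fold keeps roughly the same class proportions.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    folds = [[] for _ in range(n_splits)]
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % n_splits].append(idx)

    for k in range(n_splits):
        valid = sorted(folds[k])
        train = sorted(
            idx for j, fold in enumerate(folds) if j != k for idx in fold
        )
        yield train, valid


# Example: 10 examples of class 0 and 5 of class 1.
labels = [0] * 10 + [1] * 5
for train_idx, valid_idx in stratified_kfold(labels):
    # Each validation fold holds 2 examples of class 0 and 1 of class 1.
    print(len(train_idx), len(valid_idx))
```

A CV score is then typically the mean of the per-fold validation scores. Because the round-robin always starts at the first fold, early folds can end up slightly larger when class sizes are not multiples of `n_splits`.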
- MLUB-МУИС-Сорил (MUIS Challenge) competition https://www.kaggle.com/c/muis-challenge
- 1st place solution https://www.kaggle.com/bayartsogtya/mlub-muis-soril-1
- GitHub repository containing the solution code https://github.com/bayartsogt-ya/mlub-muis-soril
- Mongolian BERT https://github.com/tugstugi/mongolian-bert
- Mongolian BERT on HuggingFace https://huggingface.co/tugstugi
- ALBERT-Mongolian https://huggingface.co/bayartsogt/albert-mongolian
- Mongolian GPT2 https://huggingface.co/bayartsogt/mongolian-gpt2
- Mongolian RoBERTa Large https://huggingface.co/bayartsogt/mongolian-roberta-large
- Transformers Multiple Choice Model https://huggingface.co/transformers/model_doc/auto.html#automodelformultiplechoice
- Stratified KFold Cross Validation https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.StratifiedKFold.html