| Model | References | Link |
|---|---|---|
| VietAI/gpt-j-6B-vietnamese-news | | 🤗 VietAI/gpt-j-6B-vietnamese-news |
| VietAI/gpt-neo-1.3B-vietnamese-news | | 🤗 VietAI/gpt-neo-1.3B-vietnamese-news |
| imthanhlv/gpt2news | | 🤗 imthanhlv/gpt2news |

| Model | References | Link |
|---|---|---|
| VinAIResearch/BARTpho | Tran et al. arXiv preprint'21 | 🤗 vinai/bartpho-syllable 🤗 vinai/bartpho-word 💻 VinAIResearch/BARTpho |
| fpt-corp/viBERT | Bui et al. PACLIC'20 | 🤗 FPTAI/vibert-base-cased 💻 fpt-corp/viBERT |
| fpt-corp/vELECTRA | Bui et al. PACLIC'20 | 🤗 FPTAI/velectra-base-discriminator-cased 💻 fpt-corp/viBERT |
| VinAIResearch/PhoBERT | Nguyen et al. EMNLP Findings'20 | 🤗 vinai/phobert-base 🤗 vinai/phobert-large 💻 VinAIResearch/PhoBERT |
| NlpHUST/vibert4news | | 🤗 NlpHUST/vibert4news-base-cased 💻 bino282/bert4news |
| nguyenvulebinh/vietnamese-electra | | |
| imthanhlv/t5vi | | |

Model Descriptions

| Model | #Params | Training Data | Domain | Tokenization | Vocab Size |
|---|---|---|---|---|---|
| VinAIResearch/BARTpho | 396M (bartpho-syllable), 420M (bartpho-word) | 20GB | News | Word (bartpho-word), Syllable (bartpho-syllable) | 64000 |
| fpt-corp/viBERT | | 10GB | News | Subword | 38168 |
| VinAIResearch/PhoBERT | 135M (phobert-base), 370M (phobert-large) | 20GB | News | Word | 64000 |
| NlpHUST/vibert4news | | 20GB | News | Syllable | 62000 |

- datquocnguyen/PhoW2V - Pre-trained Word2Vec syllable and word embeddings for Vietnamese
- vietnlp/etnlp - A toolkit to evaluate, extract, and visualize multiple embeddings
- Kyubyong/wordvectors
- facebookresearch/fastText
- sonvx/word2vecVN
ViCon comprises pairs of synonyms and antonyms across word classes, thus offering data to distinguish between similarity and dissimilarity. ViSim-400 provides degrees of similarity across five semantic relations, as rated by human judges.
The two datasets were verified with standard co-occurrence and neural network models, which produced results comparable to those on the respective English datasets.
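
A similarity benchmark with human ratings, such as ViSim-400, is commonly used by comparing a model's cosine similarities against the human scores with a rank correlation. The sketch below illustrates that idea with the standard library only. It is not the datasets' official evaluation script: the function names, the placeholder words (`w1` to `w4`), the two-dimensional vectors, and the ratings are all made up for illustration, and the real file format and rating scale are not assumed here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def ranks(values):
    """1-based ranks of the values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation (0.0 if either input has zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def evaluate_similarity(embeddings, rated_pairs):
    """Correlate model cosine similarities with human ratings.

    Pairs containing a word missing from `embeddings` are skipped.
    """
    model_scores, human_scores = [], []
    for word_a, word_b, rating in rated_pairs:
        if word_a in embeddings and word_b in embeddings:
            model_scores.append(cosine(embeddings[word_a], embeddings[word_b]))
            human_scores.append(rating)
    return spearman(model_scores, human_scores)


# Hypothetical toy data: placeholder words, made-up vectors and ratings.
toy_embeddings = {
    "w1": [1.0, 0.0],
    "w2": [0.9, 0.1],
    "w3": [0.5, 0.5],
    "w4": [0.0, 1.0],
}
toy_pairs = [("w1", "w2", 5.0), ("w1", "w3", 3.0), ("w1", "w4", 1.0)]

print(evaluate_similarity(toy_embeddings, toy_pairs))
```

With real data, `embeddings` would be loaded from one of the embedding resources listed above and `rated_pairs` from the benchmark file. Skipping out-of-vocabulary pairs is one design choice among several; reporting how many pairs were skipped alongside the correlation keeps results comparable across models.
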
📜 Papers