custom-bert-for-binary-classification

Custom BERT for binary classification is a Hugging Face-based BERT model for Japanese document classification. It aims to address a known problem in Transformer-based long-document classification: results degrade because long documents exceed the model's input-length limit. Combined with our Kakuyomu-Clipper project, our modified BERT is adapted to work with our datasets.

Custom BERT

We plan to provide several customized BERT variants in the future.

  1. BERT-CLS-AVG model

Based on the Sparsified Firstmatch/Nearest-K clipping methods, our BERT-CLS-AVG model can train on the sparsified paragraphs extracted from a single document. For each document, the model averages the classification ([CLS]) vectors of the batched paragraph texts and trains on that averaged representation; a sketch of this idea is shown below.
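
The following is a minimal sketch of the CLS-averaging idea, not the repository's actual code: each sparsified paragraph of one document is encoded with a Hugging Face BERT, the per-paragraph [CLS] vectors are averaged, and the result is passed to a binary classification head. The checkpoint name `cl-tohoku/bert-base-japanese` and the class name `BertClsAvgClassifier` are illustrative assumptions.

```python
# Minimal sketch of a CLS-averaging classifier (illustrative, not the repo's API).
# Assumes a Japanese BERT checkpoint; the cl-tohoku tokenizer additionally
# requires the fugashi and ipadic packages to be installed.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel


class BertClsAvgClassifier(nn.Module):
    def __init__(self, model_name: str = "cl-tohoku/bert-base-japanese"):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        # Binary classification head applied to the averaged [CLS] vector.
        self.classifier = nn.Linear(self.bert.config.hidden_size, 2)

    def forward(self, input_ids, attention_mask):
        # input_ids / attention_mask: (num_paragraphs, seq_len) for ONE document.
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls_vectors = outputs.last_hidden_state[:, 0, :]  # (num_paragraphs, hidden)
        doc_vector = cls_vectors.mean(dim=0)              # average over paragraphs
        return self.classifier(doc_vector)                # (2,) logits for the document


if __name__ == "__main__":
    model_name = "cl-tohoku/bert-base-japanese"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = BertClsAvgClassifier(model_name)

    # Sparsified paragraphs from a single document (toy example).
    paragraphs = ["最初の段落です。", "二番目の段落です。"]
    batch = tokenizer(paragraphs, padding=True, truncation=True,
                      max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(batch["input_ids"], batch["attention_mask"])
    print(logits)
```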

Author

Tan-Ning Huang

About

Archived source code from Tan-Ning Huang for machine learning (Avg/CatBERT)
