Fine-tuning of the 'DistilBERT' NLP transformer model for text classification, with the 'AdaFamily' optimizer and 'mini-batch trimming'

Demonstrates fine-tuning of an NLP model with the novel 'AdaFamily' optimizer and 'mini-batch trimming'. The code is taken and adapted from https://github.com/hfwittmann/transformer_finetuning_lightning and uses PyTorch Lightning. It also demonstrates how to modify (via 'monkey-patching') a Hugging Face transformer model so that it employs a custom loss function.
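The monkey-patching idea can be illustrated without any dependencies. The sketch below is not the code from this repository and does not use the Hugging Face API: `TinyClassifier`, `squared_error_loss` and `patch_loss` are hypothetical stand-ins. It only shows the general pattern of replacing `forward` on an existing model instance so that the returned loss comes from a custom function.

```python
import types


class TinyClassifier:
    """Hypothetical stand-in for a model whose forward() computes its own loss."""

    def forward(self, scores, targets):
        # Built-in loss: mean absolute error between scores and targets.
        loss = sum(abs(s - t) for s, t in zip(scores, targets)) / len(targets)
        return {"loss": loss, "scores": scores}


def squared_error_loss(scores, targets):
    """Custom loss that should replace the built-in one."""
    return sum((s - t) ** 2 for s, t in zip(scores, targets)) / len(targets)


def patch_loss(model, loss_fn):
    """Replace forward() on this instance so that it reports loss_fn instead."""
    original_forward = model.forward  # bound method, kept for delegation

    def patched_forward(self, scores, targets):
        output = original_forward(scores, targets)
        # Overwrite only the loss; everything else in the output is untouched.
        output["loss"] = loss_fn(output["scores"], targets)
        return output

    # Bind the new function to the instance; the class itself is not modified.
    model.forward = types.MethodType(patched_forward, model)
    return model


model = patch_loss(TinyClassifier(), squared_error_loss)
result = model.forward([0.5, 2.0], [1.0, 0.0])
```

Patching the instance rather than the class keeps other instances of the same class on their original loss, which is usually the safer choice when only one model should change behaviour.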

Regarding 'mini-batch trimming' (a curriculum learning method), see my arXiv preprint at https://arxiv.org/abs/2110.13058 (accepted for the ASPAI 2021 conference).
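A rough sketch of the idea, assuming that mini-batch trimming means averaging the loss over only the highest-loss samples of each mini-batch; the preprint remains the authoritative description, including how the kept fraction is scheduled over training. The function name `trimmed_batch_loss` and the parameter `keep_fraction` are hypothetical. In PyTorch, the same selection could be done with `torch.topk` on a loss computed with `reduction='none'`.

```python
import math


def trimmed_batch_loss(per_sample_losses, keep_fraction):
    """Average only the highest-loss samples of a mini-batch."""
    if not per_sample_losses:
        raise ValueError("per_sample_losses must not be empty")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    # Round the kept count up and always keep at least one sample.
    keep = max(1, math.ceil(keep_fraction * len(per_sample_losses)))
    # Sort descending so the hardest (highest-loss) samples come first.
    hardest = sorted(per_sample_losses, reverse=True)[:keep]
    return sum(hardest) / keep


batch_losses = [0.5, 2.0, 0.25, 1.0]
full_loss = trimmed_batch_loss(batch_losses, 1.0)     # uses all four samples
trimmed_loss = trimmed_batch_loss(batch_losses, 0.5)  # uses the two hardest
```

With `keep_fraction = 1.0` the function reduces to the ordinary mean loss, so the trimming can be switched on gradually by lowering the fraction during training.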

Regarding 'AdaFamily' (a family of novel adaptive gradient methods), see my arXiv preprint at https://arxiv.org/abs/2203.01603 (accepted for the ISPR 2022 conference). We use the AdaFamily variant with myu = 0.25.