Skip to content

Latest commit

 

History

History
16 lines (15 loc) · 995 Bytes

File metadata and controls

16 lines (15 loc) · 995 Bytes

main usage: python -m distilbert_baseline --data /path/to/your.jsonl --mode train_or_test

Arguments:

  • -h or --help: show a help message and exit
  • --data DATA: Path to .jsonl file containing dataset
  • --mode MODE: which testing mode: 'train' or 'test'
  • --use_class_weights USE_CLASS_WEIGHTS: use balanced class weights for training, default True
  • --seed SEED: seed for random number generation, default 0xDEADBEEF
  • --max_seq_len MAX_SEQ_LEN: maximum sequence length in transformer, default 128
  • --batch_size BATCH_SIZE: training batch size, default 8
  • --epochs EPOCHS: number of epochs of training, default 1
  • --weight_decay WEIGHT_DECAY: AdamW weight decay, default 0.0
  • --learning_rate LEARNING_RATE: AdamW learning rate, default 5e-5
  • --adam_epsilon ADAM_EPSILON: AdamW epsilon, default 1e-8
  • --warmup_steps WARMUP_STEPS: Warmup steps for learning rate schedule, default 0
  • --max_grad_norm MAX_GRAD_NORM: max norm for gradient clipping, default 1.0