Training data and scripts used for wmt22-cometkiwi-da #217
To train, your configs should be something like this:

```yaml
unified_metric:
  class_path: comet.models.UnifiedMetric
  init_args:
    nr_frozen_epochs: 0.3
    keep_embeddings_frozen: True
    optimizer: AdamW
    encoder_learning_rate: 1.0e-06
    learning_rate: 1.5e-05
    layerwise_decay: 0.95
    encoder_model: XLM-RoBERTa
    pretrained_model: microsoft/infoxlm-large
    sent_layer: mix
    layer_transformation: sparsemax
    word_layer: 24
    loss: mse
    dropout: 0.1
    batch_size: 16
    train_data:
      - TRAIN_DATA.csv
    validation_data:
      - VALIDATION_DATA.csv
    hidden_sizes:
      - 3072
      - 1024
    activations: Tanh
    input_segments:
      - mt
      - src
    word_level_training: False

trainer: ../trainer.yaml
early_stopping: ../early_stopping.yaml
model_checkpoint: ../model_checkpoint.yaml
```
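As a side note on how that config is consumed: the `class_path` / `init_args` layout is declarative, and the keys under `init_args` are handed to the `UnifiedMetric` constructor. A rough Python sketch of that mapping (an illustration only, assuming the keyword arguments match the YAML one-to-one; the CSV paths are placeholders):

```python
from comet.models import UnifiedMetric

# Same hyper-parameters as the YAML config above, passed directly to the constructor.
# TRAIN_DATA.csv / VALIDATION_DATA.csv are placeholders for your own files.
model = UnifiedMetric(
    nr_frozen_epochs=0.3,           # fraction of the first epoch with the encoder frozen
    keep_embeddings_frozen=True,    # embedding layer stays frozen throughout training
    optimizer="AdamW",
    encoder_learning_rate=1.0e-06,
    learning_rate=1.5e-05,
    layerwise_decay=0.95,
    encoder_model="XLM-RoBERTa",
    pretrained_model="microsoft/infoxlm-large",
    sent_layer="mix",
    layer_transformation="sparsemax",
    word_layer=24,
    loss="mse",
    dropout=0.1,
    batch_size=16,
    train_data=["TRAIN_DATA.csv"],
    validation_data=["VALIDATION_DATA.csv"],
    hidden_sizes=[3072, 1024],
    activations="Tanh",
    input_segments=["mt", "src"],   # reference-free (QE) setup: MT hypothesis + source only
    word_level_training=False,
)
```

In practice the YAML file is normally passed to the `comet-train` command (`comet-train --cfg <your_config>.yaml`) rather than building the model by hand.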
Hi @ricardorei, thanks for the update. Can I use the same training parameters as in the master-branch trainer.yaml file?
Hmm, maybe you should change them a bit. For example, to train on a single GPU (which is usually faster) and with precision 16, use this:

```yaml
accelerator: gpu
devices: 1
# strategy: ddp  # Comment this line for distributed training
precision: 16
```

You might also want to consider reducing `accumulate_grad_batches: 2`.
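For orientation, those keys are standard PyTorch Lightning `Trainer` arguments, so the suggestion above corresponds roughly to the following sketch (Lightning 1.x style; the repo's actual trainer.yaml may carry further options):

```python
import pytorch_lightning as pl

# Rough Python equivalent of the suggested trainer settings: one GPU,
# fp16 mixed precision, and a small gradient-accumulation factor.
trainer = pl.Trainer(
    accelerator="gpu",
    devices=1,
    # strategy="ddp",           # only needed for multi-GPU / distributed training
    precision=16,               # fp16 mixed precision
    accumulate_grad_batches=2,  # effective batch size = batch_size * 2
)
```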
What format should the data be in?
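For a QE setup like the one above (input_segments of mt and src), the training CSVs are generally expected to carry one column per input segment plus a score column. A minimal, made-up sketch of such a file, assuming the usual `src` / `mt` / `score` column names:

```python
import pandas as pd

# Hypothetical example rows: source segment, MT hypothesis, and a human quality
# score (e.g. a z-normalised DA judgement). The scores below are invented.
train_df = pd.DataFrame(
    {
        "src": ["Dem Feuer konnte Einhalt geboten werden.", "Schulen und Kindergärten wurden eröffnet."],
        "mt": ["The fire could be stopped.", "Schools and kindergartens were opened."],
        "score": [0.72, 0.35],
    }
)
train_df.to_csv("TRAIN_DATA.csv", index=False)
```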
Hi team,
Can you share the training data and training scripts used for wmt22-cometkiwi-da? We would like to use them as a reference for training with our own sample reference data.