Are the parameters given in config_X_mira.yaml the same as the training parameters? #3

alpercanberk · 2024-04-17T20:20:57Z

If not, could you share the training parameters (e.g. effective batch size, learning rate, clipping)?

zzyfd · 2024-04-18T08:55:09Z

Hello, the settings specified in the config_X_mira.yaml file are ready to be used for training Mira on an A100 40G GPU.

alpercanberk · 2024-04-20T02:02:52Z

Wait, so the model was trained with 1 A100 and no gradient accumulation?

mira-space · 2024-04-22T15:32:49Z

No, the Mira-v0 model was trained on 32 A100 GPUs for approximately two days.
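
For anyone comparing a single-GPU run against the 32-GPU setup, here is a rough sketch of how the effective batch size is usually computed in data-parallel training. The function name and the per-GPU batch size of 4 are hypothetical placeholders, not values from config_X_mira.yaml; substitute whatever the config actually specifies.

```python
def effective_batch_size(per_gpu_batch_size: int,
                         num_gpus: int,
                         grad_accum_steps: int = 1) -> int:
    """Samples contributing to one optimizer step in data-parallel training."""
    for name, value in (("per_gpu_batch_size", per_gpu_batch_size),
                        ("num_gpus", num_gpus),
                        ("grad_accum_steps", grad_accum_steps)):
        if value < 1:
            raise ValueError(f"{name} must be >= 1, got {value}")
    return per_gpu_batch_size * num_gpus * grad_accum_steps


# Hypothetical per-GPU batch size; read the real one from the config file.
per_gpu = 4

# One GPU, no gradient accumulation.
print(effective_batch_size(per_gpu, num_gpus=1))

# 32 GPUs, no gradient accumulation.
print(effective_batch_size(per_gpu, num_gpus=32))

# One GPU, matching the 32-GPU effective batch size via accumulation.
print(effective_batch_size(per_gpu, num_gpus=1, grad_accum_steps=32))
```

If the per-GPU settings in the config were used unchanged across 32 GPUs, a single-GPU run would need 32 gradient accumulation steps to match the effective batch size, and the learning rate may need revisiting if that is not done. Whether the released config was scaled this way is not stated in this thread.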
