Is Megatron FP8 training compatible with recompute? #1252
Hi yanchenmochen,
Your question
When I try to run training with FP8 parameters, an OOM error occurs. The model is a 7B model, and the same parameters work fine with --bf16.
How do I set the correct FP8 parameters? I tried to reduce memory usage and avoid the OOM with --recompute-granularity selective, but it still runs out of memory.
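For context, a minimal sketch of how FP8 and activation recompute are typically combined in a Megatron-LM launch. The flag names below come from Megatron-LM's argument parser, but their availability depends on your Megatron-LM and Transformer Engine versions, so treat this as an assumption to verify against your checkout, not a definitive recipe:

```shell
# Hedged sketch: FP8 (via Transformer Engine) plus activation recompute.
# FP8 in Megatron-LM is layered on top of BF16, so --bf16 stays enabled.
torchrun --nproc_per_node=8 pretrain_gpt.py \
    --bf16 \
    --transformer-impl transformer_engine \
    --fp8-format hybrid \
    --fp8-amax-history-len 1024 \
    --fp8-amax-compute-algo max \
    --recompute-granularity selective
    # ... plus your usual model, data, and parallelism args
```

Note that selective recompute only recomputes the memory-heavy attention operations; if that is not enough headroom, full recompute (`--recompute-granularity full --recompute-method uniform --recompute-num-layers 1`) trades more compute for a much larger activation-memory reduction.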