Below is Table 2 from the paper Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM. llm-analysis is run with the setups described in the paper, and its outputs match the "Training time for 300B tokens (days)" reported for the different schemes.
- For the PTD Parallelism scheme, `run_train.sh` is used and the output summaries are under `outputs_train`.
- For the ZeRO-3 without Model Parallelism scheme, `run_train_zero.sh` is used and the output summaries are under `outputs_train_zero`.
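
To regenerate the summaries, a minimal invocation looks like the following, assuming llm-analysis and its dependencies are installed and the commands are run from this example directory:

```sh
# PTD Parallelism scheme; writes summaries under outputs_train
bash run_train.sh

# ZeRO-3 without Model Parallelism scheme; writes summaries under outputs_train_zero
bash run_train_zero.sh
```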