# Example Analyses with Megatron-LM Models

Below is Table 2 from the paper *Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM*.

*(Image: Table 2 in Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM.)*

## Training Analysis

llm-analysis is run with the setups described in the paper, and its outputs match the *Training time for 300B tokens (days)* reported in Table 2 for the different configurations.
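These numbers can also be sanity-checked against the paper's own approximation for end-to-end training time, time ≈ 8TP/(nX), where T is the number of training tokens, P the parameter count, n the number of GPUs, and X the achieved per-GPU throughput in FLOP/s. A minimal sketch below applies it to the 175B-parameter row; the specific inputs (1024 GPUs at 140 teraFLOP/s achieved per GPU) are my reading of the paper's reported setup, so treat them as assumptions:

```python
# Back-of-the-envelope check of Table 2 using the paper's approximation
# time ~= 8*T*P / (n*X): 6*T*P FLOPs for forward+backward plus 2*T*P for
# activation recomputation, divided by aggregate achieved throughput.

def training_days(tokens: float, params: float,
                  num_gpus: int, flops_per_gpu: float) -> float:
    """Estimated end-to-end training time in days via 8*T*P / (n*X)."""
    seconds = 8 * tokens * params / (num_gpus * flops_per_gpu)
    return seconds / 86400  # seconds per day

# Assumed 175B-model setup: 300B tokens, 1024 GPUs, 140 teraFLOP/s achieved per GPU.
days = training_days(tokens=300e9, params=175e9,
                     num_gpus=1024, flops_per_gpu=140e12)
print(f"{days:.1f} days")  # ~34 days, consistent with the 175B row of Table 2
```

The same function applied to the other rows' parameter counts, GPU counts, and achieved throughputs should reproduce the remaining *Training time for 300B tokens (days)* entries to similar precision.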

## References