Fix issue in converting Mixtral 8x7B checkpoints from HF to MCore and update doc #1397

Open
yeahdongcn wants to merge 2 commits into main

Conversation

yeahdongcn

Testing Done
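
The conversion command reads its target parallelism sizes and checkpoint paths from environment variables. As an illustration only, the settings below would match the save path seen in the log (TP=1, PP=4, EP=8); the HF checkpoint directory and tokenizer path are placeholders, not values taken from this PR.

# Illustrative settings, not part of the original test log.
export TARGET_TP_SIZE=1                                        # tensor parallel size
export TARGET_PP_SIZE=4                                        # pipeline parallel size
export TARGET_EP_SIZE=8                                        # expert parallel size
export HF_FORMAT_DIR=/workspace/Mixtral-8x7B-v0.1              # placeholder: local HF checkpoint download
export TOKENIZER_MODEL=${HF_FORMAT_DIR}/tokenizer.model        # placeholder tokenizer path
export MEGATRON_FORMAT_DIR=/workspace/checkpoints/mixtral-mcore-TP1PP4EP8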

root@a9effa877dd0:/workspace/megatron-lm# python tools/checkpoint/convert.py \
    --model-type GPT \
    --loader mixtral_hf \
    --saver mcore \
    --target-tensor-parallel-size ${TARGET_TP_SIZE} \
    --target-pipeline-parallel-size ${TARGET_PP_SIZE} \
    --target-expert-parallel-size ${TARGET_EP_SIZE} \
    --load-dir ${HF_FORMAT_DIR} \
    --save-dir ${MEGATRON_FORMAT_DIR} \
    --tokenizer-model ${TOKENIZER_MODEL}
...
received transformer layer 30
received transformer layer 31
received final norm
received output layer
 > padded vocab (size: 32000) with 0 dummy tokens (new size: 32000)
saving checkpoint at iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 in torch format
  successfully saved checkpoint from iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 [ t 1/1, p 4/4 ]
saving checkpoint at iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 in torch format
  successfully saved checkpoint from iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 [ t 1/1, p 4/4 ]
saving checkpoint at iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 in torch format
  successfully saved checkpoint from iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 [ t 1/1, p 4/4 ]
saving checkpoint at iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 in torch format
  successfully saved checkpoint from iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 [ t 1/1, p 4/4 ]
saving checkpoint at iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 in torch format
  successfully saved checkpoint from iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 [ t 1/1, p 4/4 ]
saving checkpoint at iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 in torch format
  successfully saved checkpoint from iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 [ t 1/1, p 4/4 ]
saving checkpoint at iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 in torch format
  successfully saved checkpoint from iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 [ t 1/1, p 4/4 ]
saving checkpoint at iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 in torch format
  successfully saved checkpoint from iteration       1 to /workspace/checkpoints/mixtral-mcore-TP1PP4EP8 [ t 1/1, p 4/4 ]
Done!
/usr/lib/python3.12/tempfile.py:1075: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmp7agu7eyr'>
  _warnings.warn(warn_message, ResourceWarning)
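
As a quick sanity check after conversion (not part of the PR's test log), one can confirm that the checkpoint tracker records iteration 1 and that a model file was written for each TP/PP/EP rank directory. The file layout below is assumed from standard Megatron-LM torch-format saves.

# Hypothetical post-conversion check; paths taken from the log, layout assumed.
MEGATRON_FORMAT_DIR=/workspace/checkpoints/mixtral-mcore-TP1PP4EP8
cat ${MEGATRON_FORMAT_DIR}/latest_checkpointed_iteration.txt   # should print 1 per the log above
find ${MEGATRON_FORMAT_DIR} -name '*.pt' | sort                # one model file per rank directory
du -sh ${MEGATRON_FORMAT_DIR}                                  # total size of the converted checkpoint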
