Segmentation fault when using the dev container #16
Comments
Solved by using the 24.07 image, then manually installing NeMo-Run and upgrading NeMo (built from source).
Thanks @jeffchy for creating the issue. Glad to know you were able to fix it. Please let us know if you run into this issue again, and whether it's OK to close the issue for now since you were able to solve it.
I was able to get past the phase I mentioned above, but it then raises a CheckPointError.
@jeffchy is that the same error as above or a new one? Could you share it if it's new?
It's a new one. I'll try to reproduce the error.
Update: I can successfully run the newest pretraining recipe, https://github.com/NVIDIA/NeMo/blob/main/examples/llm/run/llama3_pretraining.py, but it fails when I try to use finetune_recipe with my own model.
And I got
I'm not familiar with NeMo, so maybe I got something wrong?
Thanks for your reply. If I have a custom fine-tuned HF model (stored locally), how do I start from it? Do I need to convert it in advance?
Segmentation fault when using the dev container to train the llm finetune recipe: