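As the title says: the server I was using crashed partway through pretraining. How can I resume pretraining from where it stopped? Any guidance would be appreciated.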
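Save a checkpoint every fixed number of training steps, including the model parameters, the training state (such as the current step), and the optimizer state. PyTorch provides torch.save(model.state_dict(), path) for writing these to disk and model.load_state_dict() for restoring them. Note that state_dict is a method and must be called; the optimizer has the same pair, optimizer.state_dict() and optimizer.load_state_dict().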
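A minimal sketch of that save-and-resume pattern, in case it helps. The toy `nn.Linear` model, the random data, and the names `save_checkpoint`, `load_checkpoint`, and `train` are placeholders I made up for illustration, not code from this project; only the `torch.save`, `torch.load`, `state_dict()`, and `load_state_dict()` calls are the actual PyTorch API.

```python
import os
import tempfile

import torch
from torch import nn


def save_checkpoint(path, model, optimizer, step):
    """Write model weights, optimizer state, and the step counter to one file."""
    tmp_path = path + ".tmp"
    torch.save(
        {
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
            "step": step,
        },
        tmp_path,
    )
    # Rename only after the write finishes, so a crash mid-save
    # cannot leave a truncated file at `path`.
    os.replace(tmp_path, path)


def load_checkpoint(path, model, optimizer):
    """Restore state in place; return the step to resume from (0 if no file)."""
    if not os.path.exists(path):
        return 0
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    return ckpt["step"]


def train(ckpt_path, total_steps, save_every):
    """Toy loop (hypothetical model and data) that resumes if a checkpoint exists."""
    model = nn.Linear(4, 1)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)
    start_step = load_checkpoint(ckpt_path, model, optimizer)

    for step in range(start_step, total_steps):
        x = torch.randn(8, 4)  # stand-in for a real batch
        loss = model(x).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if (step + 1) % save_every == 0:
            save_checkpoint(ckpt_path, model, optimizer, step + 1)
    return start_step, model


if __name__ == "__main__":
    ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.pt")
    train(ckpt, total_steps=4, save_every=2)      # first run, starts at step 0
    resumed_from, _ = train(ckpt, total_steps=6, save_every=2)
    print("resumed from step", resumed_from)
```

This sketch only restores the model, the optimizer, and the step counter. For an exact resume of a real pretraining run you would probably also want to save the learning-rate scheduler state, the random number generator state, and the position in the dataset, none of which are handled here.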
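You could also take a look at this project. It trains with the transformers library and supports resuming from a checkpoint, as well as optimizations such as ZeRO. https://github.com/wdndev/tiny-llm-zh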