Can you provide the running config of 65B models? #7
Comments
Hi. You can change …
I have the same problem: the LLaMA 65B model on 8 × V100 hits OOM. Are there any other parameters that should be set?
message details: …
Sorry, I misunderstood your question.
Thanks, I succeeded in running the 65B model with batch_size=1 and data_max_length=835. Each epoch takes 47 min (26 GB GPU memory, on an 8×V100 node with NVLink). To achieve the performance reported in the paper, did the 3090 GPUs have NVLink?
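The ~26 GB per GPU reported above is plausible: with the weights sharded evenly across 8 GPUs (as DeepSpeed ZeRO-3 partitioning does), the fp16 parameters alone account for roughly 16 GB per card, with the remainder going to activations, gradients, and framework overhead. A quick back-of-envelope sketch (the function name and the even-sharding assumption are mine, not from the thread):

```python
# Rough per-GPU memory estimate for a 65B-parameter model sharded
# evenly across 8 GPUs in fp16 (2 bytes per parameter). This covers
# weights only; the ~26 GB observed in practice also includes
# activations, gradients, and runtime overhead.

def per_gpu_param_memory_gb(n_params: float, bytes_per_param: int, n_gpus: int) -> float:
    """Parameter memory per GPU in GB (decimal) under even sharding."""
    return n_params * bytes_per_param / n_gpus / 1e9

print(per_gpu_param_memory_gb(65e9, 2, 8))  # ~16.25 GB of weights per V100
```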
Glad to hear that! |
Solved here. #28 |
Hi, I'd like to run a 65B LLaMA model with LOMO. What config should I use to run the training on an 8×RTX 3090 machine?
It would be very nice if you added a config/args_lomo.yaml and config/ds_config.json for 65B models.
Thanks.
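For anyone looking for a starting point before an official config lands: a sketch along the lines below may work, using the batch_size=1 and data_max_length=835 values from the successful run reported in this thread. The field names here are assumptions modeled on the repo's existing configs, not the maintainers' actual 65B settings; compare them against the shipped config/args_lomo.yaml before use.

```yaml
# Hypothetical config/args_lomo_65b.yaml -- field names are assumed,
# not taken from the repository; adjust to match the real args_lomo.yaml.
model_name_or_path: /path/to/llama-65b
per_device_train_batch_size: 1      # reported working value for 65B
data_max_length: 835                # reported working value for 65B
deepspeed: config/ds_config.json    # ZeRO-3, to shard the ~130 GB of fp16 weights
```

Note that on 3090s (24 GB each, vs. the 32 GB V100s in the run above) even these settings may still OOM; see #28, where this was reportedly solved.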