Dear authors, it is nice to see this amazing work. When I run the code, I noticed an interesting phenomenon: loading the model occupies more GPU memory than training does. Once training starts, GPU memory consumption stabilizes at a value slightly lower than it was right after loading.
For example, when I run openlm-research/open_llama_7b with deepspeed --master_port "$port" --include localhost:"$CUDA_VISIBLE_DEVICES" src/train_lomo.py config/args_lomo.yaml on a single V100 GPU with batch_size set to 1 and the other options left at their defaults, GPU memory consumption is 18588 MB before training starts, and during training it stabilizes at 15933 MB. Can you provide more information about this phenomenon? Many thanks!
Hi. It's due to some intermediate variables created when calling AutoModelForCausalLM.from_pretrained(). When the DeepSpeed engine is initialized, torch.cuda.empty_cache() is called, which releases the memory occupied by these intermediate variables.
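If you want to see this effect in isolation, here is a minimal sketch (not code from this repository) that reproduces the pattern using PyTorch's memory stats. The model name and dtype are only for illustration, and empty_cache() is called manually here to mimic what happens during DeepSpeed engine initialization:

```python
# Temporary tensors created while loading the checkpoint are freed by Python,
# but their memory stays in PyTorch's caching allocator (still counted by
# nvidia-smi) until torch.cuda.empty_cache() returns it to the driver.
import torch
from transformers import AutoModelForCausalLM

def report(tag):
    alloc = torch.cuda.memory_allocated() / 2**20    # live tensors
    reserved = torch.cuda.memory_reserved() / 2**20  # roughly what nvidia-smi sees
    print(f"{tag}: allocated={alloc:.0f} MB, reserved={reserved:.0f} MB")

model = AutoModelForCausalLM.from_pretrained(
    "openlm-research/open_llama_7b", torch_dtype=torch.float16
).cuda()
report("after from_pretrained")  # reserved > allocated: cached blocks from loading

torch.cuda.empty_cache()         # manual stand-in for what engine init triggers
report("after empty_cache")      # reserved shrinks back toward allocated
```

The key point is that nvidia-smi reports the allocator's reserved memory, so the reading stays high until empty_cache() hands the cached blocks back to the driver, which is why the number drops once the engine is set up.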