Hello!
I am fine-tuning a very large model on a GPU with 80 GB of memory, but training still fails with an out-of-memory error. When I check GPU memory usage, the peak only reaches about 30 GB. How can I solve this problem? Thank you!
RuntimeError: CUDA out of memory. Tried to allocate 1.25 GiB (GPU 0; 44.56 GiB total capacity; 41.63 GiB already allocated; 217.56 MiB free; 42.35 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Exception in thread Thread-6:
Thank you for providing the details, @Airilin. Based on your settings, it seems that the training should be able to run smoothly on a GPU with 80GB of VRAM. Unfortunately, I don't have additional suggestions at the moment.
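For anyone landing here with the same error, the message itself contains enough numbers to do a quick sanity check. Below is a minimal sketch, not a confirmed fix: it sets `PYTORCH_CUDA_ALLOC_CONF` as the error text suggests and parses the figures out of the message. The helper `summarize_oom` is hypothetical (it is not part of PyTorch), and the value `128` for `max_split_size_mb` is only an illustrative assumption.

```python
import os
import re

# Must be set before PyTorch initializes its CUDA caching allocator,
# i.e. before the first CUDA call in the process.
# The value 128 is an illustrative assumption, not a recommendation.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")

_UNITS_IN_GIB = {"MiB": 1 / 1024, "GiB": 1.0}


def summarize_oom(message):
    """Hypothetical helper: pull the figures out of a CUDA OOM message, in GiB."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    stats = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find {name!r} in message")
        value, unit = match.groups()
        stats[name] = float(value) * _UNITS_IN_GIB[unit]
    # Memory PyTorch has reserved but is not using; this is the part
    # that fragmentation can make unusable for a large request.
    stats["cached_unused"] = stats["reserved"] - stats["allocated"]
    return stats


message = (
    "RuntimeError: CUDA out of memory. Tried to allocate 1.25 GiB "
    "(GPU 0; 44.56 GiB total capacity; 41.63 GiB already allocated; "
    "217.56 MiB free; 42.35 GiB reserved in total by PyTorch)"
)
stats = summarize_oom(message)
for name, value in stats.items():
    print(f"{name:>14}: {value:6.2f} GiB")
```

Two things stand out from these numbers. First, the device in the error reports 44.56 GiB of total capacity, not 80, so it may be worth checking which GPU the process is actually running on. Second, reserved minus allocated is only about 0.72 GiB, which is less than the 1.25 GiB request, so tuning `max_split_size_mb` alone may not be enough here; the card appears to be genuinely close to full at the moment of failure.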