It always runs out of memory... please remedy this issue. I get this error constantly, and I am using Colab Pro with a V100, which I think should be enough for this project:

0/3 [02:11<?, ?it/s]

OutOfMemoryError                          Traceback (most recent call last)
in <cell line: 8>()
      6
      7 # Reload model in FP16 and merge it with LoRA weights
----> 8 base_model = AutoModelForCausalLM.from_pretrained(
      9     model_name,
     10     low_cpu_mem_usage=True,

4 frames
/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py in set_module_tensor_to_device(module, tensor_name, device, value, dtype, fp16_statistics)
    296             module._parameters[tensor_name] = param_cls(new_value, requires_grad=old_value.requires_grad)
    297         elif isinstance(value, torch.Tensor):
--> 298             new_value = value.to(device)
    299         else:
    300             new_value = torch.tensor(value, device=device)

OutOfMemoryError: CUDA out of memory. Tried to allocate 314.00 MiB (GPU 0; 15.77 GiB total capacity; 14.32 GiB already allocated; 2.12 MiB free; 14.45 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
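The allocator hint at the end of that error message can be applied by setting `PYTORCH_CUDA_ALLOC_CONF` before the first CUDA allocation (e.g. in the notebook's first cell, before importing the model). A minimal sketch; the `128` value is an assumption to start from, not something confirmed in this thread:

```python
import os

# Must be set before PyTorch makes its first CUDA allocation.
# max_split_size_mb limits how large a cached block the allocator may split,
# which can reduce fragmentation when reserved memory >> allocated memory.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

This only helps with fragmentation; it cannot conjure memory that the model genuinely needs.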
Hey, I use a V100 and it works. Have you turned on high-RAM?
After training the model and saving it to your Google Drive folder, you should restart the runtime and run only the inference cells. That saves the RAM and storage needed to load the model.
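The failing step ("Reload model in FP16 and merge it with LoRA weights") is just adding a low-rank update into each adapted weight matrix, so it does not require the GPU at all. A tiny pure-PyTorch sketch of the arithmetic that a LoRA merge performs (shapes and the `alpha` scaling here are hypothetical, not this repo's values); the same idea lets you do the real merge on CPU in a fresh runtime and avoid the GPU OOM entirely:

```python
import torch

torch.manual_seed(0)
d, r, alpha = 8, 2, 16          # hidden size, LoRA rank, scaling -- hypothetical
W = torch.randn(d, d)           # frozen base weight
A = torch.randn(r, d)           # trained LoRA down-projection
B = torch.randn(d, r)           # trained LoRA up-projection
scale = alpha / r

# The merge folds the low-rank update into the base weight once:
merged = W + scale * (B @ A)

# After merging, a forward pass needs only the single merged matrix,
# and produces the same output as base + adapter applied separately.
x = torch.randn(1, d)
assert torch.allclose(x @ merged.T, x @ W.T + scale * (x @ A.T @ B.T), atol=1e-4)
```

Because the merge is a one-shot weight addition, loading the base model with `device_map={"": "cpu"}` (or on a high-RAM CPU runtime) and saving the merged checkpoint sidesteps the 15.77 GiB V100 limit; only inference afterwards needs the GPU.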
This still does not work, but I figure the authors of this project, or someone else, will eventually get it right, as others are quickly moving in to fill this turnkey niche for LLM training. The authors don't seem to realize that tech firms read this thread not only to find issues, but also to see how they get resolved.