Merge the model and store in Google Drive (Section) #15

Open
KabaTubare opened this issue Aug 27, 2023 · 3 comments

Comments

@KabaTubare

It always runs out of memory; please remedy this issue. This is the error I get constantly, and I am using Colab Pro with a V100, which I think should be enough for this project: 0/3 [02:11<?, ?it/s]

OutOfMemoryError                          Traceback (most recent call last)
in <cell line: 8>()
      6
      7 # Reload model in FP16 and merge it with LoRA weights
----> 8 base_model = AutoModelForCausalLM.from_pretrained(
      9     model_name,
     10     low_cpu_mem_usage=True,

4 frames
/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py in set_module_tensor_to_device(module, tensor_name, device, value, dtype, fp16_statistics)
    296             module._parameters[tensor_name] = param_cls(new_value, requires_grad=old_value.requires_grad)
    297         elif isinstance(value, torch.Tensor):
--> 298             new_value = value.to(device)
    299         else:
    300             new_value = torch.tensor(value, device=device)

OutOfMemoryError: CUDA out of memory. Tried to allocate 314.00 MiB (GPU 0; 15.77 GiB total capacity; 14.32 GiB already allocated; 2.12 MiB free; 14.45 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
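A workaround that usually avoids this OOM is to perform the merge entirely on the CPU, so the V100's ~16 GiB is never touched during the reload. A minimal sketch, assuming a PEFT LoRA adapter was saved to Drive by the training cells; the model name and paths below are placeholders, not taken from this thread:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

model_name = "NousResearch/Llama-2-7b-chat-hf"         # placeholder base model
adapter_path = "/content/drive/MyDrive/results/final"  # placeholder adapter dir

# device_map={"": "cpu"} keeps every weight in system RAM, so the merge
# step needs no GPU memory at all (a high-RAM runtime helps here).
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    device_map={"": "cpu"},
)
model = PeftModel.from_pretrained(base_model, adapter_path)
model = model.merge_and_unload()  # folds the LoRA deltas into the base weights

merged_path = "/content/drive/MyDrive/llama-2-merged"  # placeholder output dir
model.save_pretrained(merged_path)
AutoTokenizer.from_pretrained(model_name).save_pretrained(merged_path)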

@hieuminh65

I have this problem too.

@hieuminh65

Hey, I use a V100 and it works. Have you turned on high RAM?

After training the model and saving it to the Google Drive folder, you should restart the runtime and run only the cells that perform inference. That saves the RAM and storage needed to load the model.
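Concretely, after the restart the inference cell only needs to load the already-merged model from Drive. A minimal sketch of such a cell, assuming the merged model was saved to the Drive path below (a placeholder):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# After a restart, remount Drive first:
# from google.colab import drive; drive.mount('/content/drive')
merged_path = "/content/drive/MyDrive/llama-2-merged"  # placeholder Drive path

# With the runtime restarted the GPU is empty, so the merged FP16 model
# fits on the V100 without re-running any training cells.
model = AutoModelForCausalLM.from_pretrained(
    merged_path,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(merged_path)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("What does this model do?", max_new_tokens=64)[0]["generated_text"])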

@KabaTubare
Author

This still does not work, but I figure the authors of this project, or someone else, will eventually get it right, as others are coming quickly to offer this kind of turnkey solution for LLM training. The authors do not seem to realize that tech firms use this thread not only to find issues but also to see how they are resolved.
