Shrinking Llama training to suit one GPU #777

Closed
mahmoodn opened this issue Nov 15, 2024 · 1 comment

Comments

@mahmoodn

Hi,
Is it possible to run Llama training on 1 GPU for a test? I have tested with a smaller sequence length and a batch size of 1, but it seems that, because DeepSpeed is selected via `distributed_type: DEEPSPEED`, the run has to be a multi-node configuration. I cannot find any option other than DEEPSPEED.

Any idea about that?
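
For anyone landing here: assuming the launch configuration above is a Hugging Face Accelerate YAML file (only `distributed_type: DEEPSPEED` is quoted, so this is an assumption), a minimal single-GPU sketch would look roughly like the following; the file name and exact values are illustrative, not taken from this repo's configs:

```yaml
# single_gpu.yaml -- hypothetical file name; values are illustrative only
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED      # DeepSpeed does not by itself require multiple nodes
num_machines: 1                  # one node
num_processes: 1                 # one GPU
mixed_precision: bf16
deepspeed_config:
  zero_stage: 3                  # ZeRO-3 shards the most optimizer/parameter state
  offload_optimizer_device: cpu  # move optimizer state to host RAM
  offload_param_device: cpu      # move parameters to host RAM
  gradient_accumulation_steps: 1
```

Launching would then be something like `accelerate launch --config_file single_gpu.yaml train.py ...` (the training script name is a placeholder). Alternatively, `distributed_type: 'NO'` with `num_processes: 1` runs plain single-GPU training without DeepSpeed at all.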

@mahmoodn
Author

I have tried all the options suggested on the web, e.g. increasing lora_size, but it seems the 70B version needs 4 GPUs.
I tried the same dataset with the 7B model and was able to run it on a single GPU with batch sizes of 1, 2, 4, and 8.
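
For a rough sense of why the 70B model does not fit: with 16-bit (bf16/fp16) weights the base model alone is about 70e9 params × 2 bytes ≈ 140 GB, which already exceeds a single 80 GB GPU before activations, gradients, and optimizer state are counted, whereas the 7B model is only about 14 GB of weights and fits comfortably. This is a back-of-the-envelope estimate assuming no quantization and no CPU offload; with LoRA the base weights are frozen but still have to be resident in GPU memory (or offloaded/sharded, as in the ZeRO-3 sketch above).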
