Describe the bug
I'm using DeepSpeed to train Animate Anyone. I want to use stage 3 to reduce GPU VRAM usage. In theory, stage 3 should save more memory than stage 2, but in practice it does not.
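For reference, the expectation that stage 3 should use less memory than stage 2 can be sketched with the usual ZeRO accounting for mixed-precision Adam: 2 bytes per parameter for fp16 weights, 2 for fp16 gradients, and 12 for the fp32 optimizer states. This is only a rough estimate of model states. It ignores activations, temporary buffers, and fragmentation, and the parameter count and GPU count below are hypothetical placeholders, not measurements from this model.

```python
def zero_model_state_bytes(num_params: int, num_gpus: int, stage: int) -> float:
    """Approximate per-GPU bytes of model states under mixed-precision Adam.

    Covers only parameters, gradients, and optimizer states.
    Activations and temporary buffers are NOT included.
    """
    params = 2.0 * num_params   # fp16 parameters
    grads = 2.0 * num_params    # fp16 gradients
    optim = 12.0 * num_params   # fp32 master weights, momentum, variance

    if stage >= 1:
        optim /= num_gpus       # stage 1 partitions optimizer states
    if stage >= 2:
        grads /= num_gpus       # stage 2 also partitions gradients
    if stage >= 3:
        params /= num_gpus      # stage 3 also partitions parameters
    return params + grads + optim


if __name__ == "__main__":
    # Hypothetical sizes, for illustration only.
    n_params, n_gpus = 1_000_000_000, 8
    for stage in (0, 1, 2, 3):
        gib = zero_model_state_bytes(n_params, n_gpus, stage) / 2**30
        print(f"ZeRO stage {stage}: ~{gib:.2f} GiB of model states per GPU")
```

Under this accounting, stage 3 is always at or below stage 2 for model states, which is why the observed behaviour is surprising to me.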
To Reproduce
Use this repo.
Use DeepSpeed with Accelerate to train stage 2 (the second training stage of this model, not DeepSpeed ZeRO stage 2). You will find that DeepSpeed stage 3 uses more GPU memory than DeepSpeed stage 2. In addition, CPU offload does not reduce memory usage either, which confuses me.
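A minimal sketch of the kind of DeepSpeed configs being compared, in case it helps with reproduction. The `zero_optimization` keys (`stage`, `offload_optimizer`, `offload_param`) are standard DeepSpeed config options. The batch size, accumulation steps, and fp16 setting are placeholder assumptions, not the values from my run, and `make_zero_config` is a hypothetical helper name.

```python
import json


def make_zero_config(stage: int, cpu_offload: bool = False) -> dict:
    """Build a DeepSpeed config dict for a given ZeRO stage (hypothetical helper)."""
    zero = {"stage": stage}
    if cpu_offload:
        # Optimizer-state offload to CPU.
        zero["offload_optimizer"] = {"device": "cpu", "pin_memory": True}
        if stage == 3:
            # Parameter offload is only available with ZeRO stage 3.
            zero["offload_param"] = {"device": "cpu", "pin_memory": True}
    return {
        # Placeholder values, assumed for illustration only.
        "train_micro_batch_size_per_gpu": 1,
        "gradient_accumulation_steps": 1,
        "fp16": {"enabled": True},
        "zero_optimization": zero,
    }


if __name__ == "__main__":
    for stage, offload in [(2, False), (3, False), (3, True)]:
        print(f"# stage={stage}, cpu_offload={offload}")
        print(json.dumps(make_zero_config(stage, offload), indent=2))
```

Each printed config can be saved as a JSON file and referenced from the Accelerate DeepSpeed setup, so the only difference between runs is the ZeRO stage and the offload settings.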