I've been doing some experiments; here are my observations (using CPU offload, which is the default):

- At the beginning, memory usage is low.
- As inference progresses through more steps, memory usage increases substantially.

This is especially pronounced with the 768p model. On my 4090, running text2vid or img2vid inference at 768p starts out using only about 8 GB of VRAM, but by around step 13 it has consumed the entire 24 GB, which significantly slows down inference. The first 13 steps take about 5 minutes, whereas the last 2 steps alone (14 and 15) take about 15 minutes.
Is this the intended behavior? With CPU offload, shouldn't memory growth stop at some threshold that is never exceeded?
This is because Pyramid Flow is an autoregressive video generation model, so its VRAM usage grows as the number of history frames increases. Perhaps you can try emptying the CUDA cache at the end of each frame and see if that saves VRAM.
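Below is a minimal sketch of what that could look like. This is not the actual Pyramid Flow API: `generate_unit` is a hypothetical stand-in for the pipeline's per-unit denoising call, and the real change would go inside the generation loop in the pipeline code. The idea is to clear the CUDA cache after each autoregressive unit and keep the history latents on CPU so they don't accumulate in VRAM:

```python
import torch

def generate_unit(history):
    """Hypothetical stand-in for the pipeline's per-unit denoising call."""
    # The real pipeline would run the DiT on the GPU, conditioned on the
    # history latents; here we just allocate a dummy latent for illustration.
    return torch.randn(1, 16, 96, 96, device="cuda")

history_latents = []
for step in range(16):
    latent = generate_unit(history_latents)
    # Move the finished latent to CPU so the growing history does not
    # stay resident in VRAM across autoregressive steps.
    history_latents.append(latent.to("cpu"))
    del latent
    # Return cached allocator blocks to the driver after each unit.
    torch.cuda.empty_cache()
```

Note that `torch.cuda.empty_cache()` only releases memory held by PyTorch's caching allocator; it won't help if the history tensors themselves are still on the GPU, which is why the sketch also offloads them to CPU.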