This is a fantastic tool! I've been using it to generate several one-off images and I'm excited to expand upon it and contribute in any way I can.
It takes at least 20 minutes to generate a default 512x512 image, and roughly 15 minutes for a 256x256 image, on an AWS p2.xlarge instance. I get around 3.26s/it. The result is similar on a p2.8xlarge instance with 8 available GPUs. That is quite a long runtime, especially on expensive AWS instances.
I want to be able to scale the image generation workload, either by running multiple prompts simultaneously or by pooling the generation across multiple GPUs, to bring the draw time under 5 minutes.
Parallel prompt generation is more straightforward to accomplish. You can keep a map of active jobs on the visible GPUs and a queue of prompts to distribute among them, spinning up a new generate.py process for each available GPU.
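Something like this rough sketch is what I have in mind for the parallel-prompt approach. It's hypothetical: it assumes generate.py accepts a --prompt argument and respects CUDA_VISIBLE_DEVICES, so the flag names would need to be adjusted to whatever the real CLI takes.

```python
# Hypothetical sketch: one generate.py process per visible GPU, prompts pulled
# from a shared queue. Assumes generate.py takes --prompt and honors
# CUDA_VISIBLE_DEVICES; adapt the arguments to the actual script.
import os
import subprocess
from multiprocessing import Process, Queue
from queue import Empty

NUM_GPUS = 8  # e.g. a p2.8xlarge


def worker(gpu_id: int, prompts: Queue) -> None:
    """Drain the prompt queue, running generate.py pinned to a single GPU."""
    while True:
        try:
            prompt = prompts.get_nowait()
        except Empty:
            break
        subprocess.run(
            ["python", "generate.py", "--prompt", prompt],
            env={**os.environ, "CUDA_VISIBLE_DEVICES": str(gpu_id)},
            check=True,
        )


if __name__ == "__main__":
    queue = Queue()
    for p in ["a red fox in the snow", "a lighthouse at dusk"]:
        queue.put(p)

    workers = [Process(target=worker, args=(i, queue)) for i in range(NUM_GPUS)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```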
The hard part I have been trying to examine is the draw time itself. Is there a way to map a single generation job across multiple GPUs, whether local or distributed? This question has been raised in issues before, such as #24 and #47, but I want to bring it back. The challenge is that image generation is sequential: each iteration refines the same seeded base image, essentially producing the next layer procedurally, so you can't neatly split the "dataset" the way you can in other ML workloads. I am not as strong in ML engineering as I thought I was, so if someone can help me clarify what I am getting at, or tell me where I am wrong in this explanation/approach, I would greatly appreciate it and will close the issue.
I am familiar with AWS infrastructure such as EC2, SageMaker, Glue, and Batch, so if you think it's an infrastructure or hardware bottleneck, I can research more in that direction.
TL;DR: How do we scale up image generation? How can we cut the per-image generation time from ~20 minutes to ~5 minutes?
Let me know what you think. Thank you.