Regarding Inference Time #20
Comments
Maybe this issue will help: #1
Did you try running with autocast mode on and with fp16 weights? I think that is likely the default on the Python side; on the Rust side you may want to use the
Thank you for your suggestions. I tried the autocast feature and now get results in 9-10s. Is there any way to reduce inference time further? Also, I am really sorry about the stats above: they were incorrect because I confused them with some other results. The Rust SD implementation actually took around 12-13s to generate an image, whereas the normal SD pipeline took around 7-8s.
Quite soon, there will supposedly be a "Distilled Stable Diffusion" that should reduce inference time by at least 20x, maybe even more: https://twitter.com/EMostaque/status/1598131202044866560 The numbers are a bit confusing, but I think he means a 20x speedup in time per step, plus needing only 1-4 steps for a good image. So in total it would be more like a 100x speedup compared to now. Obviously I have no idea when exactly that will be available or how soon it can be implemented in this Rust version, but I hope it will be ideal for anyone who needs fast inference.
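To make the arithmetic behind that "more like 100x" estimate explicit, here is a minimal sketch. It assumes the reading given in the comment (a per-step speedup multiplied by the reduction in step count). The function name `total_speedup` and the baseline of 20 sampling steps are illustrative assumptions, not figures from the tweet or from this repository.

```python
def total_speedup(per_step_speedup: float, baseline_steps: int, distilled_steps: int) -> float:
    """Combined speedup = per-step speedup * (baseline steps / distilled steps)."""
    if baseline_steps <= 0 or distilled_steps <= 0:
        raise ValueError("step counts must be positive")
    return per_step_speedup * (baseline_steps / distilled_steps)


if __name__ == "__main__":
    # 20x per step and 1-4 steps come from the comment above;
    # the 20-step baseline is an assumed value chosen for illustration.
    for steps in (1, 2, 4):
        print(f"{steps} step(s): {total_speedup(20.0, 20, steps):.0f}x")
```

With these assumed inputs, 4 distilled steps give 20 * (20 / 4) = 100x; a different baseline step count changes the total proportionally.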
Thank you for your suggestion. I will surely check it out once it's available.
I tried the Rust implementation of Stable Diffusion v2 on an A100 GPU with 40 GB of RAM. The normal Stable Diffusion pipeline from Hugging Face takes around 7-8s to generate an image, whereas the Rust implementation takes around 12-13s. It would be really helpful if someone could explain why Hugging Face takes less time than the Rust implementation. Or am I missing something when running the Rust implementation?
Thanks!!
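When comparing two implementations like this, it may help to rule out one-off costs (weight loading, GPU initialization) by timing several runs after a warm-up call. The following is a minimal sketch of such a harness, not the method used for the numbers above. `benchmark` and `fake_generate_image` are hypothetical names, and the stand-in workload would need to be replaced by the real generation call.

```python
import statistics
import time
from typing import Callable, List


def benchmark(fn: Callable[[], object], warmup: int = 1, runs: int = 5) -> List[float]:
    """Return wall-clock durations (seconds) of `runs` calls, after `warmup` untimed calls."""
    for _ in range(warmup):
        fn()  # untimed: absorbs one-off setup costs on the first call(s)
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()  # assumed to block until the result is fully computed
        durations.append(time.perf_counter() - start)
    return durations


def fake_generate_image() -> int:
    # Hypothetical stand-in for a single image generation call.
    return sum(i * i for i in range(100_000))


if __name__ == "__main__":
    timings = benchmark(fake_generate_image, warmup=1, runs=5)
    print(f"median: {statistics.median(timings):.4f}s over {len(timings)} runs")
```

Reporting the median over several runs for both pipelines, with the same prompt, step count, and image size, makes the comparison less sensitive to a slow first call.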