
Regarding Inference Time #20

Closed
DhruvThu opened this issue Dec 19, 2022 · 5 comments

Comments

@DhruvThu

DhruvThu commented Dec 19, 2022

I have tried the Rust implementation of Stable Diffusion v2 on an A100 GPU with 40 GB of memory. The standard Stable Diffusion pipeline from HuggingFace takes around 7-8 s to generate an image, whereas the Rust implementation takes around 12-13 s. It would be really helpful if someone could explain why HuggingFace takes less time than the Rust implementation, or whether I am missing something while running the Rust version.

Thanks!!
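For reference, here is a minimal sketch of how the HuggingFace-side timing might be measured. This is an assumption on my part, not from the original post: the model ID, prompt, and step count are illustrative guesses.

```python
import time

import torch
from diffusers import StableDiffusionPipeline

# Assumed setup: Stable Diffusion v2.1 in fp16 on a single GPU.
# Model ID, prompt, and step count are guesses, not from the post.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",
    torch_dtype=torch.float16,
).to("cuda")

prompt = "a photo of an astronaut riding a horse"  # hypothetical prompt

# Warm-up run so one-time CUDA/cuDNN initialization is not counted.
pipe(prompt, num_inference_steps=50)

torch.cuda.synchronize()
start = time.time()
image = pipe(prompt, num_inference_steps=50).images[0]
torch.cuda.synchronize()
print(f"{time.time() - start:.1f}s per image")
```

Note the warm-up call and the `torch.cuda.synchronize()` around the timed region; without them, a benchmark like this can easily over- or under-count GPU time.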

@sssemil
Contributor

sssemil commented Dec 19, 2022

Maybe this issue will help - #1

@LaurentMazare
Owner

Did you try running with autocast mode on and with fp16 weights? I think that is likely the default on the Python side; on the Rust side you may want to use the `--autocast` flag to do this (though I haven't tested it on Stable Diffusion 2.1, as my GPU only has 8 GB of memory, which is not enough even with fp16).
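For what it's worth, here is a rough Python-side sketch of the two options being contrasted: loading fp16 weights directly versus running fp32 weights under autocast. The model ID and prompt are my assumptions, not from the thread:

```python
import torch
from diffusers import StableDiffusionPipeline

model_id = "stabilityai/stable-diffusion-2-1"  # assumed model ID
prompt = "a photo of an astronaut riding a horse"  # hypothetical prompt

# Option 1: load half-precision weights directly.
pipe_fp16 = StableDiffusionPipeline.from_pretrained(
    model_id, torch_dtype=torch.float16
).to("cuda")
image = pipe_fp16(prompt).images[0]

# Option 2: keep fp32 weights but run the forward passes under autocast,
# which is roughly what the --autocast flag on the Rust side corresponds to.
pipe_fp32 = StableDiffusionPipeline.from_pretrained(model_id).to("cuda")
with torch.autocast("cuda"):
    image = pipe_fp32(prompt).images[0]
```

Option 1 also halves the memory footprint, while option 2 keeps fp32 weights in memory and only lowers the precision of the compute, so their speed and memory behavior are not identical.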

@DhruvThu
Author

Thank you for your suggestions. I tried the autocast feature and got results in 9-10 s. Is there any way to reduce the inference time further?

Also, one more thing: I am really sorry about the stats above; they were incorrect because I confused them with some other results. The Rust SD actually took around 12-13 s to generate an image, whereas the normal SD pipeline took around 7-8 s.

@JohnAlcatraz

> Thank you for your suggestions. I tried the autocast feature and got results in 9-10 s. Is there any way to reduce the inference time further?

Quite soon, there will supposedly be "Distilled Stable Diffusion" that should reduce inference time by at least 20x, maybe even more:

https://twitter.com/EMostaque/status/1598131202044866560

The numbers are a bit confusing, but I think he means a 20x speedup in time per step, plus additionally needing only 1-4 steps for a good image. So in total it's more like a 100x speedup compared to now.
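As a rough back-of-the-envelope reading (my interpretation, not from the tweet): if the per-step time drops by a factor $s$ and the step count goes from $N$ down to $n$, the end-to-end speedup is roughly

$$\text{speedup} \approx s \cdot \frac{N}{n},$$

so a 20x per-step speedup combined with a several-fold reduction in steps is how the total ends up well past 20x overall.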

Obviously I have no idea when exactly that will be available or how soon it can be implemented in this Rust version, but I hope it will be ideal for anyone who needs fast inference speed.

@DhruvThu
Author

Thank you for your suggestion. I will surely check it out once it's available.
