CPU Offloading #8
CPU offloading is used to reduce memory usage. The error you encountered is because our …
@JY-Joy Thank you for the response. I wonder if there is a way to carve off the VRAM usage a bit more. I tried experimenting with some of the diffusers memory-saving techniques with no success, but I think you would know best. It would be really great if this worked on consumer-grade PCs. (I can confirm it works on a 4090 and on a Mac M1 Max with 64GB of memory, although it consumes around 47GB during inference.) Here's one user who wanted to try but failed: https://x.com/Teslanaut/status/1854985331915034995 Do you think there's room for any optimization?
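For reference, this is roughly the set of standard diffusers memory-saving switches the commenter may have tried. A minimal sketch only, assuming a stock SDXL pipeline; the repo id shown is the SDXL base, not InstantIR's adapted weights, and InstantIR's custom modules may not be covered by all of these:

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,  # halves weight memory relative to fp32
)

pipe.enable_attention_slicing()   # compute attention in smaller slices
pipe.enable_vae_slicing()         # decode latents one image at a time
pipe.enable_vae_tiling()          # tile the VAE for large resolutions
pipe.enable_model_cpu_offload()   # keep only the active submodule on the GPU
# pipe.enable_sequential_cpu_offload()  # alternative: even lower VRAM, much slower
```

The last two are mutually exclusive; sequential offload is the extreme end of the memory/speed trade-off.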
Yeah, of course there is room for optimization; we will have those diffusers optimizations supported in the near future. However, it is quite strange that InstantIR consumes 47GB of VRAM. I checked the tweet you mentioned: the user reported that 20 MiB of VRAM were missing on a 3080 Ti, which has 12GB of total capacity. Sadly, in our current implementation it is recommended to deploy InstantIR on devices with at least 22GB of VRAM. We will try to optimize this, but I have to say there will be a trade-off with efficiency.
I run a Mac and Pinokio. It might be useful to understand that Mac owners usually have 32GB of RAM and almost never 64GB, since macOS uses disk space as virtual memory, so there is rarely a need to buy 64GB of RAM anymore like there was in the past. Sadly, your amazing, killer, top-notch app does not work with 32GB of RAM on Macs when run through Pinokio. I also think many Mac users are not aware that they could ask for such RAM optimization right here with the developers themselves; otherwise you would have many, many, many more people wondering whether this can be optimized for 32GB of RAM on Macs and Pinokio. It would be so amazing if you could make it work on Macs with 32GB of RAM :)
I'd also love to CPU-offload this so it can run on a system with 16GB or less, understanding that it'll slow things down. I've integrated InstantIR as an upscale method in my app at AEIONic.com, as an option alongside RealESRGAN, AuraSR, etc., and got it working within all the pipelines. However, I misunderstood it: it's not necessarily an upscaler but more of an image repairer. I also didn't expect it to max out all the VRAM, but at least it still runs, even though one image takes an hour. I tried enabling CPU offloading too, got the same error as above, dug into the pipelines, and realized it wasn't quite implemented in the aggregator; I couldn't figure out a hack. I wish I had a 24GB+ card, but is there any chance of optimizing it somehow? Maybe there's a way to use TorchAO, or quantize it with bitsandbytes or something? It'd be nice if it officially gets adapted into the Diffusers library.
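As a rough illustration of the bitsandbytes idea, this is what 8-bit loading looks like in recent diffusers releases (quantization support landed around v0.31). A hedged sketch under those assumptions, not a confirmed fix: the repo id is stock SDXL rather than InstantIR's weights, and InstantIR's custom aggregator would still need separate handling:

```python
import torch
from diffusers import BitsAndBytesConfig, UNet2DConditionModel

# 8-bit weight quantization via bitsandbytes (CUDA only).
quant_config = BitsAndBytesConfig(load_in_8bit=True)

# Load just the SDXL UNet, the largest single component, in 8-bit.
unet = UNet2DConditionModel.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    subfolder="unet",
    quantization_config=quant_config,
)
```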
Thanks for your careful investigation @Skquark; InstantIR is indeed designed to be an image repairer. By upscaling, did you mean outputting images larger than 1024px? At present the maximum output resolution is constrained by SDXL's capacity, and by your device, of course.
For sure. Well, it still has its usefulness, just not for my upscaling intent. I'm going to move it in my app to its own tab and make it a utility tool instead of an image post-processor. I still won't be able to run it on my own computer, but others can get some use out of it. Thanks, let us know when we have a way to optimize...
I saw the line
pipe.enable_model_cpu_offload()
here https://github.com/instantX-research/InstantIR/blob/main/pipelines/sdxl_instantir.py#L113C13-L113C44 and tried the approach with the gradio app, but I get the following error. What else needs to be done in the code to fix this error and make it work?
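For context on why the call can fail: diffusers' `enable_model_cpu_offload()` walks the pipeline's `model_cpu_offload_seq` class attribute and attaches an offload hook to each registered submodule named there, so a custom component that is neither registered nor listed breaks the mechanism. A minimal sketch of what wiring in a custom component might look like, assuming standard diffusers mechanics; the class below is illustrative only and is not InstantIR's actual implementation:

```python
from diffusers import StableDiffusionXLPipeline

class CustomOffloadPipeline(StableDiffusionXLPipeline):
    # Order in which submodules are shuttled on and off the GPU; a custom
    # component left out of this sequence never gets an offload hook.
    model_cpu_offload_seq = "text_encoder->text_encoder_2->aggregator->unet->vae"

    def __init__(self, vae, text_encoder, text_encoder_2, tokenizer,
                 tokenizer_2, unet, scheduler, aggregator):
        super().__init__(vae, text_encoder, text_encoder_2, tokenizer,
                         tokenizer_2, unet, scheduler)
        # Registration is what lets enable_model_cpu_offload() find the module.
        self.register_modules(aggregator=aggregator)
```

With the custom module registered and listed in the sequence, `pipe.enable_model_cpu_offload()` should be able to hook it like any other submodule.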