Reducing container file size #20
-
Hey again! I had a follow-up question and figured it would make more sense to post it here than in the original thread. So I was able to get the standalone version working on a RunPod serverless endpoint, then I built my own image -> published it to Docker Hub -> and again was able to successfully deploy the custom image to the serverless endpoint. Both are working great and as advertised, no issues on that end!

There's just one thing that's bothering me a bit, and that's the size of my container: it's not much different from the latest standalone release version, which is … So I was hoping you could answer these two questions:
Line 20 in 0b6e049

I know in the comment you say these are all needed, but I'm not using many of Fooocus's features: I only need text2img for the obvious purpose, and for img2img I am only using the … I just don't want to start deleting things, because I suspect Fooocus might still be using some of these models under the hood for the tasks I mentioned, even if that isn't immediately clear from their names. For example, I have a feeling that the FaceSwap feature is made possible through segmentation by models like … Sorry if this last one is more of a …

As always, much thanks, you've been crazy helpful answering my many many questions haha
Replies: 1 comment 3 replies
-
Since container size on its own won't affect generation speed (the model sizes and resources of course can, but only once they're loaded for a task), the main reason to keep the image small is your own time spent uploading it, and perhaps waiting for new workers to download it.

But of course, it makes sense to remove models that you're sure you won't need. In fact, they make up most of the image size (currently 17.7 GB uncompressed), which is also why konieshadow/fooocus-api is much smaller: it does not include them, but instead downloads the base models the first time you launch it, and the rest whenever you hit certain controlnet endpoints.

Here's a list of models and when they're used:

```json
{
  "juggernautXL_v8Rundiffusion.safetensors (and all /checkpoints models)": "base model",
  "sd_xl_offset_example-lora_1.0.safetensors (and your own /loras files)": "base LoRa",
  "sdxl_lightning_4step_lora.safetensors": "used for Lightning speed option",
  "sdxl_lcm_lora.safetensors": "used for Extreme Speed option",
  "fooocus_inpaint_head.pth": "used for inpainting",
  "inpaint.fooocus.patch": "used for inpainting",
  "inpaint_v25.fooocus.patch": "used for inpainting when inpaint_engine:v2.5",
  "inpaint_v26.fooocus.patch": "used for inpainting when inpaint_engine:v2.6",
  "control-lora-canny-rank128.safetensors": "used in img2img and txt2imgip when cn_type:PyraCanny",
  "fooocus_xl_cpds_128.safetensors": "used in img2img and txt2imgip when cn_type:CPDS",
  "fooocus_ip_negative.safetensors": "when mixing image prompt and vary upscale, mixing image prompt and inpaint or face",
  "ip-adapter-plus_sdxl_vit-h.bin": "when mixing image prompt and vary upscale, mixing image prompt and inpaint",
  "ip-adapter-plus-face_sdxl_vit-h.bin": "used for inpainting when cn_type:Face",
  "fooocus_upscaler_s409985e5.bin": "used for inpaint/outpaint and upscale/vary",
  "clip_vision_vit_h.safetensors": "when mixing image prompt and vary upscale, mixing image prompt and inpaint or face",
  "xlvaeapp.pth": "base SDXL VAE",
  "vaeapp_sd15.pth": "base SD1.5 VAE",
  "xl-to-v1_interposer-v3.1.safetensors": "used when interposing SD1.5 LoRas to SDXL",
  "pytorch_model.bin": "base Fooocus expansion model",
  "detection_Resnet50_Final.pth": "used for face restoration",
  "parsing_parsenet.pth": "used for face restoration",
  "model_base_caption_capfilt_large.pth": "used for CLIP interrogation"
}
```

Whenever unsure, you can always spin up a local Fooocus installation or Fooocus-API in Docker, delete models, and run the processes you're planning to use to see which files get downloaded in the console output.
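One Docker detail worth knowing when you prune: image layers are additive, so a `RUN rm` after a `COPY` does not reclaim space — the deleted files still live in the earlier layer. If your own image copies the models in from the build context, the simplest fix is to exclude the unneeded ones via `.dockerignore` so they never enter any layer. A minimal sketch (the paths below are illustrative guesses, not Fooocus's confirmed layout):

```
# .dockerignore (sketch) — keep models you won't use out of the build
# context entirely. Adjust paths to your actual model folders.
models/loras/sdxl_lcm_lora.safetensors
models/inpaint/
```

If the models are instead downloaded during the build, the equivalent is to delete them in the same `RUN` instruction that fetches them, before the layer is committed.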