How about making Load Diffusion Model+ support GGUF? #27
Comments
I'm not sure it would make sense. The difference between the loaders is being able to set some extra dtypes; the advanced GGUF loader lets you enter the dequant and patch dtypes. You can't enter something like FP8 there, and I'm not sure how that would work anyway. Dequantizing to fast FP8 might be possible, but dequantizing to another quantized format is probably going to be a big quality loss (and would likely need support implemented in ComfyUI-GGUF). It also would only work on GPUs that support the FP8 ops with ComfyUI's …
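As a rough illustration of the dtype plumbing involved (a minimal sketch, not the ComfyUI-GGUF implementation; the toy block format and helper name are made up, and `torch.float8_e4m3fn` is assumed to be available in the installed PyTorch):

```python
# Minimal sketch: why "dequantize to FP8" is an extra lossy cast on top of
# the normal GGUF dequantization step.
import torch

def dequantize_blocks(qweight: torch.Tensor, scales: torch.Tensor,
                      dequant_dtype: torch.dtype = torch.float16) -> torch.Tensor:
    """Toy block dequantizer: int8 blocks * per-block scale -> dequant_dtype.

    Real GGUF formats (Q4_K, Q5_K, ...) are more involved; this only shows
    what a "dequant dtype" option actually controls.
    """
    # qweight: (n_blocks, block_size) int8, scales: (n_blocks, 1) float32
    return (qweight.to(torch.float32) * scales).to(dequant_dtype)

q = torch.randint(-127, 128, (64, 32), dtype=torch.int8)
s = torch.rand(64, 1) * 0.01

w_fp16 = dequantize_blocks(q, s, torch.float16)   # the usual path
w_fp8 = w_fp16.to(torch.float8_e4m3fn)            # a second lossy cast; something
# still has to be able to run the FP8 ops on these weights at inference time.
```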
Do you mean converting GGUF models to the native format so that they work with ComfyUI better?
I'm using GGUF, which works just like normal safetensors, and it seems the threshold has a lot to do with the number of steps.
Yes, LoRA support should be better, since it fuses the LoRA into the weights, which makes inference faster.
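For reference, "fusing" here just means baking the low-rank delta into the base weight once, so inference doesn't pay for an extra matmul per layer. A generic sketch (not this node's actual patching code; names are illustrative):

```python
# W' = W + (alpha / rank) * (lora_up @ lora_down)
import torch

def fuse_lora(weight: torch.Tensor, lora_down: torch.Tensor,
              lora_up: torch.Tensor, alpha: float, rank: int) -> torch.Tensor:
    """Return a new weight tensor with the LoRA delta baked in."""
    scale = alpha / rank
    delta = lora_up.to(torch.float32) @ lora_down.to(torch.float32)  # (out, in)
    return (weight.to(torch.float32) + scale * delta).to(weight.dtype)

# Note: with GGUF the base weight is quantized, so fusing means dequantizing
# first (or patching on the fly), which is where the dequant/patch dtype
# options come into play.
```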
How about making Load Diffusion Model+ support GGUF? There are examples of loading models, including GGUF:
https://github.com/blepping/ComfyUI_FluxMod/tree/feat_gguf
lodestone-rock/ComfyUI_FluxMod#15
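For anyone sketching this out: a combined loader would mostly need to branch on the file extension and hand GGUF files off to the GGUF path. A minimal, hypothetical sketch (helper name made up; the linked feat_gguf branch is the real reference, and the tensors returned here are still quantized and would need ComfyUI-GGUF's dequant/patch handling before use):

```python
import torch
from safetensors.torch import load_file

def load_model_weights(path: str) -> dict[str, torch.Tensor]:
    if path.endswith(".gguf"):
        # The reference gguf Python package reads the file memory-mapped;
        # copy the data out and keep the (still quantized) tensors by name.
        from gguf import GGUFReader
        reader = GGUFReader(path)
        return {t.name: torch.from_numpy(t.data.copy()) for t in reader.tensors}
    # Plain safetensors checkpoints load directly as torch tensors.
    return load_file(path)
```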