
failed to quantize: unknown model architecture: 'flux' #133

Open
GamingDaveUk opened this issue Oct 20, 2024 · 3 comments
@GamingDaveUk

Trying to quantize some flux models to lower the VRAM requirements, and I get the error below.

(venv) C:\AI\llama.cpp\build>bin\Debug\llama-quantize.exe "C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-F16.gguf" "C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-Q4_k_m.gguf" Q4_K_M
main: build = 3600 (2fb92678)
main: built with MSVC 19.41.34120.0 for x64
main: quantizing 'C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-F16.gguf' to 'C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-Q4_k_m.gguf' as Q4_K_M
llama_model_loader: loaded meta data with 3 key-value pairs and 780 tensors from C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-F16.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = flux
llama_model_loader: - kv   1:               general.quantization_version u32              = 2
llama_model_loader: - kv   2:                          general.file_type u32              = 1
llama_model_loader: - type  f16:  780 tensors
llama_model_quantize: failed to quantize: unknown model architecture: 'flux'
main: failed to quantize model from 'C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-F16.gguf'

Is flux not supported for quantization?

@city96
Owner

city96 commented Oct 20, 2024

Did the patch apply successfully? That's the default error when you try to use the base llama.cpp llama-quantize binary without the patch applied iirc.

@GamingDaveUk
Author

GamingDaveUk commented Oct 20, 2024

No, there was a CRC error on the patch; I assumed that meant the patch was already in the main code.

I have llama.cpp installed in its own instance, so it was a pain to follow the instructions; I may have messed up a step.

I will try again tomorrow when I'm more awake.

@city96
Owner

city96 commented Oct 21, 2024

Okay yeah, that's probably the problem then. The actual upstream repo isn't meant for image models; the patch is the part that adds support for quantizing flux.

If you post the actual error where the patch apply fails I might be able to help out.

(It could just be this as well, i.e. line ending mismatch due to git converting them when cloning: #90 (comment) )
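One way to rule out the line-ending issue before retrying is to re-clone with git's CRLF conversion disabled and dry-run the patch first. This is only a rough sketch, not the repo's official instructions; the patch filename `lcpp.patch` and the paths are assumptions for illustration:

```shell
# Re-clone llama.cpp without letting git rewrite line endings to CRLF on
# checkout; on Windows, converted line endings can break the patch context.
git clone -c core.autocrlf=false https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Dry-run first: --check reports whether the patch would apply cleanly
# without modifying any files. "../lcpp.patch" is a placeholder path.
git apply --check ../lcpp.patch

# If the check passes, apply it for real, then rebuild llama-quantize.
git apply ../lcpp.patch
```

If `git apply --check` still fails, the exact error it prints (which hunk, which file) is what would be useful to post in the thread.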
