Trying to quantise some flux models to lower the VRAM requirements, and I get the error below.
```
(venv) C:\AI\llama.cpp\build>bin\Debug\llama-quantize.exe "C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-F16.gguf" "C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-Q4_k_m.gguf" Q4_K_M
main: build = 3600 (2fb92678)
main: built with MSVC 19.41.34120.0 for x64
main: quantizing 'C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-F16.gguf' to 'C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-Q4_k_m.gguf' as Q4_K_M
llama_model_loader: loaded meta data with 3 key-value pairs and 780 tensors from C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-F16.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0: general.architecture          str = flux
llama_model_loader: - kv   1: general.quantization_version  u32 = 2
llama_model_loader: - kv   2: general.file_type             u32 = 1
llama_model_loader: - type f16: 780 tensors
llama_model_quantize: failed to quantize: unknown model architecture: 'flux'
main: failed to quantize model from 'C:\AI\ComfyUI_windows_portable\ComfyUI\models\checkpoints\flux\fluxTestModel_small-F16.gguf'
```
Is flux not supported for quantisation?
Did the patch apply successfully? That's the default error you get when you run the base llama.cpp llama-quantize binary without the patch applied, iirc.
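For reference, the usual setup looks roughly like this. It's just a sketch: I'm assuming the patch file is tools/lcpp.patch from a ComfyUI-GGUF checkout sitting next to llama.cpp, and tag b3600 since your log says build = 3600; adjust the paths to your layout.

```
REM Sketch of the expected setup (patch path and tag assumed, adjust to your layout)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
git checkout tags/b3600

REM Assumed location: tools/lcpp.patch inside a ComfyUI-GGUF checkout next to llama.cpp
git apply ..\ComfyUI-GGUF\tools\lcpp.patch

REM Rebuild llama-quantize so the flux architecture is recognised
cmake -B build
cmake --build build --config Debug --target llama-quantize
```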
Okay yeah, that's probably the problem then. The actual upstream repo isn't meant for image models; the patch is the part that adds support for quantizing flux.
If you post the actual error where the patch apply fails, I might be able to help out.
(It could also just be a line-ending mismatch from git converting them when cloning: #90 (comment).)
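If you want to rule that out, something like this should tell you whether the patch applies cleanly. These are standard git flags; the patch path is the same assumption as above:

```
REM Dry run: reports apply errors without modifying anything
git apply --check ..\ComfyUI-GGUF\tools\lcpp.patch

REM If it fails on whitespace/line endings, try ignoring them
git apply --ignore-whitespace ..\ComfyUI-GGUF\tools\lcpp.patch

REM Or re-clone with CRLF conversion disabled so the patch context matches
git clone --config core.autocrlf=false https://github.com/ggerganov/llama.cpp
```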