SYCL Intel build broken? #542
Comments
Similar issue on Arch Linux, also with intel-oneapi-basekit 2025.0.1. GPU is a Battlemage B580.

./build/bin/sd --diffusion-model /mnt/hdd1/sd-models/flux1-dev-q4_k.gguf --vae /mnt/hdd1/sd-models/ae.safetensors --clip_l /mnt/hdd1/sd-models/clip_l.safetensors --t5xxl /mnt/hdd1/sd-models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'Battlemage B580'" --cfg-scale 1.0 --sampling-method euler_a -v
Option:
n_threads: 8
mode: txt2img
model_path:
wtype: unspecified
clip_l_path: /mnt/hdd1/sd-models/clip_l.safetensors
clip_g_path:
t5xxl_path: /mnt/hdd1/sd-models/t5xxl_fp16.safetensors
diffusion_model_path: /mnt/hdd1/sd-models/flux1-dev-q4_k.gguf
vae_path: /mnt/hdd1/sd-models/ae.safetensors
taesd_path:
esrgan_path:
controlnet_path:
embeddings_path:
stacked_id_embeddings_path:
input_id_images_path:
style ratio: 20.00
normalize input image : false
output_path: output.png
init_img:
mask_img:
control_image:
clip on cpu: false
controlnet cpu: false
vae decoder on cpu:false
diffusion flash attention:false
strength(control): 0.90
prompt: a lovely cat holding a sign says 'Battlemage B580'
negative_prompt:
min_cfg: 1.00
cfg_scale: 1.00
slg_scale: 0.00
guidance: 3.50
clip_skip: -1
width: 512
height: 512
sample_method: euler_a
schedule: default
sample_steps: 20
strength(img2img): 0.75
rng: cuda
seed: 42
batch_count: 1
vae_tiling: false
upscale_repeats: 1
System Info:
SSE3 = 1
AVX = 1
AVX2 = 1
AVX512 = 0
AVX512_VBMI = 0
AVX512_VNNI = 0
FMA = 1
NEON = 0
ARM_FMA = 0
F16C = 1
FP16_VA = 0
WASM_SIMD = 0
VSX = 0
[DEBUG] stable-diffusion.cpp:181 - Using SYCL backend
[SYCL] call ggml_check_sycl
ggml_check_sycl: GGML_SYCL_DEBUG: 0
ggml_check_sycl: GGML_SYCL_F16: no
found 1 SYCL devices:
| | | | |Max | |Max |Global | |
| | | | |compute|Max work|sub |mem | |
|ID| Device Type| Name|Version|units |group |group|size | Driver version|
|--|-------------------|---------------------------------------|-------|-------|--------|-----|-------|---------------------|
| 0| [level_zero:gpu:0]| Intel Graphics [0xe20b]| 20.1| 160| 1024| 32| 12168M| 1.6.31907|
ggml_sycl_init: GGML_SYCL_FORCE_MMQ: no
ggml_sycl_init: SYCL_USE_XMX: yes
ggml_sycl_init: found 1 SYCL devices:
[INFO ] stable-diffusion.cpp:202 - loading clip_l from '/mnt/hdd1/sd-models/clip_l.safetensors'
[INFO ] model.cpp:888 - load /mnt/hdd1/sd-models/clip_l.safetensors using safetensors format
[DEBUG] model.cpp:959 - init from '/mnt/hdd1/sd-models/clip_l.safetensors'
[INFO ] stable-diffusion.cpp:216 - loading t5xxl from '/mnt/hdd1/sd-models/t5xxl_fp16.safetensors'
[INFO ] model.cpp:888 - load /mnt/hdd1/sd-models/t5xxl_fp16.safetensors using safetensors format
[DEBUG] model.cpp:959 - init from '/mnt/hdd1/sd-models/t5xxl_fp16.safetensors'
[INFO ] stable-diffusion.cpp:223 - loading diffusion model from '/mnt/hdd1/sd-models/flux1-dev-q4_k.gguf'
[INFO ] model.cpp:885 - load /mnt/hdd1/sd-models/flux1-dev-q4_k.gguf using gguf format
[DEBUG] model.cpp:902 - init from '/mnt/hdd1/sd-models/flux1-dev-q4_k.gguf'
[INFO ] stable-diffusion.cpp:230 - loading vae from '/mnt/hdd1/sd-models/ae.safetensors'
[INFO ] model.cpp:888 - load /mnt/hdd1/sd-models/ae.safetensors using safetensors format
[DEBUG] model.cpp:959 - init from '/mnt/hdd1/sd-models/ae.safetensors'
[INFO ] stable-diffusion.cpp:242 - Version: Flux
[INFO ] stable-diffusion.cpp:275 - Weight type: f16
[INFO ] stable-diffusion.cpp:276 - Conditioner weight type: f16
[INFO ] stable-diffusion.cpp:277 - Diffusion model weight type: q4_K
[INFO ] stable-diffusion.cpp:278 - VAE weight type: f32
[DEBUG] stable-diffusion.cpp:280 - ggml tensor size = 400 bytes
[INFO ] stable-diffusion.cpp:319 - set clip_on_cpu to true
[INFO ] stable-diffusion.cpp:322 - CLIP: Using CPU backend
[DEBUG] clip.hpp:171 - vocab size: 49408
[DEBUG] clip.hpp:182 - trigger word img already in vocab
[INFO ] flux.hpp:889 - Flux blocks: 19 double, 38 single
[DEBUG] ggml_extend.hpp:1111 - clip params backend buffer size = 235.06 MB(RAM) (196 tensors)
[DEBUG] ggml_extend.hpp:1111 - t5 params backend buffer size = 9083.77 MB(RAM) (219 tensors)
[DEBUG] ggml_extend.hpp:1111 - flux params backend buffer size = 6604.64 MB(VRAM) (780 tensors)
[DEBUG] ggml_extend.hpp:1111 - vae params backend buffer size = 94.57 MB(VRAM) (138 tensors)
[DEBUG] stable-diffusion.cpp:417 - loading weights
[DEBUG] model.cpp:1698 - loading tensors from /mnt/hdd1/sd-models/clip_l.safetensors
|======> | 196/1440 - 0.00it/s
[DEBUG] model.cpp:1698 - loading tensors from /mnt/hdd1/sd-models/t5xxl_fp16.safetensors
|==============> | 413/1440 - 0.00it/s
[INFO ] model.cpp:1868 - unknown tensor 'text_encoders.t5xxl.transformer.encoder.embed_tokens.weight | f16 | 2 [4096, 32128, 1, 1, 1]' in model file
|==============> | 416/1440 - 9.43it/s
[DEBUG] model.cpp:1698 - loading tensors from /mnt/hdd1/sd-models/flux1-dev-q4_k.gguf
|=========================================> | 1196/1440 - 50.00it/s
[DEBUG] model.cpp:1698 - loading tensors from /mnt/hdd1/sd-models/ae.safetensors
|==============================================> | 1334/1440 - 200.00it/s
[INFO ] stable-diffusion.cpp:516 - total params memory size = 16018.05MB (VRAM 6699.22MB, RAM 9318.83MB): clip 9318.83MB(RAM), unet 6604.64MB(VRAM), vae 94.57MB(VRAM), controlnet 0.00MB(VRAM), pmid 0.00MB(RAM)
[INFO ] stable-diffusion.cpp:520 - loading model from '' completed, taking 6.22s
[INFO ] stable-diffusion.cpp:537 - running in Flux FLOW mode
[DEBUG] stable-diffusion.cpp:594 - finished loaded file
[DEBUG] stable-diffusion.cpp:1535 - txt2img 512x512
[DEBUG] stable-diffusion.cpp:1230 - prompt after extract and remove lora: "a lovely cat holding a sign says 'Battlemage B580'"
[INFO ] stable-diffusion.cpp:682 - Attempting to apply 0 LoRAs
[INFO ] stable-diffusion.cpp:1235 - apply_loras completed, taking 0.00s
[DEBUG] conditioner.hpp:1027 - parse 'a lovely cat holding a sign says 'Battlemage B580'' to [['a lovely cat holding a sign says 'Battlemage B580'', 1], ]
[DEBUG] clip.hpp:311 - token length: 77
[DEBUG] t5.hpp:397 - token length: 256
[DEBUG] clip.hpp:736 - Missing text_projection matrix, assuming identity...
[DEBUG] ggml_extend.hpp:1062 - clip compute buffer size: 1.40 MB(RAM)
[DEBUG] clip.hpp:736 - Missing text_projection matrix, assuming identity...
[DEBUG] ggml_extend.hpp:1062 - t5 compute buffer size: 68.25 MB(RAM)
[DEBUG] conditioner.hpp:1142 - computing condition graph completed, taking 5611 ms
[INFO ] stable-diffusion.cpp:1368 - get_learned_condition completed, taking 5613 ms
[INFO ] stable-diffusion.cpp:1391 - sampling using Euler A method
[INFO ] stable-diffusion.cpp:1428 - generating image: 1/1 - seed 42
[DEBUG] stable-diffusion.cpp:798 - Sample
[DEBUG] ggml_extend.hpp:1062 - flux compute buffer size: 398.50 MB(VRAM)
No kernel named _ZTSZN14bin_bcast_syclIXadL_ZL6op_addffEEEclIfffEEvR25ggml_backend_sycl_contextPK11ggml_tensorS6_PS4_PKT_PKT0_PT1_PN4sycl3_V15queueEEUlNSH_7nd_itemILi3EEEE0_ was found
Exception caught at file:/mnt/hdd1/ipex/stable-diffusion.cpp/ggml/src/ggml-sycl/common.cpp, line:102
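As a triage aid (not part of the original report): the mangled symbol in the error identifies which SYCL kernel was missing, and `c++filt` from GNU binutils can demangle it. Even without demangling, the fragments `14bin_bcast_sycl` and `L6op_addff` in the raw name already encode `bin_bcast_sycl` instantiated with `op_add(float, float)`, i.e. the float broadcast-add kernel of ggml's SYCL backend.

```shell
# Demangle the kernel name from the error above to see which op failed.
# c++filt ships with GNU binutils; nothing here is specific to this repo.
echo '_ZTSZN14bin_bcast_syclIXadL_ZL6op_addffEEEclIfffEEvR25ggml_backend_sycl_contextPK11ggml_tensorS6_PS4_PKT_PKT0_PT1_PN4sycl3_V15queueEEUlNSH_7nd_itemILi3EEEE0_' \
  | c++filt
# The readable form names bin_bcast_sycl<op_add> over ggml_tensor inputs,
# so the missing kernel is the elementwise broadcast-add for f32.
```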
The only version currently working for me is 0.2.1 from stable-diffusion-cpp-python.
Thanks for the heads-up! This one actually works :D
This looks like a regression somewhere; 1c168d9 works.