
Bug: win-vulkan-x64 crashed since b3831 #9708

Closed
cwt opened this issue Oct 1, 2024 · 1 comment
Labels: bug-unconfirmed, critical severity (used to report critical severity bugs in llama.cpp, e.g. crashing, corruption, data loss), stale

Comments


cwt commented Oct 1, 2024

What happened?

Since the ReBAR change (#9251) was merged in b3831, loading a model onto an AMD RX 580 on Windows via Vulkan crashes; a ReBAR-detection sketch follows the notes below.
FYI:

  • My mainboard (Z97-HD3) does not support ReBAR (Resizable BAR).
  • Releases before b3831 work fine.
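
For reference, here is a minimal sketch (plain Vulkan via vulkan-hpp, not llama.cpp's actual code) of how a ReBAR-style memory type can be detected. Resizable BAR exposes a large DEVICE_LOCAL | HOST_VISIBLE memory type; without ReBAR, that host-visible device-local heap is typically capped at 256 MiB, which matches the 268435456-byte (exactly 256 MiB) allocation that fails in the log below.

// rebar_check.cpp - hedged sketch; assumes only the Vulkan SDK headers/loader.
#include <vulkan/vulkan.hpp>
#include <cstdio>

int main() {
    vk::ApplicationInfo app("rebar-check", 1, nullptr, 0, VK_API_VERSION_1_1);
    vk::UniqueInstance instance =
        vk::createInstanceUnique(vk::InstanceCreateInfo({}, &app));

    for (const vk::PhysicalDevice &dev : instance->enumeratePhysicalDevices()) {
        vk::PhysicalDeviceMemoryProperties mem = dev.getMemoryProperties();
        for (uint32_t i = 0; i < mem.memoryTypeCount; ++i) {
            vk::MemoryPropertyFlags f = mem.memoryTypes[i].propertyFlags;
            if ((f & vk::MemoryPropertyFlagBits::eDeviceLocal) &&
                (f & vk::MemoryPropertyFlagBits::eHostVisible)) {
                vk::DeviceSize heap =
                    mem.memoryHeaps[mem.memoryTypes[i].heapIndex].size;
                // Heuristic: a DEVICE_LOCAL|HOST_VISIBLE heap much larger than
                // 256 MiB usually means Resizable BAR is active.
                std::printf("memory type %u: DEVICE_LOCAL|HOST_VISIBLE heap = %llu MiB\n",
                            i, (unsigned long long)(heap / (1024 * 1024)));
            }
        }
    }
    return 0;
}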

Name and Version

❯ .\llama-cli.exe --version
version: 3831 (89f99449)
built with MSVC 19.29.30154.0 for x64

What operating system are you seeing the problem on?

Windows

Relevant log output

ggml_vulkan: Found 1 Vulkan devices:
Vulkan0: Radeon RX 580 Series (AMD proprietary driver) | uma: 0 | fp16: 0 | warp size: 64
llm_load_tensors: ggml ctx size =    0.27 MiB
llm_load_tensors: offloading 32 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 33/33 layers to GPU
llm_load_tensors: Radeon RX 580 Series buffer size =  4095.05 MiB
llm_load_tensors:        CPU buffer size =    70.31 MiB
..............................................................................................
llama_new_context_with_model: n_ctx      = 2048
llama_new_context_with_model: n_batch    = 2048
llama_new_context_with_model: n_ubatch   = 512
llama_new_context_with_model: flash_attn = 0
llama_new_context_with_model: freq_base  = 10000.0
llama_new_context_with_model: freq_scale = 1
ggml_vulkan: Device memory allocation of size 268435456 failed.
ggml_vulkan: vk::Device::allocateMemory: ErrorOutOfDeviceMemory
llama_kv_cache_init: failed to allocate buffer for kv cache
llama_new_context_with_model: llama_kv_cache_init() failed for self-attention cache
llama_init_from_gpt_params: failed to create context with model '..\westlake-7b-v2.Q4_K_M.gguf'
main: error: unable to load model
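
The failing call above is vk::Device::allocateMemory on a 268435456-byte (exactly 256 MiB) KV-cache buffer. As a purely hypothetical illustration (not what llama.cpp actually does), vulkan-hpp surfaces this as a vk::OutOfDeviceMemoryError exception, which a caller could catch to retry against a plain device-local memory type:

// Hypothetical sketch, not llama.cpp's code: retry a failed allocation from
// the ReBAR-style memory type using plain device-local memory instead.
#include <vulkan/vulkan.hpp>
#include <stdexcept>

static uint32_t find_type(const vk::PhysicalDeviceMemoryProperties &mem,
                          vk::MemoryPropertyFlags flags) {
    for (uint32_t i = 0; i < mem.memoryTypeCount; ++i)
        if ((mem.memoryTypes[i].propertyFlags & flags) == flags)
            return i;
    throw std::runtime_error("no matching memory type");
}

vk::DeviceMemory alloc_with_fallback(vk::Device device,
                                     const vk::PhysicalDeviceMemoryProperties &mem,
                                     vk::DeviceSize size) {
    try {
        // First try the ReBAR-style type: device-local and host-visible.
        return device.allocateMemory(vk::MemoryAllocateInfo(
            size, find_type(mem, vk::MemoryPropertyFlagBits::eDeviceLocal |
                                 vk::MemoryPropertyFlagBits::eHostVisible)));
    } catch (const vk::OutOfDeviceMemoryError &) {
        // Without ReBAR this heap is only ~256 MiB; fall back to plain
        // device-local memory (uploads then need a host staging buffer).
        return device.allocateMemory(vk::MemoryAllocateInfo(
            size, find_type(mem, vk::MemoryPropertyFlagBits::eDeviceLocal)));
    }
}
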
cwt added the bug-unconfirmed and critical severity labels Oct 1, 2024
github-actions bot added the stale label Nov 1, 2024

github-actions bot commented:

This issue was closed because it has been inactive for 14 days since being marked as stale.
