examples : remove finetune and train-text-from-scratch #8669

Conversation
My immediate next priority will be to start working on training in ggml/llama.cpp. Though whether or not these examples are removed on master doesn't matter since they will still be in the git history. I definitely have a strong preference for not having broken code on master since that just needlessly clogs up the issues. |
Good to hear that @JohannesGaessler! Yeah, I also bookmark some PRs when things get removed. I think that if you're rewriting the training / finetune examples, it's probably easier to start from a blank file instead of modifying the existing code (correct me if I'm wrong). I did it that way when I rewrote … |
Sad to see these were just removed. I used |
What are your qualifications relevant to training? |
I'm fluent in CUDA/OpenCL and know the codebase pretty well including the build automation. I may be most helpful as QA/tester if you need one. I also have a pretty good lab of various hardware devices for testing. |
One way that you could definitely help would be with code/PR review. As of right now it is kind of a bottleneck. I myself am part of the problem since I am mostly just working on llama.cpp/ggml as a hobby in my free time when I feel like it and my motivation to do things myself is simply much higher than my motivation to review the things that other people did. |
That I'm happy to do! 👍 |
We can still use llama-finetune, right? |
In short: for now, use the PEFT Python library if you want to finetune a model. |
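Since the thread doesn't spell out what the PEFT route looks like, here is a minimal sketch of a LoRA fine-tune with the peft and transformers libraries. The base-model path, dataset file, output directory, target modules, and hyperparameters are all placeholders, not anything recommended in this PR:

```python
# Minimal LoRA fine-tuning sketch with PEFT + transformers (placeholders throughout).
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "path/to/hf-base-model"   # placeholder: a Hugging Face checkpoint, not a GGUF file
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Attach LoRA adapters to the attention projections instead of training all weights.
# target_modules assumes a llama-style architecture.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# Plain-text training data, tokenized into fixed-length chunks.
data = load_dataset("text", data_files={"train": "train.txt"})["train"]
data = data.map(lambda x: tokenizer(x["text"], truncation=True, max_length=512),
                remove_columns=["text"])

Trainer(
    model=model,
    args=TrainingArguments(output_dir="my-lora-adapter", num_train_epochs=1,
                           per_device_train_batch_size=1, learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()

model.save_pretrained("my-lora-adapter")
tokenizer.save_pretrained("my-lora-adapter")
```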
Can PEFT (Python) train a GGUF model? |
I don't know, but it's always easier with a base model from HF (transformers library). Some people want to use … Also, … |
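To make the GGUF question concrete: PEFT itself trains against the HF checkpoint, not a GGUF file, so the usual workflow is to fine-tune the HF model, merge the LoRA adapters back into it, and only then convert the merged directory to GGUF with llama.cpp's HF-to-GGUF conversion script. A rough sketch of the merge step (directory names are placeholders):

```python
# Fold LoRA adapters back into the base weights so the result is a plain HF
# checkpoint that can then be converted to GGUF with llama.cpp's convert script.
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_dir = "my-lora-adapter"   # placeholder: output of a PEFT training run
merged_dir = "my-merged-model"    # placeholder: plain HF model directory

model = AutoPeftModelForCausalLM.from_pretrained(adapter_dir)
model = model.merge_and_unload()  # merge the LoRA deltas into the base weights
model.save_pretrained(merged_dir)

tokenizer = AutoTokenizer.from_pretrained(adapter_dir)
tokenizer.save_pretrained(merged_dir)
```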
I don't even have the words for all the feels I'm feeling right now over this. I've been taking a ton of time attempting to learn Vulkan on the side so I could integrate it into this, and it's just removed because no one updated it. That's just sad. IMHO, this has so much more value than converting models and has so much more potential in the long term. This is a short-sighted removal. |
I've been playing around with the code all evening and I have been unable to reproduce the reported reason for the removal. The only thing I did notice was that the updates broke compatibility with the older pre-trained models. Other than that, it's working fine with the latest master branch. I'm currently pre-training a model with one of my custom datasets and it's operating as expected.

Screencast.from.2024-07-26.21-55-50.webm |
I can confirm that, at least for the Llama 2 model, the finetune function works pretty well. |
…#8669)
* examples : remove finetune and train-text-from-scratch
* fix build
* update help message
* fix small typo for export-lora
That's not cool! I used the “finetune” quite often. I can't believe it's being removed now. I also used the “train-from-scratch” a lot for educational purposes (for myself). Thank you @JohannesGaessler for all the work you have done and for the additional knowledge I have been able to gather thanks to your work (finetune and train from scratch). I have spent many valuable hours with these two programs. |
It's not me that made |
Oh, where did I get that assumption from? Well okay, there's no harm in thanking someone “for no reason” :D |
Just to clarify: with "no reason" I meant in this narrow context. At least I know you are one of the main devs in this project. |
I'm attempting to work on it. I ran into a weird issue with one of the latest commits. Still haven't narrowed it down, though.

short output:

ggml_vk_create_queue()
ggml_vulkan memory: ggml_backend_vk_host_buffer_type_alloc_buffer(243539968)
ggml_vulkan memory: ggml_vk_host_malloc(243540000)
ggml_vk_create_buffer(AMD Radeon RX 7600 XT (RADV NAVI33), 243540000, { HostVisible | HostCoherent | HostCached }, { HostVisible | HostCoherent })
print_params: n_vocab: 32768
print_params: n_ctx: 256
print_params: n_embd: 256
print_params: n_head: 8
print_params: n_ff: 768
print_params: n_layer: 16
print_params: n_rot: 32
main: total train_iterations 0
main: seen train_samples 0
main: seen train_tokens 0
main: completed train_epochs 0
main: model_size = 243648192 bytes (232.4 MB)
main: opt_size = 365006976 bytes (348.1 MB)
main: opt iter 0
ggml_vk_get_device(0)
ggml_vulkan memory: ggml_backend_vk_host_buffer_type_alloc_buffer(536887296)
ggml_vulkan memory: ggml_vk_host_malloc(536887328)
ggml_vk_create_buffer(AMD Radeon RX 7600 XT (RADV NAVI33), 536887328, { HostVisible | HostCoherent | HostCached }, { HostVisible | HostCoherent })
main: input_size = 536887328 bytes (512.0 MB)
ggml_vk_get_device(0)
/mnt/valerie/forked/ggerganov/llama.cpp/ggml/src/ggml.c:17147: GGML_ASSERT(replacements->set.keys[k] == NULL) failed
ptrace: Operation not permitted.
No stack.
The program is not being run.
[1] 152976 IOT instruction (core dumped) ./build/bin/llama-train-text-from-scratch --vocab-model --ctx 256 --embd 256

I find it ironic that I finally figured this out and this happens right around the timing of this PR. lol. I have to laugh. Otherwise, I'll cry. All those hours spent getting here.

full output:

21:20:25 | /mnt/valerie/forked/ggerganov/llama.cpp
(.venv) git:(mistral.cpp | Δ) λ ./build/bin/llama-train-text-from-scratch \
--vocab-model models/ggml-vocab-mistral.gguf \
--ctx 256 --embd 256 --head 8 --layer 16 \
--checkpoint-in /mnt/valerie/models/teleprint-me/valerie/v0.3/chk-valerie-v0.3-256x16-LATEST.gguf \
--checkpoint-out /mnt/valerie/models/teleprint-me/valerie/v0.3/chk-valerie-v0.3-256x16-ITERATION.gguf \
--model-out /mnt/valerie/models/teleprint-me/valerie/v0.3/ggml-valerie-v0.3-256x16-f16-ITERATION.gguf \
--train-data "/mnt/valerie/datasets/valerie/cyberpunk/wiki/wiki-combined.md" \
-t 16 -b 16 --seed 1 --adam-iter 2000 \
--save-every 250 --n-gpu-layers 17
main: seed: 1
llama_model_loader: loaded meta data with 32 key-value pairs and 0 tensors from models/ggml-vocab-mistral.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = llama
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Mistral 7B Instruct v0.3
llama_model_loader: - kv 3: general.version str = v0.3
llama_model_loader: - kv 4: general.finetune str = Instruct
llama_model_loader: - kv 5: general.basename str = Mistral
llama_model_loader: - kv 6: general.size_label str = 7B
llama_model_loader: - kv 7: general.license str = apache-2.0
llama_model_loader: - kv 8: llama.block_count u32 = 32
llama_model_loader: - kv 9: llama.context_length u32 = 32768
llama_model_loader: - kv 10: llama.embedding_length u32 = 4096
llama_model_loader: - kv 11: llama.feed_forward_length u32 = 14336
llama_model_loader: - kv 12: llama.attention.head_count u32 = 32
llama_model_loader: - kv 13: llama.attention.head_count_kv u32 = 8
llama_model_loader: - kv 14: llama.rope.freq_base f32 = 1000000.000000
llama_model_loader: - kv 15: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 16: general.file_type u32 = 1
llama_model_loader: - kv 17: llama.vocab_size u32 = 32768
llama_model_loader: - kv 18: llama.rope.dimension_count u32 = 128
llama_model_loader: - kv 19: tokenizer.ggml.add_space_prefix bool = true
llama_model_loader: - kv 20: tokenizer.ggml.model str = llama
llama_model_loader: - kv 21: tokenizer.ggml.pre str = default
llama_model_loader: - kv 22: tokenizer.ggml.tokens arr[str,32768] = ["<unk>", "<s>", "</s>", "[INST]", "[...
llama_model_loader: - kv 23: tokenizer.ggml.scores arr[f32,32768] = [-1000.000000, -1000.000000, -1000.00...
llama_model_loader: - kv 24: tokenizer.ggml.token_type arr[i32,32768] = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...
llama_model_loader: - kv 25: tokenizer.ggml.bos_token_id u32 = 1
llama_model_loader: - kv 26: tokenizer.ggml.eos_token_id u32 = 2
llama_model_loader: - kv 27: tokenizer.ggml.unknown_token_id u32 = 0
llama_model_loader: - kv 28: tokenizer.ggml.add_bos_token bool = true
llama_model_loader: - kv 29: tokenizer.ggml.add_eos_token bool = false
llama_model_loader: - kv 30: tokenizer.chat_template str = {{ bos_token }}{% for message in mess...
llama_model_loader: - kv 31: general.quantization_version u32 = 2
llm_load_vocab: special tokens cache size = 771
llm_load_vocab: token to piece cache size = 0.1731 MB
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = llama
llm_load_print_meta: vocab type = SPM
llm_load_print_meta: n_vocab = 32768
llm_load_print_meta: n_merges = 0
llm_load_print_meta: vocab_only = 1
llm_load_print_meta: model type = ?B
llm_load_print_meta: model ftype = all F32
llm_load_print_meta: model params = 0.00 K
llm_load_print_meta: model size = 0.00 MiB (-nan BPW)
llm_load_print_meta: general.name = Mistral 7B Instruct v0.3
llm_load_print_meta: BOS token = 1 '<s>'
llm_load_print_meta: EOS token = 2 '</s>'
llm_load_print_meta: UNK token = 0 '<unk>'
llm_load_print_meta: LF token = 781 '<0x0A>'
llm_load_print_meta: max token length = 48
llama_model_load: vocab only - skipping tensors
llama_new_context_with_model: n_ctx = 512
llama_new_context_with_model: n_batch = 512
llama_new_context_with_model: n_ubatch = 512
llama_new_context_with_model: flash_attn = 0
llama_new_context_with_model: freq_base = 0.0
llama_new_context_with_model: freq_scale = 1
main: init model
gguf_init_from_file: failed to open '/mnt/valerie/models/teleprint-me/valerie/v0.3/chk-valerie-v0.3-256x16-LATEST.gguf': 'No such file or directory'
ggml_vk_instance_init()
ggml_vulkan: Found 1 Vulkan devices:
ggml_vk_print_gpu_info(0)
Vulkan0: AMD Radeon RX 7600 XT (RADV NAVI33) (radv) | uma: 0 | fp16: 1 | warp size: 64
ggml_vk_get_device(0)
Initializing new vk_device
ggml_vk_find_queue_family_index()
ggml_vk_find_queue_family_index()
ggml_vk_create_queue()
ggml_vk_load_shaders(AMD Radeon RX 7600 XT (RADV NAVI33))
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f32_l, main, 3, 56, (128,128,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f32_m, main, 3, 56, (64,64,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f32_s, main, 3, 56, (32,32,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f32_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f32_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f32_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f32_f16_l, main, 3, 56, (128,128,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f32_f16_m, main, 3, 56, (64,64,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f32_f16_s, main, 3, 56, (32,32,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f32_f16_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f32_f16_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f32_f16_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f16_l, main, 3, 56, (128,128,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f16_m, main, 3, 56, (64,64,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f16_s, main, 3, 56, (32,32,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f16_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f16_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f16_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f16_f32_l, main, 3, 56, (128,128,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f16_f32_m, main, 3, 56, (64,64,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f16_f32_s, main, 3, 56, (32,32,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f16_f32_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f16_f32_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_f16_f32_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_0_f32_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_0_f32_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_0_f32_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_0_f32_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_0_f32_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_0_f32_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_1_f32_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_1_f32_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_1_f32_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_1_f32_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_1_f32_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_1_f32_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_0_f32_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_0_f32_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_0_f32_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_0_f32_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_0_f32_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_0_f32_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_1_f32_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_1_f32_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_1_f32_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_1_f32_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_1_f32_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_1_f32_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q8_0_f32_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q8_0_f32_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q8_0_f32_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q8_0_f32_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q8_0_f32_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q8_0_f32_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q2_k_f32_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q2_k_f32_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q2_k_f32_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q2_k_f32_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q2_k_f32_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q2_k_f32_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q3_k_f32_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q3_k_f32_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q3_k_f32_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q3_k_f32_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q3_k_f32_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q3_k_f32_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_k_f32_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_k_f32_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_k_f32_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_k_f32_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_k_f32_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q4_k_f32_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_k_f32_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_k_f32_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_k_f32_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_k_f32_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_k_f32_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q5_k_f32_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q6_k_f32_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q6_k_f32_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q6_k_f32_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q6_k_f32_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q6_k_f32_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_q6_k_f32_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_iq4_nl_f32_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_iq4_nl_f32_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_iq4_nl_f32_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_iq4_nl_f32_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_iq4_nl_f32_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_iq4_nl_f32_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f32_l, main, 4, 52, (128,128,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f32_m, main, 4, 52, (64,64,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f32_s, main, 4, 52, (32,32,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f32_aligned_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f32_aligned_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f32_aligned_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f16_l, main, 4, 52, (128,128,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f16_m, main, 4, 52, (64,64,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f16_s, main, 4, 52, (32,32,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f16_aligned_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f16_aligned_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f16_aligned_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f16_f32_l, main, 4, 52, (128,128,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f16_f32_m, main, 4, 52, (64,64,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f16_f32_s, main, 4, 52, (32,32,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f16_f32_aligned_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f16_f32_aligned_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_f16_f32_aligned_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_0_f32_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_0_f32_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_0_f32_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_0_f32_aligned_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_0_f32_aligned_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_0_f32_aligned_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_1_f32_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_1_f32_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_1_f32_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_1_f32_aligned_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_1_f32_aligned_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_1_f32_aligned_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_0_f32_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_0_f32_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_0_f32_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_0_f32_aligned_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_0_f32_aligned_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_0_f32_aligned_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_1_f32_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_1_f32_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_1_f32_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_1_f32_aligned_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_1_f32_aligned_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_1_f32_aligned_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q8_0_f32_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q8_0_f32_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q8_0_f32_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q8_0_f32_aligned_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q8_0_f32_aligned_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q8_0_f32_aligned_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q2_k_f32_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q2_k_f32_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q2_k_f32_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q2_k_f32_aligned_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q2_k_f32_aligned_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q2_k_f32_aligned_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q3_k_f32_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q3_k_f32_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q3_k_f32_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q3_k_f32_aligned_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q3_k_f32_aligned_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q3_k_f32_aligned_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_k_f32_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_k_f32_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_k_f32_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_k_f32_aligned_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_k_f32_aligned_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q4_k_f32_aligned_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_k_f32_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_k_f32_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_k_f32_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_k_f32_aligned_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_k_f32_aligned_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q5_k_f32_aligned_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q6_k_f32_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q6_k_f32_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q6_k_f32_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q6_k_f32_aligned_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q6_k_f32_aligned_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_q6_k_f32_aligned_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_iq4_nl_f32_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_iq4_nl_f32_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_iq4_nl_f32_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_iq4_nl_f32_aligned_l, main, 4, 52, (128,128,1), specialization_constants, 128)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_iq4_nl_f32_aligned_m, main, 4, 52, (64,64,1), specialization_constants, 64)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), matmul_id_iq4_nl_f32_aligned_s, main, 4, 52, (32,32,1), specialization_constants, 32)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_f32_f32_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_f16_f32_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q4_0_f32_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q4_1_f32_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q5_0_f32_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q5_1_f32_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q8_0_f32_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q2_k_f32_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q3_k_f32_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q4_k_f32_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q5_k_f32_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q6_k_f32_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_iq4_nl_f32_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_f32_f16_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_f16_f16_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q4_0_f16_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q4_1_f16_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q5_0_f16_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q5_1_f16_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q8_0_f16_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q2_k_f16_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q3_k_f16_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q4_k_f16_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q5_k_f16_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_q6_k_f16_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_iq4_nl_f16_f32, main, 3, 44, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_id_f32_f32, main, 4, 36, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_id_f16_f32, main, 4, 36, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_id_q4_0_f32, main, 4, 36, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_id_q4_1_f32, main, 4, 36, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_id_q5_0_f32, main, 4, 36, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_id_q5_1_f32, main, 4, 36, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_id_q8_0_f32, main, 4, 36, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_id_q2_k_f32, main, 4, 36, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_id_q3_k_f32, main, 4, 36, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_id_q4_k_f32, main, 4, 36, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_id_q5_k_f32, main, 4, 36, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_id_q6_k_f32, main, 4, 36, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_id_iq4_nl_f32, main, 4, 36, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), f32_to_f16, main, 2, 20, (4096,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), dequant_q4_0, main, 2, 20, (4096,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), dequant_q4_1, main, 2, 20, (4096,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), dequant_q5_0, main, 2, 20, (4096,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), dequant_q5_1, main, 2, 20, (4096,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), dequant_q8_0, main, 2, 20, (4096,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), dequant_q2_k, main, 2, 20, (16384,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), dequant_q3_k, main, 2, 20, (16384,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), dequant_q4_k, main, 2, 20, (8192,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), dequant_q5_k, main, 2, 20, (16384,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), dequant_q6_k, main, 2, 20, (16384,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), dequant_iq4_nl, main, 2, 20, (4096,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_f32, main, 3, 112, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_f16, main, 3, 112, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_q4_0, main, 3, 112, (1024,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_q4_1, main, 3, 112, (1024,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_q5_0, main, 3, 112, (1024,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_q5_1, main, 3, 112, (1024,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_q8_0, main, 3, 112, (1024,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_iq4_nl, main, 3, 112, (1024,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_f32_f32, main, 3, 112, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_f16_f32, main, 3, 112, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_q4_0_f32, main, 3, 112, (1024,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_q4_1_f32, main, 3, 112, (1024,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_q5_0_f32, main, 3, 112, (1024,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_q5_1_f32, main, 3, 112, (1024,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_q8_0_f32, main, 3, 112, (1024,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), get_rows_iq4_nl_f32, main, 3, 112, (1024,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), split_k_reduce, main, 2, 8, (256,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_p021_f16_f32, main, 3, 24, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_mat_vec_nc_f16_f32, main, 3, 28, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), norm_f32, main, 2, 16, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), rms_norm_f32, main, 2, 16, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), cpy_f32_f32, main, 2, 80, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), cpy_f32_f16, main, 2, 80, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), cpy_f16_f16, main, 2, 80, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), add_f32, main, 3, 112, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), mul_f32, main, 3, 112, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), div_f32, main, 3, 112, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), scale_f32, main, 2, 80, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), sqr_f32, main, 2, 80, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), clamp_f32, main, 2, 80, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), gelu_f32, main, 2, 16, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), silu_f32, main, 2, 16, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), relu_f32, main, 2, 16, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), diag_mask_inf_f32, main, 2, 12, (512,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), soft_max_f32, main, 3, 28, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), soft_max_f32_f16, main, 3, 28, (1,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), rope_norm_f32, main, 4, 44, (1,512,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), rope_norm_f16, main, 4, 44, (1,512,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), rope_neox_f32, main, 4, 44, (1,512,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), rope_neox_f16, main, 4, 44, (1,512,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), argsort_f32, main, 2, 12, (1024,1,1), specialization_constants, 1)
ggml_vk_create_pipeline(AMD Radeon RX 7600 XT (RADV NAVI33), sum_rows_f32, main, 2, 16, (1,1,1), specialization_constants, 1)
ggml_vk_create_queue()
ggml_vulkan memory: ggml_backend_vk_host_buffer_type_alloc_buffer(243539968)
ggml_vulkan memory: ggml_vk_host_malloc(243540000)
ggml_vk_create_buffer(AMD Radeon RX 7600 XT (RADV NAVI33), 243540000, { HostVisible | HostCoherent | HostCached }, { HostVisible | HostCoherent })
print_params: n_vocab: 32768
print_params: n_ctx: 256
print_params: n_embd: 256
print_params: n_head: 8
print_params: n_ff: 768
print_params: n_layer: 16
print_params: n_rot: 32
main: total train_iterations 0
main: seen train_samples 0
main: seen train_tokens 0
main: completed train_epochs 0
main: model_size = 243648192 bytes (232.4 MB)
main: opt_size = 365006976 bytes (348.1 MB)
main: opt iter 0
ggml_vk_get_device(0)
ggml_vulkan memory: ggml_backend_vk_host_buffer_type_alloc_buffer(536887296)
ggml_vulkan memory: ggml_vk_host_malloc(536887328)
ggml_vk_create_buffer(AMD Radeon RX 7600 XT (RADV NAVI33), 536887328, { HostVisible | HostCoherent | HostCached }, { HostVisible | HostCoherent })
main: input_size = 536887328 bytes (512.0 MB)
ggml_vk_get_device(0)
/mnt/valerie/forked/ggerganov/llama.cpp/ggml/src/ggml.c:17147: GGML_ASSERT(replacements->set.keys[k] == NULL) failed
ptrace: Operation not permitted.
No stack.
The program is not being run.
[1] 152976 IOT instruction (core dumped) ./build/bin/llama-train-text-from-scratch --vocab-model --ctx 256 --embd 256

If I can get this working and there's consensus, I can open a PR with the updates and upgrades to train-text-from-scratch. I haven't used the finetune yet, but I could take a peek at it too. I've been saying this for a while: I am interested in training and fine-tuning with this framework. There is a lot of low-hanging fruit here. I have so many ideas, but I'm one person with limited time and resources and can only do so much at once. @mounta11n I believe @xaedes is the original author of the training and tuning code; it's in the commit history. If anyone is wondering, this is me offering to maintain it when I can. |
> ptrace: Operation not permitted.

It would probably help if you could enable ptrace on your system, or if you can step through it in a debugger. Any update to your drivers or Vulkan stack in the same timeframe? |
You can enable a debug build and step through it with:

cmake -B build -DCMAKE_BUILD_TYPE=Debug -DGGML_VULKAN=1 -DGGML_VULKAN_DEBUG=0 -DLLAMA_CURL=0 -DGGML_CCACHE=0
cmake --build build --config Debug -j 16

I haven't worked on it since this post, but the backtrace is simple enough.

gdb -ex=run --args build/bin/llama-train-text-from-scratch \
--vocab-model models/ggml-vocab-mistral.gguf \
--ctx 256 --embd 256 --head 8 --layer 16 \
--checkpoint-in /mnt/valerie/models/teleprint-me/valerie/v0.3/chk-valerie-v0.3-256x16-LATEST.gguf \
--checkpoint-out /mnt/valerie/models/teleprint-me/valerie/v0.3/chk-valerie-v0.3-256x16-ITERATION.gguf \
--model-out /mnt/valerie/models/teleprint-me/valerie/v0.3/ggml-valerie-v0.3-256x16-f16-ITERATION.gguf \
--train-data "/mnt/valerie/datasets/valerie/cyberpunk/wiki/wiki-combined.md" \
-t 16 -b 8 --seed 1 --adam-iter 50 \
--save-every 10 --n-gpu-layers 17

Note that this issue has nothing to do with Vulkan; it will trigger with a CPU-only build. Running the backtrace with gdb is super simple:

0x00007ffff6ea53f4 in ?? () from /usr/lib/libc.so.6
(gdb) bt
#0 0x00007ffff6ea53f4 in ?? () from /usr/lib/libc.so.6
#1 0x00007ffff6e4c120 in raise () from /usr/lib/libc.so.6
#2 0x00007ffff6e334c3 in abort () from /usr/lib/libc.so.6
#3 0x00007ffff74abfec in ggml_abort (file=0x7ffff763e1a8 "/mnt/valerie/forked/ggerganov/llama.cpp/ggml/src/ggml.c", line=17147,
fmt=0x7ffff763e436 "GGML_ASSERT(%s) failed") at /mnt/valerie/forked/ggerganov/llama.cpp/ggml/src/ggml.c:207
#4 0x00007ffff74e598a in ggml_build_backward_gradient_checkpointing (ctx=0x7ffff78f46c8 <g_state+296>, gf=0x7fffe8f99030, gb=0x7fffe903a0c0,
gb_tmp=0x7fffe90db150, checkpoints=0x555555876e10, n_checkpoints=26) at /mnt/valerie/forked/ggerganov/llama.cpp/ggml/src/ggml.c:17147
#5 0x00005555555bb16a in llama_build_train_graphs (model=0x7fffffffd690, alloc=0x5555562a8870, ctx=0x7ffff78f46c8 <g_state+296>, gf=0x7fffe8f99030,
gb=0x7fffe903a0c0, gb_tmp=0x7fffe90db150, logits=0x7fffffffd420, tokens_input=0x55555584b4c0, targets=0x55555584b630, n_tokens=256, n_batch=8,
enable_flash_attn=false, enable_checkpointing=true, measure_only=true)
at /mnt/valerie/forked/ggerganov/llama.cpp/examples/train-text-from-scratch/train-text-from-scratch.cpp:402
#6 0x00005555555c07b9 in main (argc=31, argv=0x7fffffffda98)
at /mnt/valerie/forked/ggerganov/llama.cpp/examples/train-text-from-scratch/train-text-from-scratch.cpp:1117
(gdb)

What I was able to discover is that commit 2b1f616 is where the breaking change occurs, and that's because @slaren made some changes to the backend which seem to have affected the computational graph (the "dag"). I've done some cursory investigation into it, but put it on pause for a bit. I've been taking some time off due to stress and personal/financial issues (it seems counter-intuitive, but I won't be much use to anyone, including myself, if I burn out). I'm also busy with gig work, which is why I usually go MIA from time to time. I always check in when I can, though. |
@ggerganov would you consider a PR if we fix/revamp finetune, at least for Llama 2 and/or 3? |
Be aware that I'm currently working on training in general and that the API may change: ggerganov/ggml#949 |
I think that, given the ongoing work by @JohannesGaessler on general-purpose training capabilities, it's better not to resurrect these examples yet and instead wait for the new API and functionality to settle. Otherwise, we'll be dealing with too many conflicts and bug reports. |
Can I put in a large vote for the replacement finetuning example? I use this a lot and have just come back to do a new build, only to discover it's disappeared. Incidentally, shouldn't a PR removing a major feature have a bit more documented justification than this? It just points to this thread, which doesn't appear to have any conclusive support for removing it, as not everyone could replicate the stated issues... I particularly love that the recommended "alternative" is to "use PEFT python if you want to finetune a model." This seems to be a Python library which doesn't even function without someone learning how the library works and writing a Python application around it from scratch. |
The new GGML training code is mostly functional, see ggerganov/ggml#988. The ETA for re-adding llama.cpp training support is 1-2 months, I think. |
To be honest, I lost complete interest once this was removed and saw that there was little interest in keeping it around. I've reviewed the code and have read the docs, specifications, and the codebase as a whole, and one thing that @ngxson got right is that it is extremely complicated. It's made even more complicated now by the fact that the original training code was removed. I really appreciate the work that @JohannesGaessler is doing, because it takes a certain level of dedication and persistence to keep pushing this forward. I understand why MNIST is being used: it's simpler. Much simpler. That's why my initial goal was to do an XOR model, which is probably the simplest model you can create.

Due to the level of complexity involved, which I recognize is due to the vast amount of support for multiple models, CPU, and GPU architectures, I decided to do something else instead. I rewound to the earliest commit possible for llama.cpp and began working on the original code that @ggerganov implemented, which is way easier to parse and understand because it's only for LLaMA 1. I have been working on this code for the past week and ended up creating my own model format in the process. I didn't originally intend to do this, but it made parsing the model file much simpler, more coherent, and more consistent, without concerning myself with all of the complexity involved in the current code base. I realized that there are issues with the LLaMA 1 … I chose Mistral for a variety of reasons, as it uses a lot of modern features, the …

Once I get inference working on CPU, I'm going to implement a completely custom Vulkan-only backend, because I need one in C for other purposes that I'm not ready to share. Even though @0cc4m has done amazing work with the Vulkan backend, it's written in C++ and they're already comfortable and familiar with the Vulkan API. After working on the fundamentals in C, I was able to better understand the API as a result, so I think this path is useful for me as a learning experience.

So, my overarching goal of training and fine-tuning an LLM from scratch remains the same. My rationale is that I can learn the fundamentals as I progress. I've noticed that every time I do this and return to the ggml and llama.cpp code, I better comprehend what's happening and why certain design choices were made. @slaren I am truly impressed with your work and how you handled the complexity so well. My overall hope is that I can provide more meaningful contributions in the future. As a result, once I get the basics up and running, I'm hoping to be able to export the custom model formats back to the modern GGML format so that they're compatible with inference.

As of now, it's too early to share anything. I will respect the MIT license and will open source what I've done once it's ready. This will allow me to learn at my own pace. If anyone is genuinely interested in my approach (which is nothing new), I'll be willing to open source it earlier than I originally planned. The reason I haven't shared is that I don't want to release something that causes issues for the developers here, e.g. unnecessary confusion.

Sorry if this was long. I've had a rough go of it the past few years, this year has been especially difficult for me, and I tend to become a recluse because it's less stressful for me. I tend to be happier when I'm coding on my own and ignoring the world around me. It gives me a chance to recharge, if that makes any sense.
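As an aside on the XOR remark above, this is roughly what "the simplest model you can create" amounts to: a throwaway NumPy sketch (not ggml, not anything from this repo) of a tiny MLP trained on XOR with full-batch gradient descent:

```python
# 2-4-1 MLP with sigmoid activations trained on XOR; illustrative only.
import numpy as np

rng = np.random.default_rng(42)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5

for _ in range(20000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass for mean-squared error
    d_out = (out - y) * out * (1.0 - out)
    d_h = (d_out @ W2.T) * h * (1.0 - h)
    # full-batch gradient descent update
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(out.round(3))  # typically converges toward [0, 1, 1, 0]
```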
As a teaser, mostly on a more positive note and for fun, here's what I got so far.

22:32:32 | /mnt/valerie/forked/ggerganov/llama.cpp
(.venv) git:(alt.cpp | Δ) λ python -m gguf.mistral-to-gguf -m models/mistral-1/7B -c "config.json" -t 1
Loading model part: pytorch_model-00001-of-00002.bin
Loading model part: pytorch_model-00002-of-00002.bin
FILE start marker starts at 0
FILE start marker ends at 4
Writing hyperparameters section...
Aligned offset with 28 bytes of padding.
FILE 0xdeadbeef starts at 32
FILE 0xdeadbeef ends at 40
FILE size of 68 starts at 40
FILE size of 68 ends at 48
Config start at 48
Config end at 112
Writing tokenizer section...
Aligned offset with 16 bytes of padding.
FILE 0xbaddcafe starts at 128
FILE 0xbaddcafe ends at 136
FILE size of 332690 starts at 136
FILE size of 332690 ends at 144
Tokenizer starts at 144
Tokenizer ends at 332834
Writing tensor section...
Aligned offset with 30 bytes of padding.
FILE 0xfacefeed starts at 332864
FILE 0xfacefeed ends at 332872
FILE size of 14483480854 starts at 332872
FILE size of 14483480854 ends at 332880
Tensor data starts at offset: 332880
Writing 291 tensors with a total of 517 shape elements
Processing model part 1
Processing model part 2
Tensor data ends at offset: 14483813718
Aligned offset with 10 bytes of padding.
FILE end marker starts at 14483813728
FILE end marker ends at 14483813732
Model successfully written to models/mistral-1/7B/ggml-model-f16.gguf

And I'm able to read the model in Python and in C.

20:35:42 | /mnt/valerie/forked/ggerganov/llama.cpp
(.venv) git:(alt.cpp | Δ) λ python -m gguf.validator models/mistral-1/7B/ggml-model-f16.gguf
Opened file models/mistral-1/7B/ggml-model-f16.gguf
Reading start marker: 0
Start marker read successfully: 0x67676d6c
Reading section marker: 4
Aligned offset with 28 bytes of padding.
Section marker: 0xdeadbeef, Size: 68
Reading model parameters...
context_length: 8192, hidden_size: 4096, num_hidden_layers: 32, intermediate_size: 14336, num_attention_heads: 32, num_key_value_heads: 8, sliding_window: 4096, rope_theta: 10000.0, rms_norm_eps: 9.999999747378752e-06, head_size: 128, dtype: 1
Reading section marker: 112
Aligned offset with 16 bytes of padding.
Section marker: 0xbaddcafe, Size: 332690
Reading tokenizer...
Vocab size: 32000
Special tokens - BOS ID: 1, EOS ID: 2, PAD ID: -1, UNK ID: 0
Token ID 0: <unk> (length 5)
Token ID 1: <s> (length 3)
Token ID 2: </s> (length 4)
Token ID 3: <0x00> (length 6)
Token ID 4: <0x01> (length 6)
Token ID 31996: 執 (length 3)
Token ID 31997: 벨 (length 3)
Token ID 31998: ゼ (length 3)
Token ID 31999: 梦 (length 3)
...truncated for brevity
Reading section marker: 332834
Aligned offset with 30 bytes of padding.
Section marker: 0xfacefeed, Size: 14483480854
Reading tensor data...
Tensor count: 291, Shape count: 517
Tensor 0: model.embed_tokens.weight, Shape: [32000, 4096], Dtype: 1
Tensor 1: model.layers.0.self_attn.q_proj.weight, Shape: [4096, 4096], Dtype: 1
Tensor 2: model.layers.0.self_attn.k_proj.weight, Shape: [1024, 4096], Dtype: 1
Tensor 3: model.layers.0.self_attn.v_proj.weight, Shape: [1024, 4096], Dtype: 1
Tensor 4: model.layers.0.self_attn.o_proj.weight, Shape: [4096, 4096], Dtype: 1
Tensor 5: model.layers.0.mlp.gate_proj.weight, Shape: [14336, 4096], Dtype: 1
Tensor 6: model.layers.0.mlp.up_proj.weight, Shape: [14336, 4096], Dtype: 1
Tensor 7: model.layers.0.mlp.down_proj.weight, Shape: [4096, 14336], Dtype: 1
Tensor 8: model.layers.0.input_layernorm.weight, Shape: [4096], Dtype: 1
Tensor 9: model.layers.0.post_attention_layernorm.weight, Shape: [4096], Dtype: 1
Tensor 282: model.layers.31.self_attn.v_proj.weight, Shape: [1024, 4096], Dtype: 1
Tensor 283: model.layers.31.self_attn.o_proj.weight, Shape: [4096, 4096], Dtype: 1
Tensor 284: model.layers.31.mlp.gate_proj.weight, Shape: [14336, 4096], Dtype: 1
Tensor 285: model.layers.31.mlp.up_proj.weight, Shape: [14336, 4096], Dtype: 1
Tensor 286: model.layers.31.mlp.down_proj.weight, Shape: [4096, 14336], Dtype: 1
Tensor 287: model.layers.31.input_layernorm.weight, Shape: [4096], Dtype: 1
Tensor 288: model.layers.31.post_attention_layernorm.weight, Shape: [4096], Dtype: 1
Tensor 289: model.norm.weight, Shape: [4096], Dtype: 1
Tensor 290: lm_head.weight, Shape: [32000, 4096], Dtype: 1
...truncated for brevity
Reading end marker: 14483813718
Aligned offset with 10 bytes of padding.
End marker read successfully: 0xfffffff
File closed. Obviously, the 7B is massive, but I think it could theoretically be fine-tuned in 8-bit format, but I expect some issues to pop up once I get there. It's nice because the context window is smaller, 8192 according to the paper. I know my 16GB GPU should be able to handle this, in theory. If I succeed, this should have a positive impact for those of us with tighter budgets and less compute availability. |
Ref:
These examples are no longer working and require too much effort to maintain. Therefore, they need to be removed.
It's always sad to say goodbye, but we need to move on... (let's hope that we can bring it back one day)
NOTE: This PR also contains a small correction for export-lora/README