gguf-py : fix some metadata name extraction edge cases #8591
Conversation
* convert_lora : use the lora dir for the model card path
* gguf-py : more metadata edge cases fixes
  Multiple finetune versions are now joined together, and the removal of the basename annotation on trailing versions is more robust.
@mofosyne Regarding the use of title case in `gguf-py/gguf/utility.py` (line 61 at 87e397d): I think it looks a bit weird for a few models I've recently tried converting. Should the original case be preserved instead? Do you think I should change this in this PR or another one?

(out of scope, but I still want to comment on this) Also note that the naming convention at https://github.com/ggerganov/ggml/blob/master/docs/gguf.md#gguf-naming-convention should be updated to reflect the format used in #7499. And I don't think we're close to having any model with quadrillions of parameters, so I suggest removing that case from the convention, especially since there is no example of what suffix is normally used for it.

Let me know what you think about this.
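For what it's worth, here's a tiny standalone illustration (the model names are just examples; this is not the actual `utility.py` code) of why blanket title-casing mangles names that contain acronyms or stylized casing:

```python
# str.title() uppercases only the first letter of each "word" and
# lowercases the rest, which destroys acronyms like "LM" or "LLaMA".
names = ["SmolLM-135M-Instruct", "TinyLLaMA-v1", "gpt2-xl"]
for name in names:
    print(f"{name!r} -> {name.title()!r}")
# 'SmolLM-135M-Instruct' -> 'Smollm-135M-Instruct'
# 'TinyLLaMA-v1'         -> 'Tinyllama-V1'
# 'gpt2-xl'              -> 'Gpt2-Xl'
```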
@compilade well if it's an edge case then it's part of the PR, depending on how urgent this fix needs to get in. Is it breaking anyone's workflow currently? If not, then we can take our time making it better. I don't think there is any one right answer here... but we can assume anything read from the kv store is of the correct capitalisation... as for anything heuristically obtained... it's the wild west.
@compilade and yes, the gguf doc is outdated, and from what I understand the consensus is that the code is the main spec, so our implementation supersedes the document. We should update it. Also, I don't think the heuristics approach we have is a spec; it's a nice-to-have feature in the converter Python script. The gguf spec for naming should contain only what we are aiming for people to stick to. As for Quadrillion, I'll suggest keeping it in... I would generally like to be +1 ahead of whatever we can immediately conceptualise in the near future. And the near future is likely Trillion, so +1 would be Quadrillion. (As per names of large numbers on Wikipedia.)
I think a quadrillion can safely be 1000T instead. Personally I think we're quite far from making models with 10 times more parameters than the number of synapses in a human brain.
Totally agree. The heuristics are there to turn non-compliant but commonly used names into compliant names. The spec only needs to deal with the resulting names, not the mess they come from.
Well, people in the past thought that 640K ought to be enough for anyone, and I like to be optimistic enough about the development of humanity that we can create something like that in the future. Not going to fight too hard to keep it, but I'd rather it stay.
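Whichever way the quadrillion question lands, here's a minimal sketch of how a parameter count could be rendered with the suffixes under discussion, topping out at `T` so a quadrillion-scale model comes out as `1000T` (assumed behaviour for illustration, not the actual gguf-py implementation):

```python
def size_label(n_params: int) -> str:
    # "T" is the largest suffix, so 10**15 parameters renders as
    # "1000T" rather than introducing a "Q" suffix.
    for scale, suffix in ((10**12, "T"), (10**9, "B"), (10**6, "M"), (10**3, "K")):
        if n_params >= scale:
            value = f"{n_params / scale:.1f}"
            # Drop a trailing ".0" so round counts read "135M", not "135.0M".
            return value.rstrip("0").rstrip(".") + suffix
    return str(n_params)

print(size_label(135_000_000))    # 135M
print(size_label(7_241_000_000))  # 7.2B
print(size_label(10**15))         # 1000T
```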
Glad we are on the same page... we just gotta remember to do it... So... will merge soon if no objection to this PR pops up.
Oh, noticed there is this TODO.
Not familiar with this, so I'll hold off and let @compilade handle it.
* convert_lora : fix default filename
  The default filename was previously hardcoded.
* convert_hf : Model.fname_out can no longer be None
* gguf-py : do not use title case for naming convention
  Some models use acronyms in lowercase, which can't be title-cased like other words, so it's best to simply use the same case as in the original model name. Note that the size label still has an uppercased suffix to make it distinguishable from the context size of a finetune.
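A hypothetical example of the distinguishability point from that commit message: with an uppercase suffix, a size label like `16M` can't be mistaken for a lowercase context-size annotation like `16k` (the name `MyModel-16k-16M` below is made up for illustration):

```python
import re

# Size labels use uppercase suffixes only, so "16k" (a context length)
# is not matched while "16M" (a parameter count) is.
SIZE_LABEL = re.compile(r"\d+(\.\d+)?[KMBT]")

for part in "MyModel-16k-16M".split("-"):
    kind = "size label" if SIZE_LABEL.fullmatch(part) else "not a size label"
    print(f"{part}: {kind}")
# MyModel: not a size label
# 16k: not a size label
# 16M: size label
```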
I've tested LoRA generation, and it seems fine. There's still some title-casing of the metadata which I didn't remove, but at least it's easy to see that the metadata from the LoRA adapter's model card was correctly used to link to the base models (even if the complete links are truncated in the dump output). I don't think the hparams and the tokenizer are really used in GGUF LoRA adapters though, but that's out of scope for this PR. I consider this merge-ready, and will merge after #8597.
@mofosyne, I recently tried converting https://huggingface.co/HuggingFaceTB/SmolLM-135M-Instruct, and got surprised by the default name chosen for the converted model, which comes from the `_name_or_path` field in `config.json`. That field records whatever the authors happened to load from, not necessarily the published model name:

https://huggingface.co/HuggingFaceTB/SmolLM-135M-Instruct: `"_name_or_path": "HuggingFaceTB/cosmo2-135M-webinst-sc2"`
https://huggingface.co/HuggingFaceTB/SmolLM-135M: `"_name_or_path": "/fsx/elie_bakouch/checkpoints/final-149M/600000"`
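To make the failure mode concrete, here's a rough sketch (a hypothetical helper, not the converter's actual logic) of why `_name_or_path` is a shaky naming source and what a fallback could look like:

```python
import json
from pathlib import Path

def guess_model_name(model_dir: Path) -> str:
    """Illustrative only: prefer _name_or_path, but fall back to the
    directory name when it looks like a local checkpoint path."""
    config = json.loads((model_dir / "config.json").read_text())
    name_or_path = config.get("_name_or_path", "")
    # _name_or_path records what the authors loaded *from* (possibly a
    # local path like "/fsx/.../600000"), not what they published *as*.
    if name_or_path and not name_or_path.startswith(("/", ".")):
        return name_or_path.split("/")[-1]
    return model_dir.name
```

Even the fallback doesn't help for SmolLM-135M-Instruct above, since its `_name_or_path` points at a different repo entirely.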
@compilade urgh, perhaps we should remove `_name_or_path` from the heuristics.

edit: Also, I see people in https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard/discussions/761#669e1119f073a1a9276f7f90 discussing pulling metadata from the model card. (On the other hand... there can be an argument for a metadata.json file... that I haven't thought through yet)
* gguf-py : fix some metadata name extraction edge cases
* convert_lora : use the lora dir for the model card path
* gguf-py : more metadata edge cases fixes
  Multiple finetune versions are now joined together, and the removal of the basename annotation on trailing versions is more robust.
* gguf-py : add more name metadata extraction tests
* convert_lora : fix default filename
  The default filename was previously hardcoded.
* convert_hf : Model.fname_out can no longer be None
* gguf-py : do not use title case for naming convention
  Some models use acronyms in lowercase, which can't be title-cased like other words, so it's best to simply use the same case as in the original model name. Note that the size label still has an uppercased suffix to make it distinguishable from the context size of a finetune.
Yes, using the model card metadata for overrides seems better than a separate metadata.json file. But there are 2 model cards to consider: the original model card, and the model card of the converted model. The latter could, I guess, be taken from the directory where the GGUF model is exported, but some people use a single directory for all their conversions, which might be problematic for some fields.

Something else which could be useful is to offer an option to generate an initial model card for GGUF conversions based on the original model card with some extra metadata deduced from the model, and/or to update the metadata of an existing model card.
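In case it helps the discussion, a minimal sketch of reading override metadata from a model card's YAML front matter (assuming a Hugging Face-style `README.md`; `base_model` is a standard model card field, but this is illustrative, not the converter's actual code):

```python
from pathlib import Path
import yaml  # pyyaml

def read_model_card_metadata(model_dir: Path) -> dict:
    """Parse the YAML front matter at the top of README.md, if any."""
    readme = model_dir / "README.md"
    if not readme.is_file():
        return {}
    text = readme.read_text(encoding="utf-8")
    # Front matter sits between the first two "---" lines.
    parts = text.split("---", 2)
    if text.startswith("---") and len(parts) == 3:
        return yaml.safe_load(parts[1]) or {}
    return {}

# e.g. read_model_card_metadata(Path("SmolLM-135M-Instruct")).get("base_model")
```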
Should fix the problem mentioned in #8579 (comment) by @maziyarpanahi, which was caused by #7499 not pruning the empty parts of a name.
I've also made `convert_lora_to_gguf.py` use the LoRA adapter directory for the model card path.

TODO:
* test `convert_lora_to_gguf.py` and check if the metadata is correct
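For reference, a minimal sketch of the pruning fix mentioned at the top of this description (assumed behaviour for illustration; the real logic lives in gguf-py's naming-convention code): skip empty components before joining, so a name with no finetune or version doesn't end up with stray separators:

```python
from typing import Optional

def join_name_parts(basename: str,
                    size_label: Optional[str] = None,
                    finetune: Optional[str] = None,
                    version: Optional[str] = None) -> str:
    parts = [basename, size_label, finetune, version]
    # Without the filter, a missing part leaves doubled dashes,
    # e.g. "Mistral--Instruct", or a trailing "-".
    return "-".join(p.strip() for p in parts if p and p.strip())

print(join_name_parts("Mistral", "7B", "Instruct", "v0.2"))  # Mistral-7B-Instruct-v0.2
print(join_name_parts("Mistral", None, "Instruct"))          # Mistral-Instruct
```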