Refactor: Formalise Keys.General GGUF KV Store #7836
Comments
I think it's not yet clear enough what exactly is being proposed here.
Hopefully this explains my mental model:

```mermaid
flowchart LR
    B1[Base Model 0] -- merged or finetuned --> source
    B2[Base Model 1] -- merged or finetuned --> source
    source[Source] -- converted to gguf --> model[Main Model]
```
```mermaid
flowchart LR
    B1[general.base_model.0.*] -- merged or finetuned --> source
    B2[general.base_model.1.*] -- merged or finetuned --> source
    source[general.source.*] -- converted to gguf --> model[general.*]
```
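The diagram above maps model lineage onto flat KV namespaces. A minimal sketch of what that flattening could look like in Python (the key prefixes follow the proposal in this thread; the `flatten_metadata` helper itself is hypothetical, not part of gguf-py):

```python
def flatten_metadata(main, source=None, base_models=None):
    """Flatten nested model-lineage metadata into GGUF-style flat KV pairs.

    main        -> general.*
    source      -> general.source.*
    base_models -> general.base_model.{n}.*
    """
    kv = {f"general.{k}": v for k, v in main.items()}
    if source:
        kv.update({f"general.source.{k}": v for k, v in source.items()})
    for i, bm in enumerate(base_models or []):
        kv.update({f"general.base_model.{i}.{k}": v for k, v in bm.items()})
    if base_models:
        kv["general.base_model.count"] = len(base_models)
    return kv

kv = flatten_metadata(
    main={"name": "Main Model"},
    source={"url": "https://huggingface.co/org/source-model"},
    base_models=[{"name": "Base Model 0"}, {"name": "Base Model 1"}],
)
```

A flat namespace like this keeps the KV store simple (no nested types needed) while still encoding the merge/finetune lineage.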
So basically that's the mapping. And if not converted, but directly generated into a gguf file:

```mermaid
flowchart LR
    B1[Base Model 0] -- merged or finetuned --> model
    B2[Base Model 1] -- merged or finetuned --> model
    model[Main Model]
```
Or if it's just a straight-up new base model (safetensors etc...) that is converted to gguf:

```mermaid
flowchart LR
    source[Source] -- converted to gguf --> model[Main Model]
```
You think so? I think that while they share the same weights, it's a different digital object (due to possible loss of accuracy during the quant process).
Yeah, my intent is that `url` refers to "homepages, papers" while `repo_url` is primarily for "code, weights".
Yeah, you got the idea why I added the extra two fields: multiple people were using 'other' in place of the license and putting the actual license in the two extra separate fields.

Add support for an integrated model card? Should we bake in support for model cards to be copied verbatim into the model? The issue would be the tendency for model card writers to include external image links etc., so copying the model card markdown content in might not be a good idea. But if we do, then these are the proposed additions.

The difference between our GGUF KV storage and the model card format is that while the KV key-value store is quite rigid, the model card metadata is quite freeform, which has its own advantages (but clashes with the needs of the KV store). So it might be worth copying the model card content in, but keeping it somewhat separate. Below are some of the possible fields we may want to include under:
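As a rough sketch of the "copy it in, but sanitize it" option: strip external image embeds from the markdown before storing the card as a single KV string. Both the key name `general.model_card` and the regex-based sanitizer below are assumptions for illustration, not part of any spec:

```python
import re

def sanitize_model_card(markdown_text):
    """Strip markdown image embeds ![alt](url) so the card does not
    pull in external resources when rendered offline."""
    return re.sub(r"!\[[^\]]*\]\([^)]*\)", "", markdown_text)

card = "# My Model\n![logo](https://example.com/logo.png)\nTraining details here."
# Hypothetical KV entry holding the sanitized card verbatim:
kv = {"general.model_card": sanitize_model_card(card)}
```

This keeps the freeform card content in the file while the structured KV keys stay rigid and machine-readable.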
How this would work with, say, a 'model browser' is that it would allow the user to read the model card in an offline manner; e.g. a file browser would show the model card to the side when the model file is selected.
I'd like to propose inclusion of TLDR:
@Galunid interesting. Could you also suggest a
Also, I thought the GGUF file format structure already has versioning?

*(image: GGUF file structure diagram)*
I'd propose
or something like that, and I'd put them under there. To be clear, these are just some ideas that I feel make sense, but I'm very much open to critique.
As for version, it's true that
An unfortunate effect of including the git commit hash of
I was thinking of adding a flag that could prevent that, or writing a script like
I'd be very interested in this, both non-interactively and interactively, to investigate differences. At first for metadata, and a yes/no answer for tensor data. But then comparisons would also be useful for tensor data, like when comparing quantization (re)implementations. It would be nice to output hex samples of differing data, granulated at the block size of the quant formats. I'm not sure about cross-quantization diffing, though. But no need for all of that at first (since it would add quite a bit of complexity).
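The metadata half of such a diff tool is the easy first step. A minimal sketch, assuming the two files' KV stores have already been read into plain dicts (the `diff_kv` function and its return shape are my own invention):

```python
def diff_kv(a, b):
    """Diff two GGUF metadata KV mappings.

    Returns (keys only in a, keys only in b, keys with differing values),
    which covers the non-interactive metadata comparison case."""
    only_a = sorted(a.keys() - b.keys())
    only_b = sorted(b.keys() - a.keys())
    changed = {k: (a[k], b[k])
               for k in sorted(a.keys() & b.keys()) if a[k] != b[k]}
    return only_a, only_b, changed

only_a, only_b, changed = diff_kv(
    {"general.name": "A", "general.version": "1"},
    {"general.name": "A", "general.version": "2", "general.url": "x"},
)
```

Tensor-data diffing (block-granular hex samples per quant format) would layer on top of this, but as noted above it adds considerable complexity.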
Both being able to set it to a known commit hash and, at least at first, being able to prevent adding the commit hash would be necessary. I've been thinking... if metadata is something that should be easily changeable in GGUF models without rewriting all the tensor data, would it be more convenient if the GGUF metadata KV section were at the end of the file instead? For backward compatibility, instead of changing the GGUF format too deeply, there could even be a normal GGUF metadata key at the beginning (along with some others) which would store an offset for the extended metadata key-value pairs. Or that offset could be calculated from the tensor info section. Might not be worth it, but this would allow easier metadata additions even to existing GGUF models.
@compilade you might be interested that in #7839 I was taking a stab at hashing the weights of the models to generate a UUID. Obviously my approach was an attempt at having a hash that would ignore quantisation, but anyway, I hope that sparks some ideas. What are you trying to compare when diffing models? Differences in metadata? Differences in layers (e.g. a mismatch in layer X)?
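For reference, the simplest form of a weight-derived id hashes the raw tensor bytes in a canonical order. This is only a sketch of that baseline idea, not the quantisation-invariant scheme attempted in #7839 (quantisation changes the bytes, so this id does not survive re-quantisation):

```python
import hashlib
import uuid

def model_uuid(tensors):
    """Derive a deterministic UUID from raw tensor bytes.

    tensors: mapping of tensor name -> bytes. Names are hashed in
    sorted order so the result is independent of insertion order."""
    h = hashlib.sha256()
    for name in sorted(tensors):
        h.update(name.encode("utf-8"))
        h.update(tensors[name])
    # Fold the first 16 digest bytes into a UUID-shaped identifier.
    return uuid.UUID(bytes=h.digest()[:16], version=4)

u1 = model_uuid({"a": b"\x00\x01", "b": b"\x02"})
u2 = model_uuid({"b": b"\x02", "a": b"\x00\x01"})  # same tensors, other order
```

For a diff tool, such an id would let you short-circuit the tensor comparison entirely when the hashes match.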
#7499 merged in. May need to update the gguf documentation and add to the wiki how to use the metadata override feature.
ggml-org/ggml#897: documentation update PR created and waiting to be merged in.
Background Description
During #7499 it turned out that the KV store metadata needs further development.
We need an outline, and a consensus on that outline, for the KV store, in such a way that it is not too closely coupled with HuggingFace and is independent enough to service GGUF use cases. Ideally we should be able to remotely fetch all the details as needed, or use the model card as a fallback.
Below is a stab I had at listing out the keys I thought about, as well as the possible Hugging Face model card keys I could use.
Possible Refactor Approaches
No response