
Provide .GGUF files? #7

Closed
flatsiedatsie opened this issue May 9, 2024 · 8 comments

@flatsiedatsie

Would it be possible to provide a full range of GGUF files for these wicked models?

I'm trying to convert the 3B myself, but running into issues.
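
For reference, this is roughly the standard llama.cpp conversion flow I'm attempting (a sketch; paths and file names are illustrative). It fails while llama.cpp support for the Granite architecture is still pending:

```sh
# Standard HF -> GGUF conversion with llama.cpp (paths are illustrative)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt

# Produce an f16 GGUF from a local copy of the 3B checkpoint
python convert-hf-to-gguf.py ../granite-3b-code-instruct \
    --outfile granite-3b-code-instruct-f16.gguf --outtype f16
```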

@mayank31398
Member

@flatsiedatsie Some folks in the community have already started working on this.
Relevant discussion: ggerganov/llama.cpp#7116

@YorkieDev

@flatsiedatsie @mayank31398

I've made a couple of GGUFs here:

https://huggingface.co/YorkieOH10/granite-8b-code-instruct-Q8_0-GGUF
https://huggingface.co/YorkieOH10/granite-8b-code-instruct-Q4_K_M-GGUF
https://huggingface.co/YorkieOH10/granite-34b-code-instruct-Q8_0-GGUF

But they're waiting on support in llama.cpp before they can be used.

@mayank31398
Member

Thanks @YorkieDev
I am not very familiar with GGUF, but what's the difference between Q8_0 and Q4_K_M?
Also, is this the PR that needs to merge? ggerganov/llama.cpp#7116

@mayank31398
Member


Also, not sure if it's hard, but it would be awesome if you could create GGUFs of all the models.

@YorkieDev

@mayank31398 That's correct; once that PR has been merged, the GGUFs will work. I can get some more quants made for them.

From my understanding, it's a quality/size tradeoff: Q8 is heavier to run, but Q4 is lighter and is what the majority of folks use.

For more info on GGUF models: https://huggingface.co/docs/hub/gguf
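
For concreteness, this is roughly how such quants are produced with llama.cpp's quantize tool once a base GGUF exists (a sketch; file names are illustrative, and the bits-per-weight figures are approximate):

```sh
# Quantize an f16 GGUF down to the formats above
# Q8_0: ~8.5 bits per weight, near-lossless but the largest of the quants
./quantize granite-8b-f16.gguf granite-8b-Q8_0.gguf Q8_0

# Q4_K_M: ~4.8 bits per weight, roughly half the size for a small quality hit
./quantize granite-8b-f16.gguf granite-8b-Q4_K_M.gguf Q4_K_M
```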

@mayank31398
Member

Thanks for the explanation.

@flatsiedatsie
Author

Having a wider range of quants would rock.

For example, I would like to use Granite in a browser-based project. In that use case it's optimal when a file is below 2 GB in size, and the Q4 quant you provided is just above that threshold.
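
As a back-of-envelope estimate (assuming approximate bits-per-weight figures), GGUF size is roughly parameters × bits-per-weight / 8:

```sh
# Rough GGUF size: params * bits-per-weight / 8
python3 -c "print(f'{3e9 * 4.85 / 8 / 1e9:.1f} GB')"   # 3B at Q4_K_M: ~1.8 GB
python3 -c "print(f'{8e9 * 4.85 / 8 / 1e9:.1f} GB')"   # 8B at Q4_K_M: ~4.8 GB
```

So the 3B at around Q4 sits right at that threshold.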

@mayank31398
Member
