Feature Request: GLM-4 9B Support #7778
Comments
You can try chatllm.cpp, which supports GLM-4.
Can confirm this works, and it's cool 😎 It would be good to get this functionality into llama.cpp too, if only for the GPU acceleration.
Well, chatllm.cpp is CPU-only. Why not try the transformers version in fp16? llama.cpp GPU support for GLM-4 would be great, and then quantized versions would appear, which would be even more convenient to run. GLM-4 looks comparable to, or better than, Llama 3, maybe even best-in-class for now.
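For reference, here is a minimal sketch of what the fp16 transformers path suggested above might look like. The model ID THUDM/glm-4-9b-chat is an assumption based on the public GLM-4 release, and the snippet uses only standard transformers APIs:

```python
# Minimal sketch: GLM-4 9B in fp16 via Hugging Face transformers.
# The model ID below is an assumption based on the public release.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "THUDM/glm-4-9b-chat"  # assumed HF repo name
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16, as suggested above
    device_map="auto",          # place layers on available GPU(s)
    trust_remote_code=True,     # GLM-4 ships custom modeling code
)

messages = [{"role": "user", "content": "Hello, who are you?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```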
We might have this feature soon: #8031
This issue was closed because it has been inactive for 14 days since being marked as stale.
Any updates?
I saw it's merged, but does it work with llama-cpp-python, and how do I get the vision stuff working in GGUF?
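In case it helps: llama-cpp-python picks up new architectures when it updates its bundled llama.cpp, so a sufficiently recent release should be able to load a converted GLM-4 GGUF. A minimal sketch, assuming a locally converted file (the path below is hypothetical):

```python
# Hedged sketch: loading a GLM-4 GGUF with llama-cpp-python.
# The model path is hypothetical; GLM-4 support depends on the bundled
# llama.cpp version being new enough.
from llama_cpp import Llama

llm = Llama(
    model_path="glm-4-9b-chat-Q4_K_M.gguf",  # hypothetical local GGUF file
    n_gpu_layers=-1,  # offload all layers to the GPU if built with GPU support
    n_ctx=8192,       # context window to allocate
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello, who are you?"}]
)
print(out["choices"][0]["message"]["content"])
```

I can't speak to the vision side; the merged support discussed here covers the text model, as far as I know.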
Feature Description
It would be really cool to have support for these models, which were released today; they have some very impressive benchmarks. I've also been trying the model out in a Hugging Face Space myself and noticed that it speaks many languages fluently and is knowledgeable on many topics. Thank you for your time.
Here are the download links:
Here is the English README: README_en.md
Motivation
The motivation for this feature is found in some of the model's technical highlights:
Here are some of the results (benchmark figures in the original post): the needle-in-a-haystack challenge and LongBench.
Possible Implementation
We might be able to use some of the code from: #6999.
There is also chatglm.cpp, but it doesn't support GLM-4.