Fix GLM4 alignment issue #2723

guoqingbao · 2025-01-17T10:02:22Z

This PR primarily fixes the generation issue in GLM4: the rope_ratio was missing from the construction of the cosine-sine cache. It also enables the configuration to be loaded from a JSON file and allows weights to be loaded from a local path through a new utility function, hub_load_local_safetensors. Finally, it resolves a compilation issue introduced by the hf-hub crate update in #2691, which now depends on a newer version of the tokio package.
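For readers unfamiliar with the rope_ratio term, here is a minimal standalone sketch of how such a ratio can enter a cosine-sine cache. It is not the code from this PR. It assumes the ratio simply multiplies the conventional RoPE base of 10000, which is my understanding of the official GLM4 reference code. The function name build_cos_sin_cache, the table shapes, and the ratio value 500.0 are illustrative assumptions.

```rust
/// Build cosine and sine tables for rotary position embeddings.
/// Returns (cos, sin), each with shape [seq_len][rotary_dim / 2].
/// Hypothetical helper for illustration only, not the candle implementation.
fn build_cos_sin_cache(
    seq_len: usize,
    rotary_dim: usize,
    rope_ratio: f32,
) -> (Vec<Vec<f32>>, Vec<Vec<f32>>) {
    // Assumption: the ratio scales the conventional base of 10000.
    let base = 10_000f32 * rope_ratio;

    // One inverse frequency per pair of rotary dimensions.
    let inv_freq: Vec<f32> = (0..rotary_dim)
        .step_by(2)
        .map(|i| 1.0 / base.powf(i as f32 / rotary_dim as f32))
        .collect();

    let mut cos = Vec::with_capacity(seq_len);
    let mut sin = Vec::with_capacity(seq_len);
    for pos in 0..seq_len {
        // Rotation angle for this position at each frequency.
        let angles: Vec<f32> = inv_freq.iter().map(|f| pos as f32 * f).collect();
        cos.push(angles.iter().map(|a| a.cos()).collect());
        sin.push(angles.iter().map(|a| a.sin()).collect());
    }
    (cos, sin)
}

fn main() {
    // Ratio 1.0 is equivalent to ignoring rope_ratio entirely.
    let (_, sin_plain) = build_cos_sin_cache(4, 8, 1.0);
    // 500.0 is an illustrative value, not read from any real config.
    let (_, sin_scaled) = build_cos_sin_cache(4, 8, 500.0);

    // Compare the slowest-rotating dimension at position 1.
    println!("sin without ratio: {}", sin_plain[1][3]);
    println!("sin with ratio:    {}", sin_scaled[1][3]);
}
```

Under this assumption, dropping the ratio leaves the first frequency unchanged but makes all other dimensions rotate faster than the model saw during training, which is consistent with output that drifts from the official results.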

Tested case:

cargo run --release --example glm4 --features cuda -- --weight-path /home/weights/glm-4-9b-chat/ --prompt "Please talk about deep learning."

It now generates better answers that align with the official results.

LaurentMazare · 2025-01-20T21:52:07Z

Thanks, nice to have that fixed and in line with the official implementation!

guoqingbao and others added 2 commits January 17, 2025 09:47

Fix GLM4 alignment issue

a1f6c98

Cleanups.

cf35e4c

LaurentMazare merged commit e4c3a71 into huggingface:main Jan 20, 2025
10 checks passed
