
glm-4-9b-chat quantized bin model (q8_0 precision) fails to trigger function calling #345

Open
rustjiao opened this issue Sep 10, 2024 · 1 comment
rustjiao commented Sep 10, 2024

>First, quantize the glm3 and glm4 models. I downloaded the full glm-3-6b-chat and glm-4-9b-chat models and quantized both (at q8_0 precision):
chatglm.cpp# python3 chatglm_cpp/convert.py -i /glm-3-6b-chat/ -t q8_0 -o models/chatglm3-q8_0-ggml.bin
chatglm.cpp# python3 chatglm_cpp/convert.py -i /glm-4-9b-chat/ -t q8_0 -o models/chatglm4-q8_0-ggml.bin

>Then I tested function calling with each quantized model, following the author's official cli_demo.py. First, the glm3 quantized model:
/chatglm.cpp/examples# python3 cli_demo.py -m /chatglm.cpp/models/chatglm3-q8_0-ggml.bin --temp 0.1 --top_p 0.8 --sp system/function_call.txt -i

>At the Prompt I asked for today's weather in Suzhou, and a tool_call was printed, showing that function calling works:
Prompt   > 请帮我查询一下苏州今天的天气
ChatGLM3 > get_weather
tool_call(city_name='苏州')

>I added code to print the messages above line 120 of cli_demo.py:
    prompt_image = image
    while True:
        print("--------------------------------------------------------------")
        print(messages)
        if messages and messages[-1].tool_calls:
            (tool_call,) = messages[-1].tool_calls
            if tool_call.type == "function":
                print(
                    f"Function Call > Please manually call function `{tool_call.function.name}` and provide the results below."
                )
                input_prompt = "Observation   > "
            elif tool_call.type == "code":
                print(f"Code Interpreter > Please manually run the code and provide the results below.")
                input_prompt = "Observation      > "
            else:
                raise ValueError(f"unexpected tool call type {tool_call.type}")
            role = "observation"
        else:
            input_prompt = f"{'Prompt':{prompt_width}} > "
            role = "user"

>At runtime, the printed messages show that tool_calls in the role="assistant" message correctly invokes the get_weather function:
[ChatMessage(role="system", content="Answer the following questions as best as you can. You have access to the following tools:
{
    \"random_number_generator\": {
        \"name\": \"random_number_generator\",
        \"description\": \"Generates a random number x, s.t. range[0] <= x < range[1]\",
        \"params\": [
            {
                \"name\": \"seed\",
                \"description\": \"The random seed used by the generator\",
                \"type\": \"int\",
                \"required\": true
            },
            {
                \"name\": \"range\",
                \"description\": \"The range of the generated numbers\",
                \"type\": \"tuple[int, int]\",
                \"required\": true
            }
        ]
    },
    \"get_weather\": {
        \"name\": \"get_weather\",
        \"description\": \"Get the current weather for `city_name`\",
        \"params\": [
            {
                \"name\": \"city_name\",
                \"description\": \"The name of the city to be queried\",
                \"type\": \"str\",
                \"required\": true
            }
        ]
    }
}", tool_calls=[]), ChatMessage(role="user", content="请帮我查询苏州今天的天气", tool_calls=[]), ChatMessage(role="assistant", content="```python
tool_call(city_name='苏州')
```", tool_calls=[ToolCallMessage(type="function", function=FunctionMessage(name="get_weather", arguments="tool_call(city_name='苏州')"), code=CodeMessage(input=""))])]
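For reference, ChatGLM3 emits the call as a Python-style expression (`tool_call(city_name='苏州')`), which can be parsed with the standard `ast` module. The helper below is only an illustrative sketch of that format, not chatglm.cpp's actual parser:

```python
import ast

def parse_python_style_call(expr: str):
    """Parse an expression like tool_call(city_name='苏州')
    into (function_name, keyword_arguments)."""
    call = ast.parse(expr.strip(), mode="eval").body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError(f"not a simple call: {expr!r}")
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return call.func.id, kwargs

name, kwargs = parse_python_style_call("tool_call(city_name='苏州')")
print(name, kwargs)  # tool_call {'city_name': '苏州'}
```

Because the ChatGLM3 output is a valid Python expression, a parser like this can extract the call deterministically, which is presumably why chatglm.cpp fills in tool_calls for the glm3 model.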

>Next, the same command tests function calling with the glm4 quantized model:
chatglm.cpp/examples# python3 cli_demo.py -m /chatglm.cpp/models/chatglm4-q8_0-ggml.bin --temp 0.1 --top_p 0.8 --sp system/function_call.txt

>Entering the same Suzhou weather query at the Prompt prints:
Prompt   > 请帮我查询一下今天苏州的天气
ChatGLM4 > get_weather
{"city_name": "苏州"}

As you can see, there is no tool_call field, which means function calling was not triggered.

>Printing the messages shows the following:
[ChatMessage(role="system", content="Answer the following questions as best as you can. You have access to the following tools:
{
    \"random_number_generator\": {
        \"name\": \"random_number_generator\",
        \"description\": \"Generates a random number x, s.t. range[0] <= x < range[1]\",
        \"params\": [
            {
                \"name\": \"seed\",
                \"description\": \"The random seed used by the generator\",
                \"type\": \"int\",
                \"required\": true
            },
            {
                \"name\": \"range\",
                \"description\": \"The range of the generated numbers\",
                \"type\": \"tuple[int, int]\",
                \"required\": true
            }
        ]
    },
    \"get_weather\": {
        \"name\": \"get_weather\",
        \"description\": \"Get the current weather for `city_name`\",
        \"params\": [
            {
                \"name\": \"city_name\",
                \"description\": \"The name of the city to be queried\",
                \"type\": \"str\",
                \"required\": true
            }
        ]
    }
}", tool_calls=[]), ChatMessage(role="user", content="请帮我查询一下今天苏州的天气", tool_calls=[]), ChatMessage(role="assistant", content="get_weather
{\"city_name\": \"苏州\"}", tool_calls=[])]

>You can see that tool_calls in the role="assistant" message is empty, meaning the function call failed to register, even though the content itself looks correct.
>I don't know whether this is an oversight by the chatglm.cpp author or something else. I also tried quantizing the glm-4-9b-chat-1m model and the glm-4-9b base model, and function calling fails in the same way.
>Is there any way to fix this?
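Until this is fixed upstream, one possible workaround (a hypothetical helper, not part of chatglm.cpp's API) is to parse the assistant content yourself. As the dump above shows, GLM4 emits the function name on the first line and a JSON object of arguments on the rest:

```python
import json

def parse_glm4_tool_call(content: str):
    """Heuristically parse GLM4-style output: a function name on the
    first line, a JSON object with the arguments on the rest.
    Returns (name, args), or None if the content is not a tool call."""
    head, _, rest = content.strip().partition("\n")
    name, rest = head.strip(), rest.strip()
    if not name.isidentifier() or not rest.startswith("{"):
        return None
    try:
        args = json.loads(rest)
    except json.JSONDecodeError:
        return None
    return name, args

print(parse_glm4_tool_call('get_weather\n{"city_name": "苏州"}'))
# ('get_weather', {'city_name': '苏州'})
```

Ordinary replies that do not start with an identifier followed by a JSON object fall through and return None, so the heuristic should not misfire on normal chat content.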
@zutuanwang

I ran into the same problem: after q4_0/q8_0 quantization of glm-4-9b-chat, the returned tool_calls field is empty. Could anyone take a look at how to fix this?

Start the OpenAI API server:
cd examples
MODEL=../models/glm4-9b-ggml-q80.bin uvicorn chatglm_cpp.openai_api:app --host 127.0.0.1 --port 8000

Send a request with openai_client:
python3 examples/openai_client.py --base_url http://127.0.0.1:8000/v1 --tool_call --prompt 上海天气怎么样

The returned tool_calls is empty:

ChatCompletion(id='chatcmpl', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='get_current_weather\n{"location": "Shanghai, China", "unit": "celsius"}', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None))], created=1731482165, model='default-model', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=22, prompt_tokens=166, total_tokens=188, completion_tokens_details=None, prompt_tokens_details=None))
get_current_weather
{"location": "Shanghai, China", "unit": "celsius"}
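On the client side, the same heuristic can recover a tool call from message.content when tool_calls comes back as None. The recover_tool_call helper below is a hypothetical sketch that rebuilds an OpenAI-style tool_call dict from the plain-text output shown above:

```python
import json

def recover_tool_call(message_content: str):
    """If tool_calls is None but the content looks like
    '<function_name>\\n<json args>', rebuild an OpenAI-style
    tool_call dict; otherwise return None. Hypothetical helper,
    not part of the openai client or chatglm.cpp."""
    name, _, args_text = message_content.strip().partition("\n")
    name, args_text = name.strip(), args_text.strip()
    if not name.isidentifier():
        return None
    try:
        json.loads(args_text)  # validate the argument payload
    except json.JSONDecodeError:
        return None
    return {
        "type": "function",
        "function": {"name": name, "arguments": args_text},
    }

content = 'get_current_weather\n{"location": "Shanghai, China", "unit": "celsius"}'
print(recover_tool_call(content))
```

This only papers over the missing server-side parsing; the proper fix would be for chatglm.cpp's openai_api to recognize the GLM4 output format and populate tool_calls itself.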
