
Local model via llama-cpp-python support #72

Open · luzik opened this issue Jan 3, 2024 · 19 comments

luzik commented Jan 3, 2024

llama.cpp is now the best backend for open-source models, and llama-cpp-python (used as the Python backend for Python-powered GUIs) has built-in OpenAI API support, including function (tool) calling:

https://llama-cpp-python.readthedocs.io/en/latest/server/#function-calling
https://github.com/abetlen/llama-cpp-python#function-calling

There is also Docker support for this tool, so I wanted to get all of these things running together.

I have read #17, but that is mostly about LocalAI. LocalAI uses llama-cpp-python as a backend, so why not take a shortcut and use llama-cpp-python directly?

My docker-compose looks like this (with llama-cpp-python git cloned; if you do not need GPU support, just use the commented #image instead of build:):

version: '3.4'
services:
  llama-cpp-python:
    container_name: llama-cpp-python
    #image: ghcr.io/abetlen/llama-cpp-python:latest
    build: llama-cpp-python/docker/cuda_simple  # docker-compose build --no-cache
    environment:
      #- MODEL=/models/sha256:6ae28029995007a3ee8d0b8556d50f3b59b831074cf19c84de87acf51fb54054
      #- MODEL=/models/openchat_3.5-16k.Q4_K_M.gguf
      #- MODEL=/models/zephyr-7b-beta.Q5_K_M.gguf
      #- MODEL=/models/starling-lm-7b-alpha.Q5_K_M.gguf
      #- MODEL=/models/wizardcoder-python-13b-v1.0.Q4_K_M.gguf
      #- MODEL=/models/deepseek-coder-6.7b-instruct.Q5_K_M.gguf
      #- MODEL=/models/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf
      #- MODEL=/models/phi-2.Q5_K_M.gguf
      - MODEL=/models/functionary-7b-v1.Q4_K_S.gguf
      - USE_MLOCK=0
    ports:
      - 8008:8000
    volumes:
      - ./models:/models
    restart: on-failure:0
    cap_add:
      - SYS_RESOURCE
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities: [gpu]
    command: python3 -m llama_cpp.server --n_gpu_layers 33 --n_ctx 18192 --chat_format functionary
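For anyone reproducing this, the server can be sanity-checked directly against its OpenAI-compatible endpoint before pointing the integration at it. A minimal sketch, assuming the openai v1 Python client and the 8008 port mapping above; the tool definition here is a hypothetical example, not the integration's actual spec:

from openai import OpenAI

# Point the standard OpenAI client at the llama-cpp-python server from the
# compose file above; the API key is unused by the local server.
client = OpenAI(base_url="http://localhost:8008/v1", api_key="not-needed")

# Hypothetical tool definition, only to verify function calling end to end.
tools = [{
    "type": "function",
    "function": {
        "name": "turn_on_light",
        "description": "Turn on a light by entity_id",
        "parameters": {
            "type": "object",
            "properties": {"entity_id": {"type": "string"}},
            "required": ["entity_id"],
        },
    },
}]

response = client.chat.completions.create(
    model="functionary-7b-v1",  # served model name; may differ per setup
    messages=[{"role": "user", "content": "turn on the kitchen light"}],
    tools=tools,
)
# A function-calling-capable model should return a tool call here.
print(response.choices[0].message)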

But I got answers like:

turn on "wyspa" light
Something went wrong: Service light.on not found.
where is paris?
Something went wrong: Service location.navigate not found.

Maybe something is wrong with my prompt?

jekalmin (Owner) commented Jan 3, 2024

turn on "wyspa" light
Something went wrong: Service light.on not found.
where is paris?
Something went wrong: Service location.navigate not found.

I haven't tried llama-cpp-python yet, but the error above happens when the LLM tries to call a service as light.on when it should be light.turn_on.

Since I don't know much about LLMs, I don't have a definitive answer.
My assumption is that the model you are using hasn't been trained on much HA data.
If that's the case, trying a different model might help.

I will try this as well later!

Also, what prompt did you use? Probably the default prompt?

luzik (Author) commented Jan 3, 2024

I think my model doesn't know anything about Home Assistant. Is there a way to provide service names with descriptions in the "tool spec"? For example, for the light domain with a list of its services?

jekalmin (Owner) commented Jan 3, 2024

I think my model doesn't know anything about Home Assistant.

I think so.

Is there a way to provide service names with descriptions in the "tool spec"? For example, for the light domain with a list of its services?

Maybe you can try adding enum values to domain and service:

- spec:
    name: execute_services
    description: Use this function to execute service of devices in Home Assistant.
    parameters:
      type: object
      properties:
        list:
          type: array
          items:
            type: object
            properties:
              domain:
                type: string
                description: The domain of the service
                enum:
                  - light
                  - switch
              service:
                type: string
                description: The service to be called
                enum:
                  - turn_on
                  - turn_off
              service_data:
                type: object
                description: The service data object to indicate what to control.
                properties:
                  entity_id:
                    type: array
                    items:
                      type: string
                      description: The entity_id retrieved from available devices. It must start with domain, followed by dot character.
                required:
                - entity_id
            required:
            - domain
            - service
            - service_data
  function:
    type: native
    name: execute_service

luzik (Author) commented Jan 3, 2024

OK, after a model change and those fixes, I got an HA error:
Something went wrong: function ' execute_services' does not exist. Is this because of the extra space?

My debug shows:

llama-cpp-python    | user:
llama-cpp-python    | </s>turn on kuchnia light</s>
llama-cpp-python    | assistant execute_services:
llama-cpp-python    |
llama-cpp-python    | {
llama-cpp-python    |   "list": [
llama-cpp-python    |     {
llama-cpp-python    |       "domain": "light",
llama-cpp-python    |       "service": "turn_on",
llama-cpp-python    |       "service_data": {
llama-cpp-python    |         "entity_id": "light.kuchnia"
llama-cpp-python    |       }
llama-cpp-python    |     }
llama-cpp-python    |   ]
llama-cpp-python    | }

Maybe we can trim extra characters from function names?

jekalmin (Owner) commented Jan 4, 2024

Something went wrong: function ' execute_services' does not exist. Is this because of the extra space?

I think so.

Maybe we can trim extra characters from function names?

Not without modifying the code.
However, even if it worked, it would not be satisfactory if the model hasn't been trained on HA data.

Since providing enums in the spec is just a workaround, it would lead to problem after problem.
Looking for a model trained on HA data, or a way to fine-tune a model, would be the better approach.

luzik (Author) commented Jan 4, 2024

I've changed the model, and now it doesn't need the enums anymore.

luzik (Author) commented Jan 4, 2024

Maybe I can try to fix this trim issue myself. Can you help me find the right place to start in your code?

jekalmin (Owner) commented Jan 4, 2024

I'm not certain where to put it, but this is the place that compares function names.
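For illustration, a minimal sketch of the kind of trimming that could be applied before that comparison; the message shape follows the OpenAI-style responses logged in this thread, but where exactly this would live in the integration's code is an assumption:

# Hypothetical normalization step before the function-name lookup; the
# message shape matches the responses logged in this thread, but the
# surrounding integration code is an assumption.
def normalize_function_name(raw_name: str) -> str:
    """Strip the stray spaces and colons some local models emit around names."""
    return raw_name.strip(" :")

message = {"function_call": {"name": ": execute_services"}}
function_name = normalize_function_name(message["function_call"]["name"])
assert function_name == "execute_services"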

luzik (Author) commented Jan 4, 2024

Thanks! But I have to dig more... any clues from these logs?

homeassistant             | 2024-01-04 13:16:30.163 INFO (MainThread) [custom_components.extended_openai_conversation] Response {
homeassistant             |   "choices": [
homeassistant             |     {
homeassistant             |       "finish_reason": "tool_calls",
homeassistant             |       "index": 0,
homeassistant             |       "message": {
homeassistant             |         "content": null,
homeassistant             |         "function_call": {
homeassistant             |           "arguments": "{\n  \"list\": [\n    {\n      \"domain\": \"light\",\n      \"service\": \"turn_on\",\n      \"service_data\": {\n        \"entity_id\": \"light.kanapa\"\n      }\n    }\n  ]\n}      \n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
homeassistant             |           "name": ": execute_services"
homeassistant             |         },
homeassistant             |         "role": "assistant",
homeassistant             |         "tool_calls": [
homeassistant             |           {
homeassistant             |             "function": {
homeassistant             |               "arguments": "{\n  \"list\": [\n    {\n      \"domain\": \"light\",\n      \"service\": \"turn_on\",\n      \"service_data\": {\n        \"entity_id\": \"light.kanapa\"\n      }\n    }\n  ]\n}      \n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
homeassistant             |               "name": ": execute_services"
homeassistant             |             },
homeassistant             |             "id": ": execute_services",
homeassistant             |             "type": "function"
homeassistant             |           }
homeassistant             |         ]
homeassistant             |       }
homeassistant             |     }
homeassistant             |   ],
homeassistant             |   "created": 1704370580,
homeassistant             |   "id": "XXXX",
homeassistant             |   "model": "gpt-3.5-turbo",
homeassistant             |   "object": "chat.completion",
homeassistant             |   "usage": {
homeassistant             |     "completion_tokens": 150,
homeassistant             |     "prompt_tokens": 1858,
homeassistant             |     "total_tokens": 2008
homeassistant             |   }
homeassistant             | }
homeassistant             | 2024-01-04 13:16:30.166 ERROR (MainThread) [custom_components.extended_openai_conversation] native function 'execute_services' does not exist
homeassistant             | Traceback (most recent call last):
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 179, in async_process
homeassistant             |     response = await self.query(user_input, messages, exposed_entities, 0)
homeassistant             |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 316, in query
homeassistant             |     message = await self.execute_function_call(
homeassistant             |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 361, in execute_function
homeassistant             |     result = await function_executor.execute(
homeassistant             |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
homeassistant             |   File "/config/custom_components/extended_openai_conversation/helpers.py", line 228, in execute
homeassistant             |     raise NativeNotFound(name)
homeassistant             | custom_components.extended_openai_conversation.exceptions.NativeNotFound: native function 'execute_services' does not exist

jekalmin (Owner) commented Jan 4, 2024

Maybe this is execute_services in your config, when it should be execute_service?

luzik (Author) commented Jan 4, 2024

Yeah, I thought that was an error and changed it to execute_service, thanks!

Now, with the extra

message["function_call"]["name"] = message["function_call"]["name"].strip(' :')

function calling works OK, but after the call there is this response:

homeassistant | 2024-01-04 14:21:24.395 INFO (MainThread) [custom_components.extended_openai_conversation] Prompt for gpt-3.5-turbo: [{'role': 'system', 'content': "You[.....]tion you need."}, {'role': 'user', 'content': 'turn wyspa off'}, {'role': 'function', 'name': 'execute_services', 'content': '[True]'}]

Is this some kind of confirmation? Because it gives:

homeassistant             | Traceback (most recent call last):
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 179, in async_process
homeassistant             |     response = await self.query(user_input, messages, exposed_entities, 0)
homeassistant             |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 316, in query
homeassistant             |     message = await self.execute_function_call(
homeassistant             |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 377, in execute_function
homeassistant             |     return await self.query(user_input, messages, exposed_entities, n_requests)
homeassistant             |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 316, in query
homeassistant             |     message = await self.execute_function_call(
homeassistant             |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 344, in execute_function_call
homeassistant             |     raise FunctionNotFound(message["function_call"]["name"])
homeassistant             | custom_components.extended_openai_conversation.exceptions.FunctionNotFound: function 'none' does not exist

jekalmin (Owner) commented Jan 4, 2024

After the function is called, it makes another request to the LLM to get the response message.
However, it seems that this model tries to call another function named "none", which doesn't exist.

It's probably not aware that the function call succeeded, even though we responded with 'content': '[True]'.
Maybe you can try responding with 'content': '{success: True}' like here; a sketch follows below.
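A minimal sketch of that change, assuming the integration appends the function result to an OpenAI-style messages list; the variable names are placeholders:

import json

# Conversation so far; system/user/assistant turns would precede this.
messages = []
result = [True]  # whatever the executed service call returned

# Report the result back to the model as structured JSON rather than the
# bare Python repr '[True]', so the model can tell the call succeeded.
messages.append({
    "role": "function",
    "name": "execute_services",
    "content": json.dumps({"success": result}),
})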

luzik (Author) commented Jan 5, 2024

This did not help, but:

  1. Why do we need to inform the model about success? (and waste tokens on it)
  2. I reported the function-calling incompatibility with the new model I am using, and I believe it will be fixed soon: functionary-7b-v1 model upgrade abetlen/llama-cpp-python#1061.

jekalmin (Owner) commented Jan 7, 2024

  1. Why do we need to inform the model about success? (and waste tokens on it)

We can't get the response message and the function call at the same time.
We either have to call the API again to get the response message, or give up on having one; see the sketch below.
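A minimal sketch of that two-phase flow, assuming an OpenAI-compatible v1 client against the local server; execute_service() here is a hypothetical stand-in for the integration's real executor:

import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8008/v1", api_key="not-needed")

def execute_service(arguments: dict) -> dict:
    """Hypothetical stand-in for the integration's real service executor."""
    return {"success": True}

messages = [{"role": "user", "content": "turn on the kitchen light"}]

# First request: the model answers with a function call instead of text.
# (Function/tool definitions omitted here for brevity; see the earlier sketch.)
first = client.chat.completions.create(model="functionary-7b-v1",
                                       messages=messages)
call = first.choices[0].message.function_call

if call is not None:
    result = execute_service(json.loads(call.arguments))
    messages.append({"role": "function", "name": call.name,
                     "content": json.dumps(result)})
    # Second request: the model turns the function result into a reply.
    second = client.chat.completions.create(model="functionary-7b-v1",
                                            messages=messages)
    print(second.choices[0].message.content)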

OperKH commented Jan 29, 2024

I'm using LocalAI, and this integration works with the models functionary-7b-v1.4 and luna-ai-llama2-uncensored, but with some models, e.g. mistral-7b-openorca, I get this error:

function 'None' does not exist
Traceback (most recent call last):
  File "/config/custom_components/extended_openai_conversation/__init__.py", line 187, in async_process
    response = await self.query(user_input, messages, exposed_entities, 0)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/extended_openai_conversation/__init__.py", line 312, in query
    message = await self.execute_function_call(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/extended_openai_conversation/__init__.py", line 339, in execute_function_call
    raise FunctionNotFound(function_name)
custom_components.extended_openai_conversation.exceptions.FunctionNotFound: function 'None' does not exist

Is it a model problem or an API problem, and how can it be fixed?

jekalmin (Owner) commented Feb 4, 2024

Maybe you can try dolphin-2.7-mixtral-8x7b, as Anto mentioned.

Since I haven't tried LocalAI much, I need to try those as well.
(I failed to get it to work.)

Anto79-ops commented

@OperKH yes, this model works: https://huggingface.co/TheBloke/dolphin-2.7-mixtral-8x7b-GGUF BUT it cannot perform function services.

Have you tried the functionary v2 model? I cannot get a template for that model to work with LocalAI. Supposedly it handles functions/tools better:

mudler/LocalAI#1641

neowisard commented Feb 6, 2024

I'm using LocalAI, and this integration works with the models functionary-7b-v1.4 and luna-ai-llama2-uncensored, but with some models, e.g. mistral-7b-openorca, I get this error:

@OperKH did you get any of the functions/tools to work? Or was it just chat/answers?

Anto79-ops commented

It does functions only. It does not chat well, if at all, if I remember correctly.
