Replies: 2 comments 1 reply
-
This is my assumption: llama-cpp-python's function calling seems to force the LLM to output JSON and then parse it.
The LLM is dutifully trying to obey that constraint, but while streaming, its intermediate output is valid neither as a function call nor as normal text. What we could do to avoid this is auto-complete the LLM's output (e.g. appending the closing sequence `"\n}` to the example above).
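A minimal sketch of that auto-completion idea — not llama-cpp-python's actual implementation; `autocomplete_json` is a hypothetical helper that appends whatever closing quotes/brackets a truncated JSON fragment still needs to parse:

```python
import json

def autocomplete_json(partial: str) -> str:
    """Append the closing quotes/brackets a truncated JSON fragment needs."""
    stack = []          # closers we still owe, innermost last
    in_string = False
    escaped = False
    for ch in partial:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    return partial + ('"' if in_string else "") + "".join(reversed(stack))

# A truncated tool-call argument string becomes parseable again:
fragment = '{"name": "get_weather", "arguments": {"city": "Par'
print(json.loads(autocomplete_json(fragment)))
# {'name': 'get_weather', 'arguments': {'city': 'Par'}}
```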
-
@samuelint Are you still interested in giving this a go? We ran into this limitation with RAGLite as well. Streaming …
-
I would like to use function calling with `stream=True`, but I get the following error:
`Automatic streaming tool choice is not supported`
From here: llama-cpp-python/llama_cpp/llama_chat_format.py, line 3751 (commit 816d491).
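For context, a minimal reproduction sketch — the model path and tool schema below are placeholders, and I'm assuming the `chatml-function-calling` chat format, which is where that error is raised:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./model.gguf",  # placeholder path to a local GGUF model
    chat_format="chatml-function-calling",
)

# tool_choice="auto" combined with stream=True is the combination that
# raises "Automatic streaming tool choice is not supported".
stream = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    tool_choice="auto",
    stream=True,
)
```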
What needs to be done to support "auto" tool choice with streaming?
Why was it not implemented initially? Why is it not supported? @abetlen
I can try to implement it, but it would be useful to have the history of this feature, so that if it's not feasible I won't waste my time.
Thanks!