```ts
async function invokeMessage(...) {
  try {
    // custom_response is the streaming response from the llama.cpp API,
    // i.e. the result of `fetch('.../v1/threads/runs', ...)`
    const response = custom_response;
    const reader = response?.body?.getReader();
    const decoder = new TextDecoder();
    let done = false;
    let partialData = "";
    assistantMessageRef.current = "";
    const prevMessages: Message[] = getMessages();

    while (!done) {
      const { value, done: doneReading } = (await reader?.read()) || {};
      done = !!doneReading;
      const chunk = decoder.decode(value);
      const lines = (partialData + chunk)
        .split("\n")
        .filter((line) => line.trim() !== "");
      partialData = "";

      for (const line of lines) {
        if (!line.startsWith("data: ")) {
          continue;
        }
        const json = line.replace("data: ", "");
        if (json === "[DONE]") {
          setError(null);
          setLoading(false);
          onDone?.();
          return;
        }
        partialData += json;
        try {
          const parsedData = JSON.parse(partialData);
          partialData = "";

          if (parsedData?.run_id || parsedData?.thread_id) {
            const runId = parsedData.run_id;
            const threadId = parsedData.thread_id;
            setRunId?.(runId, threadId); // save the run_id and thread_id
          }

          const deltaContent = parsedData?.delta?.content;
          if (deltaContent && deltaContent.length > 0) {
            for (const content of deltaContent) {
              const token = content?.text?.value || "";
              if (!token && !assistantMessageRef.current) {
                continue;
              }
              assistantMessageRef.current += token;

              const lastMessage = prevMessages[prevMessages.length - 1];
              if (lastMessage?.role === "assistant" && !lastMessage?._type) {
                // Update the assistant message that is already streaming.
                lastMessage.content = assistantMessageRef.current;
                setMessages([...prevMessages.slice(0, -1), lastMessage]);
              } else {
                // Otherwise append a new assistant message.
                setMessages([
                  ...prevMessages,
                  { role: "assistant", content: assistantMessageRef.current },
                ]);
              }
            }
          }
        } catch (error: any) {
          if (error instanceof SyntaxError) {
            // The JSON object was split across chunks; keep buffering.
            console.warn("Partial JSON data, waiting for more chunks...");
          } else {
            console.error("Parsing error:", error);
            partialData = "";
          }
        }
      }
    }
  } catch (error: any) {
    if (error.name !== "AbortError") {
      console.error("Error communicating with OpenAI:", error);
      setError(new Error(error).message);
    }
    setLoading(false);
    throw error;
  }
}
```
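For reference, the event shape that this parser expects can be described roughly as follows. This is an assumption reconstructed from the property accesses above, modeled on the OpenAI Assistants streaming delta format; it is not something llama.cpp emits by itself.

```ts
// Assumed shape of each `data:` event consumed by invokeMessage above.
interface ThreadRunEvent {
  run_id?: string;
  thread_id?: string;
  delta?: {
    content: Array<{
      type?: string;
      text?: { value: string };
    }>;
  };
}
```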
Hello. I am currently developing a VS Code extension that interacts with LLMs via API requests to:
https://llama-cpp-python.readthedocs.io/en/latest/server/

I have noticed that there is no `/v1/threads` support in the llama.cpp server. To address this, I have created a simple Express.js server to handle that route, and I would like to share it with you as it is (it's not perfect). Perhaps someone can add this route to `llama-cpp[server]` based on this information. I have not found any information about the `/v1/threads` route.

This code consists of:

- mapping `/v1/threads/runs` requests to `/v1/chat/completions` and vice versa;
- handling the `thread_id` and `run_id` values.
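A minimal sketch of such a proxy is shown below. This is not the original server, just an illustration of the idea under assumptions: Express with Node 18's built-in `fetch`, `LLAMA_BASE_URL` and the ports are placeholders, and the emitted event shape simply mirrors what the client snippet at the top of this post parses.

```ts
import express from "express";
import { randomUUID } from "node:crypto";

const app = express();
app.use(express.json());

// Assumed address of the llama-cpp-python / llama.cpp server.
const LLAMA_BASE_URL = process.env.LLAMA_BASE_URL ?? "http://localhost:8000";

app.post("/v1/threads/runs", async (req, res) => {
  const threadId = `thread_${randomUUID()}`;
  const runId = `run_${randomUUID()}`;

  // Forward the request to the chat completions endpoint.
  const upstream = await fetch(`${LLAMA_BASE_URL}/v1/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: req.body?.model ?? "default",
      messages: req.body?.thread?.messages ?? [],
      stream: true,
    }),
  });

  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");

  // First event carries the ids so the client can store them.
  res.write(`data: ${JSON.stringify({ run_id: runId, thread_id: threadId })}\n\n`);

  const reader = upstream.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // The upstream server streams OpenAI-style `data: {...}` lines.
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";
    for (const line of lines) {
      if (!line.startsWith("data: ")) continue;
      const payload = line.slice("data: ".length).trim();
      if (payload === "[DONE]") {
        res.write("data: [DONE]\n\n");
        res.end();
        return;
      }
      const chunk = JSON.parse(payload);
      const token = chunk?.choices?.[0]?.delta?.content ?? "";
      // Re-emit the token in the delta shape the client snippet expects.
      res.write(
        `data: ${JSON.stringify({
          run_id: runId,
          thread_id: threadId,
          delta: { content: [{ type: "text", text: { value: token } }] },
        })}\n\n`
      );
    }
  }
  res.end();
});

app.listen(3000, () => console.log("threads proxy listening on :3000"));
```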
You can help me with improving it.

On the extension side, I use something like the snippet at the top of this post to communicate with it.
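For illustration, the `custom_response` used in that snippet could be obtained with a plain streaming fetch against the custom route. The base URL, port, and payload fields below are assumptions (OpenAI Assistants-style), not part of llama.cpp's own API.

```ts
const abortController = new AbortController();

// Hypothetical call to the custom /v1/threads/runs route of the proxy server.
const custom_response = await fetch("http://localhost:3000/v1/threads/runs", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    thread: { messages: [{ role: "user", content: "Hello!" }] },
    stream: true,
  }),
  // Lets the AbortError branch in invokeMessage trigger when the user cancels.
  signal: abortController.signal,
});
```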