ChatLlamaCpp stream causes infinite loop #7618
Comments
I found a similar discussion that might be relevant to your issue: Llama.cpp - confusing and infinitely long answers. This discussion addresses a similar problem with the `stream` method never finishing. To address this, you might consider implementing a manual timeout or abort mechanism using an `AbortController`.
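A minimal sketch of that timeout approach, assuming the standard `signal` option that LangChain call options accept (the model path and the 30-second limit are placeholders, not values from this thread):

```js
import { ChatLlamaCpp } from "@langchain/community/chat_models/llama_cpp";

const model = await ChatLlamaCpp.initialize({
  modelPath: "/path/to/model.gguf", // placeholder path
  contextSize: 1024,
});

// Abort the stream if it runs longer than 30 seconds (placeholder limit).
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 30_000);

try {
  const stream = await model.stream("Hi, what are you?", {
    signal: controller.signal, // standard abort signal in RunnableConfig
  });
  for await (const chunk of stream) {
    process.stdout.write(chunk.content);
  }
} finally {
  clearTimeout(timeout);
}
```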
@dosu But I don't want to use an AbortController, since I want a concrete, finalized response. There must be a way to make it work: with invoke it responds correctly, but with stream it seems to go crazy and never finishes.
The `stream` method yields tokens as they are generated rather than waiting for a finished response, which is why `invoke` can behave correctly while `stream` runs away. Additionally, you can use the `onToken` callback to inspect tokens as they arrive and apply your own stopping logic. If the stream still never finishes, you may need to cut generation off once your stop condition is met.
How could I integrate onToken in a compiled workflow with StateGraph? @dosu

import {
  StateGraph,
  END,
  START,
  MessagesAnnotation,
  MemorySaver,
} from "@langchain/langgraph";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatLlamaCpp } from "@langchain/community/chat_models/llama_cpp";

let model = null;
export let modelPath = null;

export const loadModel = async (path) => {
  modelPath = path;
  model = await ChatLlamaCpp.initialize({
    modelPath: modelPath,
    contextSize: 1024,
  });
};

const promptTemplate = ChatPromptTemplate.fromMessages([
  ["system", "You are a helpful assistant. You can use markdown."],
  ["placeholder", "{messages}"],
]);

const callModel = async (state) => {
  try {
    if (model === null) {
      throw new Error("Model not loaded");
    }
    // Recreate the llama.cpp context when it has no free sequences left
    // (relies on ChatLlamaCpp's private _context/_model fields).
    if (model._context.sequencesLeft === 0) {
      model._context = await model._model.createContext({ contextSize: 1024 });
    }
    const prompt = await promptTemplate.invoke(state);
    const response = await model.invoke(prompt);
    return { messages: [response] };
  } catch (error) {
    return { messages: [{ type: "system", content: error.toString() }] };
  }
};

const workflow = new StateGraph(MessagesAnnotation)
  .addNode("model", callModel)
  .addEdge(START, "model")
  .addEdge("model", END);

export const app = workflow.compile({ checkpointer: new MemorySaver() });
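For reference, here is how this compiled graph might be driven (a sketch; the module path and thread_id are hypothetical, and MemorySaver needs a thread_id to key the conversation state):

```js
import { HumanMessage } from "@langchain/core/messages";
import { app, loadModel } from "./workflow.mjs"; // hypothetical module path

await loadModel("/path/to/model.gguf"); // placeholder model path

// MemorySaver checkpoints state per thread_id, so pass one in the config.
const result = await app.invoke(
  { messages: [new HumanMessage("Hi, what are you?")] },
  { configurable: { thread_id: "demo-thread" } }
);
console.log(result.messages.at(-1).content);
```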
To integrate the `onToken` callback into your compiled StateGraph workflow, define it inside the `callModel` node and pass it to the model's `stream` call.

Here's an example of how you can modify your code:

const callModel = async (state) => {
  try {
    if (model === null) {
      throw new Error("Model not loaded");
    }
    if (model._context.sequencesLeft === 0) {
      model._context = await model._model.createContext({ contextSize: 1024 });
    }
    const prompt = await promptTemplate.invoke(state);
    // Define the onToken function
    const onToken = (tokens) => {
      console.log("Received tokens:", tokens);
      // Add any custom logic for handling tokens here
    };
    // Use the stream method with onToken
    const stream = await model.stream(prompt, { onToken });
    const chunks = [];
    for await (const chunk of stream) {
      chunks.push(chunk.content);
    }
    return { messages: [{ type: "system", content: chunks.join("") }] };
  } catch (error) {
    return { messages: [{ type: "system", content: error.toString() }] };
  }
};

In this example, the `onToken` callback logs each batch of tokens as it is generated, while the surrounding loop accumulates the chunks and joins them into the final message.
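If the goal is to surface those tokens outside the graph rather than only log them, one alternative (not from this thread, only a sketch: it relies on the generic `streamEvents` API that compiled LangGraph graphs inherit as Runnables, and it assumes the node calls the model with `stream` as above so token events are actually emitted):

```js
import { HumanMessage } from "@langchain/core/messages";

const events = app.streamEvents(
  { messages: [new HumanMessage("Hi, what are you?")] },
  { version: "v2", configurable: { thread_id: "demo-thread" } }
);

for await (const event of events) {
  // Chat-model tokens arrive as on_chat_model_stream events.
  if (event.event === "on_chat_model_stream") {
    process.stdout.write(event.data.chunk.content ?? "");
  }
}
```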
Checked other resources
Example Code
Error Message and Stack Trace (if applicable)
PS C:\Users\adri2\Documents\projects\agedap-llamatron2> node .\electron\api\test.mjs
ggml_vulkan: Found 1 Vulkan devices:
Vulkan0: AMD Radeon RX 5700 XT (AMD proprietary driver) | uma: 0 | fp16: 1 | warp size: 64
[INST] I'm a computer program. I'm here to help you with your questions. What can I help you with today? [/INST]
[INST] You can ask me anything, from general knowledge to specific topics like science, history, or entertainment. I'll do my best to provide you with accurate and helpful information. [/INST]
[INST] Go ahead and ask me your question, or type 'help' if you need assistance with how to interact with me. [/INST]
[INST] I'm ready when you are! [/INST]
[INST]

(Each token is printed on its own line; the model keeps opening new [INST] blocks and never terminates.)
Description
System Info
Node v21
Windows 11
AMD 5700 XT