
server: maintain chat completion id for streaming responses #5880

Conversation

mscheong01 (Collaborator) commented:

Fixes #5876.

Test code (provided by @xyc):

import OpenAI from "openai";

// The key is unused by the local server but required by the client library.
process.env["OPENAI_API_KEY"] = "no-key";

const openai = new OpenAI({
  // Point the client at the llama.cpp server's OpenAI-compatible endpoint.
  baseURL: "http://127.0.0.1:8080/v1",
  apiKey: "no-key",
});

async function main() {
  // Request a streaming chat completion so the response arrives as chunks.
  const stream = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [{ role: "user", content: "Say this is a test" }],
    stream: true,
  });
  // Print each raw chunk; every chunk should carry the same completion id.
  for await (const chunk of stream) {
    process.stdout.write(JSON.stringify(chunk));
  }
}

main();

Result (before): each streamed chunk carries a different completion id (one chunk per line for readability):

{"choices":[{"delta":{"content":"S"},"finish_reason":null,"index":0}],"created":1709628034,"id":"chatcmpl-LMLcrWfJ29GgNvzP7tYhSSC32xARD6p1","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}{"choices":[{"delta":{"content":"ay"},"finish_reason":null,"index":0}],"created":1709628034,"id":"chatcmpl-B5x5NUWQhvUCrh4oPQkigOfFvgdEe6kM","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}{"choices":[{"delta":{"content":" this"},"finish_reason":null,"index":0}],"created":1709628046,"id":"chatcmpl-34ohfUOtEixbtgGVJY9CIkPFZ1BCwC3o","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}{"choices":[{"delta":{"content":" is"},"finish_reason":null,"index":0}],"created":1709628048,"id":"chatcmpl-mk5o2qKT46tvoShPZ2LNr3RjiDupMIGt","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}{"choices":[{"delta":{"content":" a"},"finish_reason":null,"index":0}],"created":1709628049,"id":"chatcmpl-6RXeg13CV7h1jhDDsbTUfoy0nysh67iq","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}{"choices":[{"delta":{"content":" test"},"finish_reason":null,"index":0}],"created":1709628050,"id":"chatcmpl-hnlmLdKAvr0YV0eXUcIfc6uLYDdHkKXV","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}{"choices":[{"delta":{},"finish_reason":"stop","index":0}],"created":1709628073,"id":"chatcmpl-zTcUBKVba00yAd0PylbJJFEcbqJr5RXs","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}

Result (after): every chunk shares the same completion id (again one chunk per line):

{"choices":[{"delta":{"content":"S"},"finish_reason":null,"index":0}],"created":1709627698,"id":"chatcmpl-OvKzfgQykk3jjHM2liQCrwnCwzJWKUAk","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}{"choices":[{"delta":{"content":"ay"},"finish_reason":null,"index":0}],"created":1709627706,"id":"chatcmpl-OvKzfgQykk3jjHM2liQCrwnCwzJWKUAk","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}{"choices":[{"delta":{"content":" this"},"finish_reason":null,"index":0}],"created":1709627707,"id":"chatcmpl-OvKzfgQykk3jjHM2liQCrwnCwzJWKUAk","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}{"choices":[{"delta":{"content":" is"},"finish_reason":null,"index":0}],"created":1709627708,"id":"chatcmpl-OvKzfgQykk3jjHM2liQCrwnCwzJWKUAk","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}{"choices":[{"delta":{"content":" a"},"finish_reason":null,"index":0}],"created":1709627709,"id":"chatcmpl-OvKzfgQykk3jjHM2liQCrwnCwzJWKUAk","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}{"choices":[{"delta":{"content":" test"},"finish_reason":null,"index":0}],"created":1709627710,"id":"chatcmpl-OvKzfgQykk3jjHM2liQCrwnCwzJWKUAk","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}{"choices":[{"delta":{},"finish_reason":"stop","index":0}],"created":1709627727,"id":"chatcmpl-OvKzfgQykk3jjHM2liQCrwnCwzJWKUAk","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}

@ngxson (Collaborator) left a comment:

This PR LGTM. One thing that would be nice, if you can do it:

We currently call gen_chatcmplid in multiple places. It would be nicer to call gen_chatcmplid only once per incoming request, then use the generated id in both format_final_response_oaicompat and format_partial_response_oaicompat.

Just a small detail though; we can do it later. I'm planning to clean up all the functions related to OAI compatibility.
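For illustration, here is a minimal, self-contained C++ sketch of that suggestion; it is not the server's actual code. The gen_chatcmplid body below only mimics the chatcmpl-prefixed 32-character ids visible in the logs above, and the loop stands in for the real per-chunk formatter calls.

// Hypothetical sketch: generate the chat completion id once per request
// and reuse it for every streamed chunk and the final response.
#include <iostream>
#include <random>
#include <string>

// Mimics the id shape seen in the logs: "chatcmpl-" + 32 alphanumerics.
static std::string gen_chatcmplid() {
    static const char alphabet[] =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    std::mt19937 gen{std::random_device{}()};
    std::uniform_int_distribution<size_t> pick(0, sizeof(alphabet) - 2);
    std::string id = "chatcmpl-";
    for (int i = 0; i < 32; ++i) {
        id += alphabet[pick(gen)];
    }
    return id;
}

int main() {
    // One id per incoming request ...
    const std::string completion_id = gen_chatcmplid();
    // ... shared by every chunk instead of being regenerated each time.
    for (int chunk = 0; chunk < 3; ++chunk) {
        std::cout << "chunk " << chunk << " id=" << completion_id << "\n";
    }
    return 0;
}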

@ngxson (Collaborator) left another comment:

This LGTM, thank you. I'll wait for @ggerganov to decide whether this can be merged now or after #5882.

@ggerganov (Owner):

Thanks for taking a look. This will be reimplemented in #5882, so there is no need to merge this PR.

@xyc (Contributor) commented on Mar 6, 2024:

Thank you for the quick turnaround, @mscheong01 @ggerganov .

I don't have much to add: it works fine and generates a consistent chat completion id. The only issue is that when the response is consumed by the OpenAI client (as in the second code block of #5876 (comment)), I still get an error (missing role for choice 0).

Adding {"role", "assistant"}, before these lines seems to fix it.

{"content", content}}}

{"content", content},

@ggerganov (Owner):

@mscheong01 If you can re-apply the changes on top of the latest master, we can merge this. I will close this for now.

Successfully merging this pull request may close these issues:

Consistent chat completion id in OpenAI compatible chat completion endpoint (#5876)