Hi @everyone, hope you are all doing well.
I am using this with Flask-SocketIO, and the code is:
from flask_socketio import emit

# socketio (the SocketIO instance), prompt_template and lcpp_llm (the
# llama-cpp-python Llama object) are created elsewhere in the app
@socketio.on('chat_message')
def handle_message(data):
    system = data.get('system', '')
    user_prompt = data.get('prompt', '')
    print(user_prompt)
    user_id = data.get("room", "")
    if system != "":
        prompt = prompt_template(system=system, prompt=user_prompt)
    else:
        prompt = prompt_template(prompt=user_prompt)
    stream = data.get('stream', False)
    print("stream==", stream)
    # cap max_tokens at 4096
    max_token = min(data.get('max_token', 4096), 4096)
    try:
        response = lcpp_llm(
            prompt=prompt,
            max_tokens=max_token,
            temperature=0.5,
            top_p=0.95,
            repeat_penalty=1.2,
            top_k=50,
            echo=False,
            stream=stream)  # stream arrives as True here
        if not stream:
            # non-streaming: the whole completion comes back as one dict
            emit('response', response["choices"][0]["text"], broadcast=True)
        else:
            # streaming: response is a generator that yields one chunk at a time
            for i in response:
                chunk = i["choices"][0]["text"]
                print("==>", chunk)
                emit('response', chunk, room=user_id)
    except Exception as e:
        error_message = f"error: {str(e)}"
        emit('response', error_message, room=user_id)
The problem: I can see the streamed response chunks being printed on the server, but they are not emitted to the client until the whole response has completed. Why?
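For reference, the front end is basically just a listener on the 'response' event. A minimal python-socketio test client that shows what I mean (a hypothetical sketch, not my real front end; it assumes the connection has already been put into the room used below by a separate join handler on the server, which I have left out):

import socketio  # pip install "python-socketio[client]"

sio = socketio.Client()

@sio.on('response')
def on_response(chunk):
    # with real streaming I would expect this to fire once per chunk,
    # not once at the end with everything buffered up
    print("client got:", repr(chunk))

sio.connect('http://localhost:5000')  # hypothetical local server URL
sio.emit('chat_message', {
    'system': '',
    'prompt': 'write two line of best quote',
    'room': 'some-room-id',  # hypothetical room id
    'stream': True,
    'max_token': 4096,
})
sio.wait()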
Terminal output:
write two line of best quote
stream== True
Llama.generate: prefix-match hit
==> Sure
==> !
==> Here
==> are
==> two
==> lines
==> of
==> insp
==> iring
==> quotes
==> for
==> you
==> :
==>
==>
==> "
==> The
==> future
==> belongs
==> to
==> those
==> who
==> believe
==> in
==> the
==> beauty
==> of
==> their
==> dream
==> s
==> ."
==> -
==> Ele
==> an
==> or
==> Ro
==> ose
==> vel
==> t
==>
==>
==> "
==> Bel
==> ieve
==> you
==> can
==> and
==> you
==> '
==> re
==> half
==> way
==> there
==> ."
==> -
==> The
==> odore
==> Ro
==> ose
==> vel
llama_print_timings: load time = 907.76 ms
llama_print_timings: sample time = 35.10 ms / 62 runs ( 0.57 ms per token, 1766.38 tokens per second)
llama_print_timings: prompt eval time = 546.49 ms / 14 tokens ( 39.03 ms per token, 25.62 tokens per second)
llama_print_timings: eval time = 3314.18 ms / 61 runs ( 54.33 ms per token, 18.41 tokens per second)
llama_print_timings: total time = 4036.60 ms
Only after all of this has printed do I get the data on the front end, all at once. Why doesn't it behave like the ChatGPT API does?
I use the same kind of for loop with the ChatGPT API and it emits each chunk in real time, but with this one I get the problem described above.
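This is roughly the loop I mean for the ChatGPT case (a from-memory sketch using the pre-1.0 openai client, so it may not match my real code exactly):

import openai

def stream_openai(user_prompt, user_id):
    # with stream=True the old ChatCompletion API also returns a generator of chunks
    for chunk in openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": user_prompt}],
            stream=True):
        delta = chunk["choices"][0]["delta"].get("content", "")
        if delta:
            emit('response', delta, room=user_id)  # these do reach the client one by one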
Any help would be appreciated.