-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix streaming handling for builtin assistants #462
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems sensible to me.
Nitpick: Could you add a unit test to prevent this regression in the future?
I thought about it, but I honestly have trouble coming up with one that is not unreasonably complex. We need a server that we can hit for HTTP streaming. And the best option we have here is to start our own. I'll see if we can have this in a reasonable way. |
@@ -11,12 +10,7 @@ | |||
from ragna._utils import timeout_after | |||
from ragna.deploy import Config | |||
from tests.deploy.utils import TestAssistant | |||
|
|||
|
|||
def get_available_port(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved it to the generic test utils as this is no longer just needed for deploy.
|
||
|
||
@contextlib.contextmanager | ||
def background_subprocess(*args, stdout=sys.stdout, stderr=sys.stdout, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had this before, but it was removed in #322. Probably can also be used by the UI tests, but we can do that in a follow-up.
This reverts commit 0b9211e.
@pytest.mark.parametrize("streaming_protocol", list(HttpStreamingProtocol)) | ||
def test_http_streaming_termination(streaming_server, streaming_protocol): | ||
# Non-regression test for https://github.com/Quansight/ragna/pull/462 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My patch in #425 is broken when using the Python API with
asyncio.run
, i.e. the regular use case. As implemented in #425,break
ing out of the loop in.answer()
means that the async iterator is not fully exhausted, i.e.AsyncStopIteration
is never raised. This leads to the context manager that handles the HTTP request staying open longer thanChat.answer
. When it is cleaned up eventually, internally it callsawait request.aclose()
. However, if at that point the async event loop is already killed, e.g. because the coroutine called byasyncio.run
is finished, you'll get an error:To fix this, this PR introduces the same pattern for our streaming handling that
httpx
andhttpx_sse
use as well: a context manager creates the stream and the stream can than be iterated on.ragna/ragna/assistants/_http_api.py
Lines 83 to 89 in 7c5728a
ragna/ragna/assistants/_http_api.py
Lines 99 to 103 in 7c5728a
Basically, we switch from
to
With this pattern, the cleanup is correctly handled when we exit
.answer()
and thus get rid of the error.