Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Exit on communication failure #429

Merged
merged 5 commits into from
May 10, 2024

Conversation

jnumainville
Copy link
Collaborator

@jnumainville jnumainville commented Apr 19, 2024

fixes #430

Often when rerunning component creation, an error containing "Stream is already completed, no further calls are allowed" comes up

Error rendering __main__.test_component
Traceback (most recent call last):
  File "/Users/josephnumainville/Documents/deephaven-core/.venv/lib/python3.10/site-packages/deephaven/ui/object_types/ElementMessageStream.py", line 159, in _render
    self._send_document_update(node, state)
  File "/Users/josephnumainville/Documents/deephaven-core/.venv/lib/python3.10/site-packages/deephaven/ui/object_types/ElementMessageStream.py", line 355, in _send_document_update
    self._connection.on_data(payload.encode(), new_objects)
  File "/Users/josephnumainville/Documents/deephaven-core/.venv/lib/python3.10/site-packages/deephaven_internal/plugin/object/__init__.py", line 47, in on_data
    self._wrapped.onData(payload, [javaify(ref) for ref in references])
RuntimeError: io.deephaven.plugin.type.ObjectCommunicationException: java.lang.IllegalStateException: Stream is already completed, no further calls are allowed
...

This doesn't need to raise an exception to the user as this is just the previous component's message stream signaling it is closed, but it does need to exit so it doesn't keep trying to hit the message stream.

@jnumainville jnumainville requested a review from mofojed April 19, 2024 19:50
@jnumainville jnumainville self-assigned this Apr 19, 2024
Comment on lines 360 to 362
if "io.deephaven.plugin.type.ObjectCommunicationException" in str(e):
# can no longer send, don't need to raise exception but do need to exit
sys.exit()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be exiting without at least logging why we're exiting here.
Also I don't like doing a str(e) comparison, is there any more definitive way of checking the exception type?
Include a ticket with steps to reproduce the issue as well so it's clear which scenario is being fixed here. The description in the PR is somewhat vague in the steps.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a much better fix now. This is happening after on_close from the MessageStream is called, so I'm now flagging it there. There is still a possibility of throwing if the message stream closes after that check though. Not sure how likely it is. I could add the ugly try catch back (and check against the is_closed variable) too. As far as I can tell there is no better way to catch and check for the specific exception - the error message itself originates in Java and is just put into a generic RuntimeError.
Now that I've refined this, should we be logging? I know we don't want to hide errors, but this is a case where the component should already be gone and not sending anything and a log would just pollute the console.

Copy link
Member

@mofojed mofojed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, we don't want to exit all of Python in this scenario? Just clean up this widget, no?

Comment on lines 363 to 364
if self._is_closed:
sys.exit()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear - this is just a safeguard for a race condition if the connection closes while it's rendering, yeah? In which case add a comment to that effect. Do we need the sys.exit() and does that just exit this thread or ?

Copy link
Collaborator Author

@jnumainville jnumainville Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sys.exit() closes this thread only. The sys.exit() (or at least something that closes this thread) is necessary because as of right now, the thrown exception is what shuts down this thread. If it isn't caught, the thread will keep trying to send updates. Really, sys.exit() just throws an error itself, but a silent one.

Copy link
Member

@mofojed mofojed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of questions, suggestion to release liveness_scope

@jnumainville
Copy link
Collaborator Author

To add some more context after digging deep into this, it's a bit broader than I originally noted.
This is happening any time any object is "removed" from the user's control, such as by calling a ui object with the same name (as with my example above) or when closing the panel.

The reason these specific renders are still happening are because the use_effect is never fully cleaned up (the table is changing in my example but that is besides the point, it restarts the use_effect anyways).

We could add specific use_effect cleanup, but I think that's too specific a fix. When the connection is closed, everything needs to be cleaned up, and I'm not sure of a more elegant way to do it. If we just cleaned up the use_effect calls, the objects still seem to linger and never fully go away.

@jnumainville jnumainville merged commit 0e96ef4 into deephaven:main May 10, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error message when rerunning some components
2 participants