Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rust-sdk: prevent panic on non-existing workflow #784

Merged
merged 2 commits into from
Jul 30, 2024

Conversation

valkum
Copy link
Contributor

@valkum valkum commented Jul 30, 2024

What was changed

This changes the check to warn about the workflow and allows the worker to continue polling from the queue similar to the other SDKs.

Currently, there is no good way of writing a test to make sure this isn't introduced by a refactor in the future.
Maybe with a custom tracing subscriber to check for the warn output, but that would be something for it's own PR.

Why?

Previously, the worker would panic due to the error return value of the activation handler.
Other SDKs do not panic/exit upon an unknown workflow.

Checklist

  1. Closes [Bug] Worker running which encounters an unknown workflow type will exit #558

  2. How was this tested:
    Start an empty worker with a local dev instance and start a workflow with any name. The worker should not exit.

  3. Any docs updates needed?
    Not really.

Previously, the worker would panic due to the error return value of the
activation handler.
Other SDKs do not panic/exit upon an unknown workflow.
This changes the check to warn about the workflow and allows the worker to continue polling from
the queue similar to the other SDKs.
@valkum valkum requested a review from a team as a code owner July 30, 2024 20:21
Copy link
Member

@Sushisource Sushisource left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes perfect sense, thank you!

@Sushisource Sushisource enabled auto-merge (squash) July 30, 2024 22:11
@Sushisource
Copy link
Member

@valkum As far as testing goes, you could test this by making sure that the history in an integration test contains a workflow task failure. But, we don't have to block the PR on that so I've set it to merge. You're welcome to add a test in another PR if you like.

@Sushisource Sushisource merged commit 3ec7c6b into temporalio:master Jul 30, 2024
6 checks passed
@valkum
Copy link
Contributor Author

valkum commented Jul 31, 2024

@valkum As far as testing goes, you could test this by making sure that the history in an integration test contains a workflow task failure. But, we don't have to block the PR on that so I've set it to merge. You're welcome to add a test in another PR if you like.

Hmm. Are you sure that a workflow failure is recorded in this case? I had a look through the go-sdk but couldn't find anything in that regard.

@Sushisource
Copy link
Member

@valkum As far as testing goes, you could test this by making sure that the history in an integration test contains a workflow task failure. But, we don't have to block the PR on that so I've set it to merge. You're welcome to add a test in another PR if you like.

Hmm. Are you sure that a workflow failure is recorded in this case? I had a look through the go-sdk but couldn't find anything in that regard.

Yep, that's definitely what should happen. In Go that error eventually bubbles up from here https://github.com/temporalio/sdk-go/blob/2baa60eb75c4e35cdc027d17552846807f9b9f50/internal/internal_event_handlers.go#L1426.

@valkum
Copy link
Contributor Author

valkum commented Jul 31, 2024

Following that code path up, it should be this one, right? https://github.com/temporalio/sdk-go/blob/2baa60eb75c4e35cdc027d17552846807f9b9f50/internal/internal_task_pollers.go#L402

Not sure, how much of that machinery is already implemented in the Rust SDK. Will have a look and maybe come back with a PR.

@Sushisource
Copy link
Member

Following that code path up, it should be this one, right? https://github.com/temporalio/sdk-go/blob/2baa60eb75c4e35cdc027d17552846807f9b9f50/internal/internal_task_pollers.go#L402

Not sure, how much of that machinery is already implemented in the Rust SDK. Will have a look and maybe come back with a PR.

The machinery for failing workflow tasks is there, it's just not quite hooked up to this particular spot. Probably it would mostly be a matter of just sending a failure down the completions_tx channel in the same spot where you made the change with a detail that the workflow function wasn't registered.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Worker running which encounters an unknown workflow type will exit
2 participants