Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify worker runner #16

Closed
wants to merge 1 commit into from
Closed

Conversation

djjuhasz
Copy link
Contributor

Use temporalsdk_worker.Run() blocking call with a system interrupt listening channel to simplify the workercmd and main (cli) code.

Copy link

codecov bot commented Apr 24, 2024

Codecov Report

Attention: Patch coverage is 0% with 6 lines in your changes are missing coverage. Please review.

Project coverage is 32.27%. Comparing base (ec3de02) to head (d70dd61).
Report is 2 commits behind head on main.

❗ Current head d70dd61 differs from pull request most recent head 36b74d9. Consider uploading reports for the commit 36b74d9 to get more accurate results

Files Patch % Lines
cmd/worker/workercmd/cmd.go 0.00% 5 Missing ⚠️
cmd/worker/main.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #16      +/-   ##
==========================================
+ Coverage   29.31%   32.27%   +2.96%     
==========================================
  Files           6        6              
  Lines         174      158      -16     
==========================================
  Hits           51       51              
+ Misses        118      102      -16     
  Partials        5        5              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Use `temporalsdk_worker.Run()` blocking call with a system interrupt
listening channel to simplify the `workercmd` and `main` (cli) code.

[skip-codecov]
@djjuhasz djjuhasz force-pushed the dev/simplify-worker-runner branch from d70dd61 to 36b74d9 Compare April 24, 2024 19:30
Copy link

@sevein sevein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that the current model is designed to make Run non-blocking to facilitate the addition of future tasks that we will have to execute in parallel, like a health check API, a pprof server, etc (inspired by wtfd).
If Run were to become a blocking function managing multiple tasks we would need to adapt it to synchronize multiple goroutines, e.g. legacy enduro uses github.com/oklog/run specifically for that, but at the end of the day I think we would end up with the same type of complexity just handled elsewhere. IMO the complexity is justified unless you're rather get rid of it until strictly necessary.

@djjuhasz
Copy link
Contributor Author

djjuhasz commented Apr 24, 2024

@sevein the context for this change is I'm trying to get the project test coverage above the sad ~49% mark where it's currently sitting on main (https://app.codecov.io/gh/artefactual-sdps/preprocessing-moma/commit/cc1b1bb85831b1e6ad1d537673aba249adb7797c).

There are two files that are currently at zero coverage, which are dragging the average coverage way down:

  1. internal/version/version.go (22 lines)
  2. cmd/main.go (44 lines)

Between the two of them they contribute 66 of 82 (80%) of the missed lines in the project.

I've managed to add a bit of test coverage to "version.go" in PR #15, but a lot of the magic debug.BuildInfo() VCS data is not testable right now, as far as I can tell (See the comment in "version_test.go").

Which brings me to "cmd/main.go". I did a fair bit of Googling today on how to test a Go main() function, and the general consensus seems to be don't do that, at least not in unit tests. Accepting this community wisdom means we've got 44 lines of code (out of 185 in the project) in "cmd/main.go" that are currently untestable. The same folks that say "don't test main()!" also say "keep main() as small as possible, and do all the work in other parts of the code that you can test".

So this PR was a first attempt at reducing the complexity of main() so we have less untested code. I thought that switching to the blocking worker.Run() was a good option because:

  1. We aren't running any other goroutines right now in main or workercmd. The current code (as I understand it) sets up a context that cancels when as system interrupt is receives, starts workercmd.Run() in a goroutine using that context, and then immediately blocks until the context cancels. This seems like unnecessary complexity in service to creating what is effectively a blocking call of workercmd.Run().
  2. You have told me in the past that it's better to use synchronous code when you can because it's less complicated, usually it's good enough, and that this is becoming more accepted in the Go community. Right now I don't think we need asynchronous code here.
  3. We can't test this code.

I'm open to the fact that we may want to run other goroutines when we start the worker, though in general I don't like the pattern in enduro-sdps of starting multiple services from one entry point. In any case, we aren't starting any other parallel tasks in the worker right now, so why include the extra complexity in the code? If we do want to start other services when we start the worker, I think it would be better to do that in workercmd or somewhere else where we can test the code.

@djjuhasz
Copy link
Contributor Author

djjuhasz commented Apr 24, 2024

@sevein all that said, this PR broke the integration test, so I need to rethink it anyway. 🤔

@sevein
Copy link

sevein commented Apr 25, 2024

Thank you for the added context. Maybe in a larger codebase (we're tracking 174 lines of code today), a 44-line main function wouldn't be such a big deal, but I'm supportive with adopting a new pattern with a thinner main too if we want to go down that route. There are really cool examples out there, from cockroach's 1-line main() to something like tailscaled that seems to deal mostly with arguments and env strings.

I recommend reevaluating the decision to omit passing the context from main() to other functions. Mat Ryer describes it well here. It'll make testing easier, which I think you're about to discover when you try to fix the integration test error reported in CI.
I would even argue that Mat's run() should not use os.Interrupt, that's also fundamental to me, a global that I'd rather pass from main. A lot of people set up the signal handler in main(). acorn does it beatifully via the k8s's SetupSignalHandler. A problem with temporalsdk_worker.InterruptCh() is that it's doing it implicitly which is cool for code examples but I'm not sure that it scales well.

I agree that enduro-sdps needs to be fixed, that I believe was the result of forking old legacy-enduro code. However the claim "we can't test main()" is oudated since go1.20 added support for building coverage-instrumented binaries (go build -cover) and I've seen projects going down that route instead, e.g. authelia. But it seems like it would add more complexity to the build system which I'd rather avoid. It's probably worth the effort for existing large projects where refactoring main() would be a big risk.

@djjuhasz
Copy link
Contributor Author

@sevein thank you for your thoughtful feedback, and for all the great links to main() examples and the Mat Ryer article. I'm going to park this PR for now and work on other things. I would like to come back to this later, but I may end up just closing the PR if I can't justify the time investment.

@djjuhasz
Copy link
Contributor Author

This PR needs a significant rework so I'm going to close it, and come back to it later if time permits.

@djjuhasz djjuhasz closed this Apr 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants