Simplify worker runner #16

djjuhasz · 2024-04-24T19:19:57Z

Use temporalsdk_worker.Run() blocking call with a system interrupt listening channel to simplify the workercmd and main (cli) code.

codecov · 2024-04-24T19:22:11Z

Codecov Report

Attention: Patch coverage is 0% with 6 lines in your changes are missing coverage. Please review.

Project coverage is 32.27%. Comparing base (ec3de02) to head (d70dd61).
Report is 2 commits behind head on main.

❗ Current head d70dd61 differs from pull request most recent head 36b74d9. Consider uploading reports for the commit 36b74d9 to get more accurate results

Files	Patch %	Lines
cmd/worker/workercmd/cmd.go	0.00%	5 Missing ⚠️
cmd/worker/main.go	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #16      +/-   ##
==========================================
+ Coverage   29.31%   32.27%   +2.96%     
==========================================
  Files           6        6              
  Lines         174      158      -16     
==========================================
  Hits           51       51              
+ Misses        118      102      -16     
  Partials        5        5

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Use `temporalsdk_worker.Run()` blocking call with a system interrupt listening channel to simplify the `workercmd` and `main` (cli) code. [skip-codecov]

sevein

Please note that the current model is designed to make Run non-blocking to facilitate the addition of future tasks that we will have to execute in parallel, like a health check API, a pprof server, etc (inspired by wtfd).
If Run were to become a blocking function managing multiple tasks we would need to adapt it to synchronize multiple goroutines, e.g. legacy enduro uses github.com/oklog/run specifically for that, but at the end of the day I think we would end up with the same type of complexity just handled elsewhere. IMO the complexity is justified unless you're rather get rid of it until strictly necessary.

djjuhasz · 2024-04-24T22:33:31Z

@sevein the context for this change is I'm trying to get the project test coverage above the sad ~49% mark where it's currently sitting on main (https://app.codecov.io/gh/artefactual-sdps/preprocessing-moma/commit/cc1b1bb85831b1e6ad1d537673aba249adb7797c).

There are two files that are currently at zero coverage, which are dragging the average coverage way down:

internal/version/version.go (22 lines)
cmd/main.go (44 lines)

Between the two of them they contribute 66 of 82 (80%) of the missed lines in the project.

I've managed to add a bit of test coverage to "version.go" in PR #15, but a lot of the magic debug.BuildInfo() VCS data is not testable right now, as far as I can tell (See the comment in "version_test.go").

Which brings me to "cmd/main.go". I did a fair bit of Googling today on how to test a Go main() function, and the general consensus seems to be don't do that, at least not in unit tests. Accepting this community wisdom means we've got 44 lines of code (out of 185 in the project) in "cmd/main.go" that are currently untestable. The same folks that say "don't test main()!" also say "keep main() as small as possible, and do all the work in other parts of the code that you can test".

So this PR was a first attempt at reducing the complexity of main() so we have less untested code. I thought that switching to the blocking worker.Run() was a good option because:

We aren't running any other goroutines right now in main or workercmd. The current code (as I understand it) sets up a context that cancels when as system interrupt is receives, starts workercmd.Run() in a goroutine using that context, and then immediately blocks until the context cancels. This seems like unnecessary complexity in service to creating what is effectively a blocking call of workercmd.Run().
You have told me in the past that it's better to use synchronous code when you can because it's less complicated, usually it's good enough, and that this is becoming more accepted in the Go community. Right now I don't think we need asynchronous code here.
We can't test this code.

I'm open to the fact that we may want to run other goroutines when we start the worker, though in general I don't like the pattern in enduro-sdps of starting multiple services from one entry point. In any case, we aren't starting any other parallel tasks in the worker right now, so why include the extra complexity in the code? If we do want to start other services when we start the worker, I think it would be better to do that in workercmd or somewhere else where we can test the code.

djjuhasz · 2024-04-24T23:47:48Z

@sevein all that said, this PR broke the integration test, so I need to rethink it anyway. 🤔

sevein · 2024-04-25T08:28:57Z

Thank you for the added context. Maybe in a larger codebase (we're tracking 174 lines of code today), a 44-line main function wouldn't be such a big deal, but I'm supportive with adopting a new pattern with a thinner main too if we want to go down that route. There are really cool examples out there, from cockroach's 1-line main() to something like tailscaled that seems to deal mostly with arguments and env strings.

I recommend reevaluating the decision to omit passing the context from main() to other functions. Mat Ryer describes it well here. It'll make testing easier, which I think you're about to discover when you try to fix the integration test error reported in CI.
I would even argue that Mat's run() should not use os.Interrupt, that's also fundamental to me, a global that I'd rather pass from main. A lot of people set up the signal handler in main(). acorn does it beatifully via the k8s's SetupSignalHandler. A problem with temporalsdk_worker.InterruptCh() is that it's doing it implicitly which is cool for code examples but I'm not sure that it scales well.

I agree that enduro-sdps needs to be fixed, that I believe was the result of forking old legacy-enduro code. However the claim "we can't test main()" is oudated since go1.20 added support for building coverage-instrumented binaries (go build -cover) and I've seen projects going down that route instead, e.g. authelia. But it seems like it would add more complexity to the build system which I'd rather avoid. It's probably worth the effort for existing large projects where refactoring main() would be a big risk.

djjuhasz · 2024-04-25T17:11:50Z

@sevein thank you for your thoughtful feedback, and for all the great links to main() examples and the Mat Ryer article. I'm going to park this PR for now and work on other things. I would like to come back to this later, but I may end up just closing the PR if I can't justify the time investment.

djjuhasz · 2024-04-30T00:23:25Z

This PR needs a significant rework so I'm going to close it, and come back to it later if time permits.

Simplify worker runner

36b74d9

Use `temporalsdk_worker.Run()` blocking call with a system interrupt listening channel to simplify the `workercmd` and `main` (cli) code. [skip-codecov]

djjuhasz force-pushed the dev/simplify-worker-runner branch from d70dd61 to 36b74d9 Compare April 24, 2024 19:30

sevein reviewed Apr 24, 2024

View reviewed changes

djjuhasz closed this Apr 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify worker runner #16

Simplify worker runner #16

djjuhasz commented Apr 24, 2024

codecov bot commented Apr 24, 2024 •

edited

Loading

sevein left a comment

djjuhasz commented Apr 24, 2024 •

edited

Loading

djjuhasz commented Apr 24, 2024 •

edited

Loading

sevein commented Apr 25, 2024 •

edited

Loading

djjuhasz commented Apr 25, 2024

djjuhasz commented Apr 30, 2024

Simplify worker runner #16

Simplify worker runner #16

Conversation

djjuhasz commented Apr 24, 2024

codecov bot commented Apr 24, 2024 • edited Loading

Codecov Report

sevein left a comment

Choose a reason for hiding this comment

djjuhasz commented Apr 24, 2024 • edited Loading

djjuhasz commented Apr 24, 2024 • edited Loading

sevein commented Apr 25, 2024 • edited Loading

djjuhasz commented Apr 25, 2024

djjuhasz commented Apr 30, 2024

codecov bot commented Apr 24, 2024 •

edited

Loading

djjuhasz commented Apr 24, 2024 •

edited

Loading

djjuhasz commented Apr 24, 2024 •

edited

Loading

sevein commented Apr 25, 2024 •

edited

Loading