Job fuses / rapid job discard dashboard #749

Open
julik opened this issue Nov 14, 2022 · 1 comment

Comments

@julik
Contributor

julik commented Nov 14, 2022

Working with large-ish background job setups, I have seen a few common operational needs arise from time to time. I've covered them in my talk here: https://www.youtube.com/watch?v=aEVVbFn0_A4

The feature I would like to propose for GoodJob is "fusing" - discarding jobs of a certain ActiveJob class, having a certain parameter or belonging to a particular queue. A fairly common failure mode with job clusters is that one (or a few) specific types of jobs gobble up the entire queue capacity, and there is some reactive process (AWS notifications, cron scheduling, some web endpoint generating jobs etc...) which keeps adding those "poison pill" jobs into the queue. What has proven itself very useful is to have a way to temporarily discard any job of a certain shape which gets picked up for execution. Discarding-at-enqueue does not work quite well for this use case but can also be considered.

How this could work (MVP-style):

  • There is an extra panel in the GJ dashboard where you see a list of all AJ classes GoodJob knows about. The same facility could be used which generates the search popup in the main dashboard ("which job_class values are known to the system at this time?"). The panel has a "switch" UI control next to every job class
  • There is a table which workers keep in-memory and update from time to time, or a table with which the workers JOIN when selecting jobs for execution. If the table contains a row (or a row with a particular value of the switch) the job does not get selected for execution, or gets discarded immediately upon getting picked up
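
To make the in-memory variant of the second bullet concrete, here is a minimal plain-Ruby sketch. It is not GoodJob code: `FuseList`, `run_or_discard`, the refresh interval and the job hash shape are all hypothetical names invented for illustration, and the loader block stands in for whatever query would read the fuse table.

```ruby
require "set"

# Hypothetical in-memory fuse list (not a GoodJob API). The block passed to
# `new` stands in for a query that returns the currently fused job class names.
class FuseList
  def initialize(refresh_every: 5,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 &loader)
    @refresh_every = refresh_every
    @clock = clock
    @loader = loader
    @fused = Set.new
    @loaded_at = nil
  end

  # True when jobs of this class should be discarded instead of performed.
  def fused?(job_class_name)
    refresh_if_stale
    @fused.include?(job_class_name)
  end

  private

  # Reload the list only when the cached copy is older than refresh_every seconds.
  def refresh_if_stale
    now = @clock.call
    return if @loaded_at && (now - @loaded_at) < @refresh_every

    @fused = Set.new(@loader.call)
    @loaded_at = now
  end
end

# Simulated "picked up for execution" step: discard fused jobs, perform the rest.
def run_or_discard(job, fuses)
  return :discarded if fuses.fused?(job[:job_class])

  job[:perform].call
  :performed
end
```

The trade-off of this variant is staleness: a newly flipped fuse only takes effect after the next refresh, so a worker may still perform a few jobs of the fused class in the meantime. Because the list holds plain strings, it can also contain class names the system has not seen yet.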

Post-MVP you would want to add parameter matching to the "fused" jobs ("discard all ProcessPayment jobs for user_id=123").
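
A rough sketch of what such parameter matching could look like, in plain Ruby. `FuseRule` and its fields are hypothetical, and the sketch assumes the job's arguments are already available as an array containing a plain Hash with string keys; real serialized ActiveJob arguments may need unwrapping first.

```ruby
# Hypothetical rule shape (not a GoodJob API). Any field left nil is a wildcard.
FuseRule = Struct.new(:job_class, :queue_name, :arguments, keyword_init: true) do
  # `job` is assumed to be a Hash with "job_class", "queue_name" and "arguments".
  def match?(job)
    return false if job_class && job_class != job["job_class"]
    return false if queue_name && queue_name != job["queue_name"]
    return true if arguments.nil? || arguments.empty?

    # Match when any Hash argument contains every key/value pair of the rule.
    Array(job["arguments"]).grep(Hash).any? do |hash_arg|
      arguments.all? { |key, value| hash_arg[key] == value }
    end
  end
end

rule = FuseRule.new(job_class: "ProcessPayment", arguments: { "user_id" => 123 })
job  = { "job_class" => "ProcessPayment", "queue_name" => "default",
         "arguments" => [{ "user_id" => 123, "amount" => 5 }] }
puts rule.match?(job)
```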

Is this something good_job would be interested in integrating if we provide an implementation?

@bensheldon
Owner

@julik thanks for opening the issue! I watched the youtube video and I think it makes sense to me.

  • I'm open to this, though I think I'd want to keep it as simple as possible to see how it develops.
  • I agree that I think this should filter at execution, rather than enqueue.
  • There currently exists a good_job_settings table that contains open-ended configuration within json. I hope that this could be used to store the properties that would be filtered. I think doing a JOIN is probably fine; cache invalidation is one of those hard problems. If the complexity of it can be encapsulated within an ActiveRecord scope, that would be ideal.
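
As a rough illustration of the last bullet, the sketch below stores the filtered properties as one JSON document and applies them when selecting jobs. It is plain Ruby standing in for the real thing: the `"fused_jobs"` key, the payload shape and the `executable` helper are assumptions, the settings Hash stands in for rows of good_job_settings, and an actual ActiveRecord scope would express the same filter in SQL.

```ruby
require "json"

# Hypothetical settings key and payload shape; not an existing GoodJob setting.
FUSE_SETTING_KEY = "fused_jobs"

# Parse the stored JSON into lists of fused job classes and queue names.
def fused_properties(settings)
  raw = settings[FUSE_SETTING_KEY]
  return { "job_classes" => [], "queue_names" => [] } if raw.nil?

  parsed = JSON.parse(raw)
  { "job_classes" => Array(parsed["job_classes"]),
    "queue_names" => Array(parsed["queue_names"]) }
end

# Plain-Ruby stand-in for a scope: keep only jobs that are not fused.
def executable(jobs, settings)
  fused = fused_properties(settings)
  jobs.reject do |job|
    fused["job_classes"].include?(job["job_class"]) ||
      fused["queue_names"].include?(job["queue_name"])
  end
end
```

Reading the setting on every selection avoids the cache invalidation problem entirely, at the cost of one extra lookup per query.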

Some other thoughts:

  • I think a switch-list UI might be problematic. From watching the video, it sounds like one of the use cases is to pre-disable a job that will be rolled out during a deploy. That sounds like something that would be necessary to configure before the deploy when the system won't have knowledge of the job. Free-text field? 😬
  • In addition to job class, I wonder what other properties would be configured. I dunno what a good interface would be for filtering by argument. Also (blue sky), it makes me wonder whether jobs should be able to have arbitrary tags (though that might be a thought for your other comment on throttling). For example, I might tag every job that touches API Foo with "api_foo" and then be able to filter/throttle/etc. all of the jobs regardless of the job/queue 🤔
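
The tag idea in the last bullet could be sketched like this, purely as a blue-sky illustration. The `"tags"` field on a job and the `fused_by_tag?` helper are hypothetical and do not describe anything GoodJob provides in this thread.

```ruby
# Hypothetical: each job carries an array of free-form tags.
# A job is fused when it shares at least one tag with the fused list.
def fused_by_tag?(job, fused_tags)
  (Array(job["tags"]) & fused_tags).any?
end

jobs = [
  { "job_class" => "SyncInvoices", "queue_name" => "default", "tags" => ["api_foo"] },
  { "job_class" => "SendEmail",    "queue_name" => "mailers", "tags" => [] }
]

# Filters across job classes and queues at once.
runnable = jobs.reject { |job| fused_by_tag?(job, ["api_foo"]) }
puts runnable.map { |job| job["job_class"] }.inspect
```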
