Concurrency's enqueue_limit should exclude performing jobs from count #317
Yep! GoodJob's ActiveJob Concurrency extension can do this with `enqueue_limit`. It's technically possible to exclude executing jobs from that count. Could you make the argument for it? I'm open to changing it. If you wanted to monkeypatch it for your own use instead, you could chain onto the query at that point so it excludes executing jobs.
And then:

```ruby
class MyJob < ApplicationJob
  include GoodJob::ActiveJobExtensions::Concurrency

  good_job_control_concurrency_with enqueue_limit: 1, key: -> { "Unique-#{arguments.first}" }

  def perform(first_name)
    # do work
  end
end
```

FYI, you can test the resulting key that is generated this way (I should add this to the readme):

```ruby
job = MyJob.perform_later("Something")
job.good_job_concurrency_key #=> "Unique-Something"
```
By the way, I tried to follow the semantics of this thread upstream on Rails. I'd want to stay in alignment with that proposal, but I don't think they specified this: rails/rails#40337
Ah, I missed the "concurrency" part in the Readme. I searched for "duplicate" and "double" and figured there was nothing mentioning duplicate jobs. It's interesting that at Shopify/Rails they also chose the word "concurrency", so I guess there's nothing to change in this gem. Perhaps we could just add a hint that this can be used to prevent enqueueing the same job twice.
This is a race condition. While this is happening, after 1) and before/during 2), the user makes changes to the record and saves it.

Expected: the new record changes get synced to the third-party API.

Actual: the concurrency algorithm finds the currently executing job with the same arguments and decides to skip enqueueing the new job, so the new record changes don't get synced.

I guess I should give it a try and let you know how it all works.
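To make the race concrete, here is a plain-Ruby simulation of it (the names `SimJob` and `enqueue` are illustrative only, not GoodJob's API): a dedup rule that counts an executing job toward the limit will silently drop the re-enqueue that would have picked up the new changes.

```ruby
SimJob = Struct.new(:key, :state) # state: :queued or :running

jobs = []

# Dedup rule that counts running jobs too (the behavior being questioned):
enqueue = lambda do |key|
  if jobs.any? { |j| j.key == key } # a :running job also blocks enqueueing
    :skipped
  else
    jobs << SimJob.new(key, :queued)
    :enqueued
  end
end

# 1) A sync job for record 42 is already executing...
jobs << SimJob.new("sync-42", :running)

# 2) ...when the user saves new changes and we try to enqueue a fresh sync:
result = enqueue.call("sync-42")
result #=> :skipped (the new changes are never synced)
```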
That's a compelling reason. And I remembered why I initially counted executing jobs: retrying on errors. Example: job1 is executing, job2 is enqueued, and the enqueue constraint is at its limit. If job1 then errors and needs to retry, should it be re-enqueued despite the constraint, or dropped because job2 already holds the slot?

I lean towards the first case (re-enqueue and ignore the constraint), but I'm curious about your perspective.
FYI, I updated the Readme with additional keywords (duplicate, double) and the testing example 🎉
If job1 errors, I'd assume it goes back to the end of the queue, so job2 will start executing before job1. For "sync" purposes, it's important that at least one job starts after the record was last saved.

There might be other cases where it's better to try to remove the duplication, but I'd argue that delivering the same message twice is better than skipping a job because "the same job" is already running.

This is an edge case where I'd probably go for simplicity and use the code that is already there. If job1 errors, it would be best to remove it if job2 started running after job1 errored. But for the sake of simplicity, I'd allow job1 to be re-added to the queue as long as the duplicate check finds no duplicates waiting in the queue (job2 has the same arguments, but it's currently executing, so it doesn't count).
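The rule the thread converges on can be sketched in plain Ruby (illustrative names, not GoodJob's implementation): only jobs waiting in the queue count toward the enqueue limit, so a running job never blocks a fresh enqueue or a retry.

```ruby
QueuedJob = Struct.new(:key, :state) # state: :queued or :running

def enqueue_allowed?(jobs, key, enqueue_limit)
  # Count only jobs waiting in the queue; executing jobs are excluded.
  jobs.count { |j| j.key == key && j.state == :queued } < enqueue_limit
end

jobs = [QueuedJob.new("sync-42", :running)]
allowed_while_running = enqueue_allowed?(jobs, "sync-42", 1) # => true

jobs << QueuedJob.new("sync-42", :queued)
allowed_when_queued = enqueue_allowed?(jobs, "sync-42", 1) # => false
```

With this rule, the race above is resolved: a save during an executing sync can always enqueue one follow-up, while a second save during that window is still deduplicated against the queued follow-up.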
I agree with this. I think that's saying "running a job incompletely is more problematic than running a second job with duplicate conditions." To move forward, I'm thinking:
Sounds perfect! Thanks @bensheldon! :) I'm curious to see what the Rails team will do with their concurrency solution.
When I first started playing around with the concurrency extension I was mildly surprised when I realized that executing jobs count toward the enqueue limit.

For our use case we are running jobs that sync user data against external APIs in response to webhooks. We wanted to ensure that only one instance of the same sync job is ever running at the same time, to avoid multiple simultaneous syncs interfering with each other, so we limit concurrency to one job per key.

This effectively serves as a kind of debounce on the trailing edge, since it ensures that we always pick up new changes when they happen without running multiple unnecessary syncs, but it doesn't function across a time window. I can imagine that in the future, with heavier load, we may need a time-based throttle like the one discussed in #315, though.
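The trailing-edge debounce effect described above can be sketched in a few lines of plain Ruby (illustrative, not GoodJob's API): while one sync is executing, only a single follow-up may wait, so a burst of webhooks collapses into one pending re-sync that sees the latest state when it runs.

```ruby
pending = false # is a follow-up sync already waiting in the queue?
enqueued = 0

# Five webhooks arrive while a sync for the same key is executing:
5.times do
  unless pending # only one job may wait per key (a limit of 1)
    pending = true
    enqueued += 1
  end
end

enqueued #=> 1 (the burst collapsed into a single pending sync)
```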
Hey Ben!
Thanks for the great gem! I just discovered it and it seems to be an improvement on top of DelayedJob which I'm using.
Currently, I have some custom duplication-prevention code that prevents adding a job with the same arguments to the queue if such a job already exists in the queue and is not currently running. (If a job with the same arguments is currently running, we do allow adding a new one to the queue.)
Is there any such solution for GoodJob? How would I go about adding support for this myself?
Thanks!