Concurrency's enqueue_limit should exclude performing jobs from count #317

Closed
vfonic opened this issue Aug 1, 2021 · 9 comments · Fixed by #342
Labels
enhancement New feature or request

Comments

@vfonic

vfonic commented Aug 1, 2021

Hey Ben!

Thanks for the great gem! I just discovered it and it seems to be an improvement on top of DelayedJob which I'm using.

Currently, I have some custom duplication-prevention code that prevents enqueuing a job if a job with the same arguments already exists in the queue and is not currently running. (If a job with the same arguments is currently running, we still allow enqueuing a new one.)

Is there any such solution for GoodJob? How would I go about adding support for this myself?

Thanks!

@bensheldon
Owner

Yep! GoodJob's ActiveJob Concurrency extension can do this with enqueue_limit. But it does not cover the "and is not currently running" part, because enqueue_limit currently includes executing jobs in its count.

It's technically possible to exclude executing jobs. Could you make that argument? I'm open to changing it.

If you wanted to monkeypatch this for your own use instead, chain the .advisory_unlocked scope into this line:

enqueue_concurrency = GoodJob::Job.unscoped.where(concurrency_key: key).unfinished.count
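With the .advisory_unlocked scope chained in, that line might look roughly like this (a sketch; advisory_unlocked filters out jobs currently holding an advisory lock, i.e. executing jobs):

enqueue_concurrency = GoodJob::Job.unscoped.where(concurrency_key: key).unfinished.advisory_unlocked.count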

And then:

class MyJob < ApplicationJob
  include GoodJob::ActiveJobExtensions::Concurrency

  # Allow at most one job with the same concurrency key to be enqueued at a time
  good_job_control_concurrency_with enqueue_limit: 1, key: -> { "Unique-#{arguments.first}" }

  def perform(first_name)
    # do work
  end
end

FYI, you can test the resulting generated key like this (I should add this to the Readme):

job = MyJob.perform_later("Something")
job.good_job_concurrency_key #=> "Unique-Something"

@bensheldon
Owner

By the way, I tried to follow the semantics of the upstream Rails proposal in rails/rails#40337. I'd want to stay in alignment with that proposal, but I don't think it specifies this behavior.

@vfonic
Author

vfonic commented Aug 2, 2021

Ah, I missed the "concurrency" section in the Readme. I searched for "duplicate" and "double" and concluded that nothing there covered duplicate jobs. It's interesting that Shopify/Rails also chose the word "concurrency", so I guess there's nothing to change in this gem. Perhaps we could just add a hint that this can be used to prevent enqueuing the same job twice.

It's technically possible to exclude executing jobs. Could you make that argument? I'm open to changing it.

This is a race condition.
Imagine you have a job that takes a couple of seconds to run. It's used to save the record JSON to a third-party API after every update.

  1. First, it fetches a record from the database
  2. Then it pushes the record JSON to a slow third-party API

While this is happening, after 1) and before/during 2), the user makes changes to the record and saves them.

Expected:

The new record changes get synced to the third-party API.

Actual:

The concurrency check finds the currently executing job with the same arguments and skips enqueuing the new job. The new record changes don't get synced.
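To sketch it out (SyncRecordJob, Record, and ThirdPartyApi are hypothetical names here):

class SyncRecordJob < ApplicationJob
  include GoodJob::ActiveJobExtensions::Concurrency

  # Only allow one sync job per record to be enqueued at a time
  good_job_control_concurrency_with enqueue_limit: 1, key: -> { "SyncRecord-#{arguments.first}" }

  def perform(record_id)
    record = Record.find(record_id)     # 1) fetch the record from the database
    ThirdPartyApi.push(record.as_json)  # 2) push the record JSON to the slow third-party API
    # If the user saves new changes between 1) and 2), the newly enqueued job is
    # skipped because this still-executing job counts toward enqueue_limit,
    # so the new changes never reach the API.
  end
end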

I guess I should give it a try and let you know how it all works.

@bensheldon
Owner

This is a race condition.

That's a compelling reason.

And I remembered why I initially counted executing jobs: retrying on errors. Example: job1 is executing, job2 is enqueued, and the enqueue_limit: 1 constraint is satisfied. If job1 errors and is retried, it will hit the constraint, which prevents it from being re-enqueued. In this situation, what should happen?

  • job1 is re-enqueued, ignoring the constraint (2 matching jobs are now enqueued)
  • job1 is not re-enqueued (but then how do we preserve/indicate why, when inspecting the job or the Dashboard?)

I lean towards the first case (re-enqueue and ignore the constraint), but I'm curious about your perspective.

@bensheldon
Owner

FYI, I updated the Readme with additional keywords (duplicate, double) and the testing example 🎉

@vfonic
Author

vfonic commented Aug 2, 2021

If job1 errors, I'd assume it goes back to the end of the queue. So job2 will start executing before job1.

For "sync" purposes, it's important that at least one job starts after the record has been last saved.

There might be some other cases where it's better to try to remove the duplication. I'd argue that delivering the same message twice is better than skipping executing a job because there's "the same job" already running.

This is an edge case where I'd probably go for simplicity and use the code that's already there. If job1 errors, ideally it would be removed rather than retried when job2 has already started running after job1 errored. But for the sake of simplicity, I'd allow job1 back into the queue whenever the duplicate check finds no duplicates (job2 has the same arguments, but it's currently executing, so it wouldn't count).

@bensheldon
Owner

There might be some other cases where it's better to try to remove the duplication. I'd argue that delivering the same message twice is better than skipping executing a job because there's "the same job" already running.

I agree with this. I think that's saying "running a job incompletely is more problematic than running a 2nd job with duplicate conditions."

To move forward, I'm thinking:

  • change enqueue_limit's count to exclude executing jobs
  • allow the enqueue_limit constraint to be ignored when retrying jobs, which can lead to multiple enqueued jobs with the same conditions
  • update the Readme to note this, and recommend that anyone using enqueue_limit: 1 also set perform_limit: 1 to guard against a retried job executing at the same time as another job with duplicate conditions (see the sketch below)
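A minimal sketch of that combination (MyUniqueJob and its key are made-up examples):

class MyUniqueJob < ApplicationJob
  include GoodJob::ActiveJobExtensions::Concurrency

  # enqueue_limit: 1 prevents duplicate enqueues; perform_limit: 1 guards against
  # a retried job running alongside another job with the same key
  good_job_control_concurrency_with enqueue_limit: 1, perform_limit: 1, key: -> { "Unique-#{arguments.first}" }

  def perform(first_name)
    # do work
  end
end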

@vfonic
Author

vfonic commented Aug 3, 2021

Sounds perfect! Thanks @bensheldon! :)

I'm curious to see what the Rails team will do with their concurrency solution.

@codyrobbins
Contributor

When I first started playing around with the concurrency extension I was mildly surprised to realize that enqueue_limit includes performing jobs, but I don’t have a strong preference either way. You can make an argument for either option, since the same functionality is achievable both ways, and the current behavior makes sense once you understand what’s going on. Excluding performing jobs does have the advantage of not violating the principle of least surprise, though.

For our use case we run jobs that sync user data against external APIs in response to webhooks. We want to ensure that only one instance of the same sync job is ever running at a time, so that simultaneous syncs don’t interfere with each other, and so we have perform_limit set to 1. If a change happens in the midst of an already running sync, we want to run another sync immediately after the first one completes in order to pick up those new changes. But even if multiple changes occur during a running sync, we never need to enqueue more than one additional sync job, because that one job will pick up all subsequent changes when it runs after the current one. So we have enqueue_limit set to 2 (see the sketch below).

This effectively serves as a kind of trailing-edge debounce: it ensures we always pick up new changes when they happen without running multiple unnecessary syncs, but it doesn’t operate across a time window. I can imagine that with heavier load in the future we may need a time-based throttle like the one discussed in #315.
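For reference, a sketch of the setup described above (UserSyncJob and its key are illustrative names):

class UserSyncJob < ApplicationJob
  include GoodJob::ActiveJobExtensions::Concurrency

  # At most one sync per user runs at a time (perform_limit: 1), and at most one
  # additional sync can wait in the queue to pick up changes that arrive mid-run
  # (enqueue_limit: 2 counts the running job plus one enqueued job)
  good_job_control_concurrency_with perform_limit: 1, enqueue_limit: 2, key: -> { "UserSync-#{arguments.first}" }

  def perform(user_id)
    # sync the user's data against the external API
  end
end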

@bensheldon changed the title from "Is there any support from preventing adding duplicate job to queue?" to "Concurrency's enqueue_limit should exclude performing jobs from count" on Aug 16, 2021
@bensheldon added the "enhancement" (New feature or request) label on Aug 16, 2021