Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pause or cancel a batch? #919

Open
ajselvig opened this issue Apr 6, 2023 · 5 comments
Open

Pause or cancel a batch? #919

ajselvig opened this issue Apr 6, 2023 · 5 comments

Comments

@ajselvig
Copy link

ajselvig commented Apr 6, 2023

I'm using the new batches functionality and it's working well. However, I haven't been able to find a way to pause or cancel the execution of a particular batch. I tried setting the finished_at value but that didn't have any effect.

I could do a bulk update on all jobs in the batch, but even then it's not clear what I should be setting. Any guidance would be appreciated.

Thanks!

@bensheldon
Copy link
Owner

Great feature request. Something like that doesn't currently exist.

Currently, it's possible to manually Discard each of the jobs within a batch, but that would still result in the Batch Callback(s) being triggered.

There aren't any manual Batch operations right now that are analogous to a Job being able to be manually retried/rescheduled/discarded.

Thinking aloud here on designing something new:

  • A batch could be "Discarded" which would discard all of the jobs that have not yet begun executing. Manually discarding the batch would not trigger any batch callbacks.
  • Discarding a batch would Discard all of the jobs that haven't yet begun executing.
  • Un-discarding a batch would Retry all of the jobs in the batch that had been Discarded.
  • Would you want to track Manually Discarded batches on their own? I'm thinking we'd want to introduce a new column like batch_discarded_at to differentiate between Batches that are "Discarded" because a job within the batch was discarded (and thus would trigger callbacks), and Batches that are Manually Discarded as a whole (and thus not trigger callbacks). Maybe that deserves a new term like "Cancel"?

@bensheldon
Copy link
Owner

If you simply want a workaround, you could discard all of the jobs this way:

# Uses private APIs, so no promises this will continue working
batch._record.jobs.where(performed_at: nil).each do |good_job|
  good_job.discard_job("Manually discarded within batch")
end

@bensheldon
Copy link
Owner

...and now that I'm looking through the manual job actions GoodJob::Job#discard_job and GoodJob::Job#retry_job I'm noticing that they aren't batch aware and don't trigger batch callbacks. That could be a bug, or just YAGNI until someone says they want that. I'm imagining that they should trigger callbacks though.

@ajselvig
Copy link
Author

ajselvig commented Apr 6, 2023

Wow, thanks for the extremely quick and thorough response!

Would you want to track Manually Discarded batches on their own?

Definitely seems like a different concept. For our use case, at least, the idea that a batch gets "discarded" just because one of it's jobs is discarded isn't super meaningful. If we queue up 10,000 invoices to be sent and 20 of them fail for some reason, the batch itself was still a success. That's fundamentally different than a user explicitly cancelling the batch.

If you simply want a workaround, you could discard all of the jobs this way:

I'm fine with workarounds but this seems quite heavyweight for a large number of jobs. Seems like this would work fine as well?

batch._record.jobs.where(performed_at: nil).update_all {finished_at: Time.now, error: 'cancelled'}

I'm noticing that they aren't batch aware and don't trigger batch callbacks...

Batch callbacks don't seem necessary to me for this use case, but I agree that others might want it.

I'll move forward with this approach for now. If you're actually interested in implementing this as a first class feature, it seems like being able to toggle some sort of state on the batch (active | paused | cancelled) without needing to update the job records themselves would be ideal when dealing with large batches.

Thanks again!

@bensheldon
Copy link
Owner

For our use case, at least, the idea that a batch gets "discarded" just because one of it's jobs is discarded isn't super meaningful. If we queue up 10,000 invoices to be sent and 20 of them fail for some reason, the batch itself was still a success. That's fundamentally different than a user explicitly cancelling the batch.

That sounds right to me. I think "Batch -> Discarded" is pretty ambiguous currently because it really means "a job within the batch was discarded, triggering the discarded batch callback".

I'm trying to use as few words as possible here to denote states :-) But maybe "Cancel" is the right word. I'll have to think about that some more. (same problem as #890).

I'm fine with workarounds but this seems quite heavyweight for a large number of jobs. Seems like this would work fine as well?

Yep, that code snippet will work. The one difference is that the discard_job method takes a lock so it's guaranteed that the job won't have started executing, whereas there is a slight gap between when a job is locked and when performed_at is touched, so just doing an update_all has a tiny, tiny chance of having the job execute despite being marked as finished.

I'll move forward with this approach for now. If you're actually interested in implementing this as a first class feature, it seems like being able to toggle some sort of state on the batch (active | paused | cancelled) without needing to update the job records themselves would be ideal when dealing with large batches.

I agree that would be ideal, it's also something I'm trying to avoid: adding transitive job states (same problem as #749).

Sorry, that was a lot of thinking while typing! Your feedback is exactly what I want. I think this is a pretty important User Action to support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Inbox
Development

No branches or pull requests

2 participants