Batches don't work in production #1555
Comments
Hmm, that's really strange! How are the job and batch records being tenanted? I could imagine that the jobs and batch records are being placed in a different database and thus can't be queried from the current context.
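To make that hypothesis concrete, here is a rough plain-Ruby sketch. It is a toy model and not GoodJob's actual code: `Job`, `visible_to?`, and `next_job` are made-up names, and the visibility check only stands in for whatever tenant scoping (a separate database, an RLS policy) hides rows from the worker. The point is that when jobs are picked up by a filtered query rather than popped from a strict FIFO queue, a record the worker cannot see is skipped instead of blocking the jobs behind it.

```ruby
# Toy model only: this is NOT GoodJob's implementation.
# Job, visible_to?, and next_job are hypothetical names for illustration.
Job = Struct.new(:id, :tenant_id, :state)

# Stand-in for tenant scoping (e.g. a row-level-security policy):
# a session only sees rows whose tenant_id matches its current tenant.
def visible_to?(job, current_tenant)
  job.tenant_id == current_tenant
end

# Query-style dequeue: take the oldest queued job the session can see.
# Invisible rows are skipped rather than blocking the jobs behind them.
def next_job(table, current_tenant)
  table.find { |job| job.state == :queued && visible_to?(job, current_tenant) }
end

table = [
  Job.new(1, "tenant_a", :queued), # enqueued first, but not visible to the worker
  Job.new(2, "tenant_b", :queued),
  Job.new(3, "tenant_b", :queued),
]

# A worker whose session is scoped to tenant_b drains everything it can see.
processed = []
while (job = next_job(table, "tenant_b"))
  job.state = :finished
  processed << job.id
end

stuck = table.select { |job| job.state == :queued }.map(&:id)
puts "processed: #{processed.inspect}"   # processed: [2, 3]
puts "still queued: #{stuck.inspect}"    # still queued: [1]
```

In this model job 1 stays queued indefinitely while later jobs run normally, which is the shape such a visibility problem would take. Whether that is what is happening here is an open question.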
Every table has a tenant_id column. We use a single database and rely on row-level security. The name [...]
The strange thing is that regular jobs always work and are processed immediately. This remains true even when there is a stuck or queued job that was created via a batch earlier. That surprises me, because I would expect the scheduler to be implemented as a queue data structure, with no skipping. I searched the project for differences in how jobs work in development and production but couldn't find any significant ones.
Edit:
Rails version 7.1.4
Ruby version 3.3.5
GoodJob version 4.3.0
Hello,
first of all, we are using multitenancy with RLS, so it is possible that we messed something up. The weird thing is that batches work perfectly in development mode with both the `async` and `inline` adapters.
Here is how we monkey-patched the `JobPerformer`
This has so far worked flawlessly with regular jobs.
Now, we have introduced a complex batch (similar to https://github.com/bensheldon/good_job?tab=readme-ov-file#complex-batches)
Then we have a batch job like this
Which we invoke from our controller by calling
This works perfectly for me locally. Once we deploy this to an instance, the batch only runs once (no error is raised, it completes successfully). The second time you run a batch, the first job (`FinalizeBillingRun` with nil stage) will be queued but never picked up by the Scheduler. It will hang as pending/queued forever - or, funnily enough, until we restart the instance. Then it gets picked up immediately and completes without an error.
We are using `puma`, have implemented the suggested changes from https://github.com/bensheldon/good_job?tab=readme-ov-file#execute-jobs-async--in-process and use the `async` adapter in production.
I would very much appreciate it if you could point out what could have gone wrong.