
Support multiple active GoodJob databases #725

Open
doxavore opened this issue Oct 18, 2022 · 3 comments

@doxavore

One of the great benefits that shakes out of GoodJob is that enqueuing a job can be part of a database transaction that also persists the records needed when the job runs.
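
A minimal sketch of that benefit, assuming placeholder MyRecord and MyJob classes (neither is part of GoodJob): because the job row lives in the same Postgres database as the application record, both commit or roll back together.

ActiveRecord::Base.transaction do
  record = MyRecord.create!
  MyJob.perform_later(record) # inserts the good_jobs row inside the same transaction
end
# If the transaction rolls back, no job row is committed, so the job
# can never run against a record that was never persisted.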

Today, GoodJob supports multiple databases as long as GoodJob itself uses a single database (since active_record_parent_class was introduced in #238). It would be nice if we could extend the same guarantees we get with transactions and a single database to multiple databases. One could copy the GoodJob migrations into multiple migrations_paths, but we only have access to a single set of underlying GoodJob::BaseRecord-inheriting ActiveRecord models.
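
For reference, a sketch of roughly how the single-database setup from #238 can be wired up; the QueueRecord class and the :queue database name are illustrative, not GoodJob conventions:

# app/models/queue_record.rb
class QueueRecord < ApplicationRecord
  self.abstract_class = true
  connects_to database: { writing: :queue, reading: :queue }
end

# config/initializers/good_job.rb
GoodJob.active_record_parent_class = "QueueRecord"

Every GoodJob model then reads and writes through that one connection, which is exactly the single-database limitation this issue is about.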

@bensheldon Do you have any thoughts on whether this is something we might support in GoodJob? I'd like to contribute in this area, but would want to make sure I'm doing so in a way that's consistent with your vision (perhaps for v3?).

@bensheldon
Owner

@doxavore that's interesting! And I think I'm pretty reluctant to implement that.

Overall, I'm not sure the use-case justifies the level of effort and the ongoing complexity cost.

If I'm understanding, it would add a third database option:

  • (default) store GoodJob records in your app's primary/singular database
  • (optional) store GoodJob records in one of your app's partitioned databases
  • (proposed) store GoodJob records in each of your app's partitioned databases

I'm currently tepid even on #238. I think the use-case is around flexibility ("this is how our application is architected"), whereas 99% of the real-world interest/questions I experience are about performance. I want to be really transparent in my recommendation that if I was in a position to choose with an application that hit a true job performance bottleneck (talking tens of millions of jobs), I would choose Sidekiq Pro over partitioning GoodJob's database records. Real talk! 😄 I think at the scale where one is burning up their relational database to run jobs, they should reach for a better tool (Redis, Kafka, etc.).

And I realize that the benefit you want is transactional consistency, so you can lightly gloss over the previous paragraph, but I wanted to share that regardless 😊

I guess I consider transactional consistency a nice-to-have side effect rather than a core feature of GoodJob. There are maybe some interesting workarounds possible in #712. Like:

GoodJob::Bulk.defer do
  MyJob.perform_later(my_record)
  my_record.update
end # <= job doesn't actually get enqueued until block exits here

Sorry to maybe overshare. I am curious though about your use-case. Are you developing an application right now where this feature would be useful to you?

@doxavore
Author

Thanks for the context! Before opening this issue I'd poked around previous issues and saw you had some misgivings about the current functionality, so I was worried this might be a long shot.

Are you developing an application right now where this feature would be useful to you?

Yes! This is a known gotcha in a codebase today. We use multiple databases to support the different usage patterns and levels of reliability required in a single Rails app.

I think deferring may help if we wanted to continue using a single database for GoodJob. Since we want consistency within each database, though, I'm not sure it'll help our use case.

@bensheldon
Owner

I have this idea churning away in the back of my mind. Still not planned, but thinking about it.

The way I'm thinking it could work would be for your application to create subclasses of GoodJob::Execution and extend them for each of your database connections. I have been strongly thinking about moving to an extend strategy rather than a base-class strategy for models (mentioned in #687 (comment))
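
A rough sketch of that subclass idea, assuming an :analytics database defined in database.yml (none of this is supported API today, just the shape of it):

class AnalyticsExecution < GoodJob::Execution
  # Give this subclass its own connection pool pointed at the second database
  establish_connection :analytics
end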

Then, it would be necessary for you to inject those classes into the Active Job adapter (e.g. instances of GoodJob::Adapter), and also to inject them into a Manager, one for each subclass (described in #705).

I'm not quite sure how simple/clean it is to tell ActiveJob to enqueue a job to a specific Adapter instance (I don't think many people run multiple adapters). e.g. using ActiveJob's API would end up looking something like:

job = MyJob.new(args)
job.scheduled_at = 10.minutes.from_now # enqueuing options can only be set on the job instance
adapter = GoodJob::Adapter.new(job_record: CustomExecution)
adapter.enqueue_at(job, job.scheduled_at)

I dunno, seems complicated :-)
