Clarification on max_threads option #538

Open
hmnhf opened this issue Mar 4, 2022 · 4 comments

Comments

@hmnhf

hmnhf commented Mar 4, 2022

Hi and thanks for this awesome gem!

I've become a bit confused about the max_threads option.

From reading the following two sections, I first thought max_threads was a global option that defines the maximum number of threads across all queues. (That is, with two queues named A and B, their combined thread count would never exceed max_threads.)

[From command-line options] Maximum number of threads to use for working jobs.

[From configuration options] sets the maximum number of threads to use when execution_mode is set to :async.

But after reading the following two sections, I figured it's probably the default max value for each queue's threads:

[In pool definition with queues] <participating_queues>:<thread_count> ... <thread_count>: a count overriding for this specific pool the global max-threads.

[In configuring database pool size] 1 connection per query pool thread e.g. --queues=mice:2;elephants:1 is 3 threads. Pool thread size defaults to --max-threads.

But, then again in the Database Connections section, there's the following part which gives the impression that max_threads is the max number of threads across all different queues.

pool: <%= ENV.fetch("RAILS_MAX_THREADS", 5).to_i + ENV.fetch("GOOD_JOB_MAX_THREADS", 4).to_i %>

And then, there's the description of how to calculate the number of threads GoodJob requires, to be set as GOOD_JOB_MAX_THREADS.

Assuming that I've understood this correctly, I think the confusion comes from the fact that the max_threads option and the GOOD_JOB_MAX_THREADS used in the pool size setting can be two different values. In other words, their names could be default_max_threads_per_queue and GOOD_JOB_REQUIRED_THREADS.
Have I misunderstood something, or is this correct?

@bensheldon
Owner

@hmnhf thanks for opening this Issue! The Readme was recently updated to try to explain how to calculate total threads (#525) and I think it exposed a conceptual problem.

Briefly, to answer your question: GOOD_JOB_MAX_THREADS is the "default threads per query execution pool".

The name predates the ability to configure multiple query pools within a single process (e.g. the --queues=mice:2;elephants:1 syntax). The value in --queues overrides the max-threads value.
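The override rule described here can be sketched in a few lines of Ruby. This is only an illustration: `pool_thread_counts` is a hypothetical helper, not part of GoodJob's API, and it assumes pools are separated by `;` with an optional `:<thread_count>` suffix, as in the `--queues` examples quoted in this thread.

```ruby
# Hypothetical helper (not GoodJob's own code): maps each pool in a
# --queues style string to the number of threads it would get.
def pool_thread_counts(queues_string, default_max_threads)
  queues_string.split(";").map do |pool|
    # "mice:2" has an explicit count; "*" has none and uses the default.
    queues, count = pool.split(":", 2)
    [queues, count ? Integer(count) : default_max_threads]
  end.to_h
end

counts = pool_thread_counts("mice:2;elephants:1", 5)
puts counts.values.sum # => 3
```

Both pools carry an explicit count here, so the default of 5 is never used and the total is 3 threads, matching the README example quoted earlier in the thread.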

The example given for defining database.yml's pool: value is really just flat out wrong.

To address this:

  • I'll do another pass on the Readme
  • It makes me wonder if I can calculate a GoodJob.max_threads value that would be accurate at that stage of Rails initialization (it gets tricky because configuration gets loaded at different places and it might not be available at the stage that database.yml is read)
  • I maybe should deprecate the GOOD_JOB_MAX_THREADS and --max-threads configuration options and replace them with something less absolute (e.g. GOOD_JOB_POOL_THREADS)

@philipqnguyen

philipqnguyen commented Apr 18, 2024

@bensheldon so if I have the following queue "high_priority:7;default:4;low_priority:2;*" and GOOD_JOB_MAX_THREADS = 5 that means:

7 threads for high priority
4 threads for default
2 threads for low priority
5 threads for * (This '5' comes from GOOD_JOB_MAX_THREADS).
Totaling 18 threads due to the queue.

Additionally, GoodJob needs
1 thread for a notifier.
1 thread for a cron.
1 thread for executor.
Totaling 3 threads as overhead for GoodJob.

With GoodJob running separately from the web process, based on the above example the database.yml should have:

pool: 21

Is that right? I spent the last couple hours reading through the readme and various issues, and that's what I have deduced....

@bensheldon
Owner

@philipqnguyen yep, that's the correct number of threads GoodJob will need from the database and the minimum value you should safely have in your database.yml.
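For anyone repeating this arithmetic with their own queue string, here is a minimal Ruby sketch. `required_db_connections` and `OVERHEAD_THREADS` are hypothetical names, not GoodJob APIs, and the overhead of 3 (notifier, cron, executor) is taken from the breakdown in this thread rather than read from GoodJob itself.

```ruby
# Overhead threads as listed in this thread: notifier + cron + executor.
# This is an assumption drawn from the discussion, not a GoodJob constant.
OVERHEAD_THREADS = 3

# Hypothetical helper: minimum database connections for a GoodJob process
# running separately from the web process.
def required_db_connections(queues_string, default_max_threads)
  worker_threads = queues_string.split(";").sum do |pool|
    # A pool without ":<count>" (such as "*") falls back to the default.
    _queues, count = pool.split(":", 2)
    count ? Integer(count) : default_max_threads
  end
  worker_threads + OVERHEAD_THREADS
end

puts required_db_connections("high_priority:7;default:4;low_priority:2;*", 5) # => 21
```

With the queue string from the question, the worker pools account for 7 + 4 + 2 + 5 = 18 threads, and the 3 overhead threads bring the minimum to 21.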

I'm also recommending that people don't set the minimum, but rather just set something like 50 or 100 and not worry about it from the perspective of the application. You'll need to have those connections available from the database, but trying to set an exact minimum in database.yml isn't necessary and can lead to not having enough database connections available.

For example, if you're using Active Record load_async in any jobs to further parallelize work, you'll run out of database connections in Active Record's connection pool.

@philipqnguyen

Thank you for confirming @bensheldon
