Add an ActiveJob extension for throttling #315
Why not use activejob-traffic_control? https://github.com/nickelser/activejob-traffic_control I guess it adds a Redis/Memcached dependency, though.
@34code thanks for suggesting it. I'd like to build this into GoodJob because throttling seems like a useful function of concurrency control, and it could be powered by Postgres.
Maybe not directly related, but in case you want to look into your own throttling algorithm, schneems wrote a nice article about it, although it is not about job scheduling.
Overview:
Experiments and measurements:
@aried3r thank you for sharing that! It's helpful to have the strategies enumerated. The strategy I'm thinking of is a moving window, because I think that would be easiest to implement in Postgres.
And then the query would be like:

if GoodJob::Execution.where("created_at > ?", period.ago).count < limit
  enqueue_job # or perform it
end

I'm still stuck on how to do a better job of managing
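To make that concrete, here is a minimal sketch of the moving-window idea wired up as a before_perform hook; the THROTTLE_LIMIT/THROTTLE_PERIOD constants are illustrative, and in practice the count would need to be scoped (e.g. by job class or key) rather than covering the whole executions table:

```ruby
# Sketch only: a moving-window throttle checked in before_perform.
# THROTTLE_LIMIT and THROTTLE_PERIOD are illustrative constants, not GoodJob API.
class ThrottledJob < ApplicationJob
  THROTTLE_LIMIT = 10        # max executions allowed within the window
  THROTTLE_PERIOD = 1.minute # size of the moving window

  before_perform do |job|
    recent = GoodJob::Execution.where("created_at > ?", THROTTLE_PERIOD.ago).count
    if recent >= THROTTLE_LIMIT
      # Window is full: re-enqueue for later and skip this perform.
      job.retry_job(wait: THROTTLE_PERIOD)
      throw :abort
    end
  end

  def perform
    # rate-limited work
  end
end
```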
We already have a leaky-bucket rate limiter which is Postgres-based, and it works quite well. If we were to open-source it, could good_job take it on as a dependency?
@julik I have a high bar for pulling in additional dependencies, but I'm curious about your queries/algorithm.
It is described here http://live.julik.nl/2022/08/the-unreasonable-effectiveness-of-leaky-buckets and roughly does this:

def fillup(n_tokens)
  conn = ActiveRecord::Base.connection

  # Take double the time it takes the bucket to empty under normal circumstances
  # until the bucket may be deleted.
  may_be_deleted_after_seconds = (@capacity.to_f / @leak_rate.to_f) * 2.0

  # Create the leaky bucket if it does not exist, and update
  # to the new level, taking the leak rate into account - if the bucket exists.
  query_params = {
    name: @key,
    capa: @capacity.to_f,
    delete_after_s: may_be_deleted_after_seconds,
    leak_rate: @leak_rate.to_f,
    fillup: n_tokens.to_f
  }

  sql = ActiveRecord::Base.sanitize_sql_array([<<~SQL, query_params])
    INSERT INTO leaky_buckets AS t
      (name, atime, expires_at, level)
    VALUES
      (
        :name,
        clock_timestamp(),
        clock_timestamp() + ':delete_after_s second'::interval,
        LEAST(:capa, :fillup)
      )
    ON CONFLICT (name) DO UPDATE SET
      atime = EXCLUDED.atime,
      expires_at = EXCLUDED.expires_at,
      level = GREATEST(
        0.0, LEAST(
          :capa,
          t.level + :fillup - (EXTRACT(EPOCH FROM (EXCLUDED.atime - t.atime)) * :leak_rate)
        )
      )
    RETURNING level
  SQL

  # Note the use of .uncached here. The AR query cache will actually see our
  # query as a repeat (since we use "select_value" for the RETURNING bit) and will not
  # call into Postgres correctly, thus the clock_timestamp() value would be frozen
  # between calls. We don't want that here.
  # See https://stackoverflow.com/questions/73184531/why-would-postgres-clock-timestamp-freeze-inside-a-rails-unit-test
  level_after_fillup = conn.uncached { conn.select_value(sql) }

  State.new(level_after_fillup, (@capacity - level_after_fillup).abs < 0.01).tap do
    # Prune buckets which are no longer used. No "uncached" needed here since we are using "execute"
    conn.execute("DELETE FROM leaky_buckets WHERE expires_at < clock_timestamp()")
  end
end

We are not using it in
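As a purely hypothetical usage sketch of the fillup method quoted above: the LeakyBucket constructor arguments, the State#full? predicate, and the AcmeClient call are assumptions drawn from the snippet, not a published API:

```ruby
# Hypothetical: gate a job's work on the leaky bucket shown above.
class ApiCallJob < ApplicationJob
  def perform(payload)
    # Assumed constructor: key/capacity/leak_rate match the instance variables in fillup.
    bucket = LeakyBucket.new(key: "acme_api", capacity: 100.0, leak_rate: 10.0) # ~10 tokens/second leak
    state = bucket.fillup(1)

    if state.full?
      # At capacity: put the job back instead of hammering the API.
      retry_job(wait: 30.seconds)
    else
      AcmeClient.post(payload) # hypothetical API client
    end
  end
end
```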
Just wanted to add a tiny bit of input regarding the throttling: I often find myself in a situation where there are, say, 10 different APIs with 10 different limits for, say, historical data, and I often need to call all of them to download the data. I tend to do this via a "meta job" that loops through everything needed and schedules a "sub job" that actually does the data download. This sub job therefore calls the different APIs (with the different rate limits). In Sidekiq I ended up using multiple queues, but even that did not really fly, and I had to use one Sidekiq process per queue to fully isolate everything; only then did it really honor the limits imposed. I could use different jobs for the different limits, but this somehow feels wrong and is a slap in the face of the DRY principle. So it would really be cool if the throttling could be per queue instead of per job, I think(?). Thank you for considering this.
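One way a per-queue limit might fall out of the Concurrency extension's existing configurable key: the key: lambda below is the extension's real option, while the throttle: option is hypothetical here, sketching how a per-queue limit could be expressed if throttling were folded into that extension:

```ruby
# Sketch only: key: is how GoodJob's Concurrency extension already scopes its
# counts; throttle: is a hypothetical option illustrating a per-queue limit.
class DownloadJob < ApplicationJob
  include GoodJob::ActiveJobExtensions::Concurrency

  good_job_control_concurrency_with(
    key: -> { "throttle-#{queue_name}" }, # one counter per queue, shared by all sub-jobs
    throttle: [10, 1.minute]              # hypothetical: at most 10 performs per key per minute
  )

  def perform(provider, resource)
    # download `resource` from `provider`'s rate-limited API
  end
end
```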
AFAIK this happens with https://github.com/nickelser/activejob-traffic_control, too. I hope another solution can be found, because otherwise the db could get really polluted, for example in such a scenario:
This would create a massive number of Executions if every try creates a new record. Maybe you have some ideas about how to overcome this (probably first for the concurrency feature)? Maybe the
To cross-link efforts: there is some good code cooking over in #1198
For what it's worth, maybe this concern can be external to ActiveJob or to GoodJob. We have released our rate limiter as https://github.com/cheddar-me/pecorino and it is working very well. You could easily do something like this with Pecorino:

class AmazingJob < ApplicationJob
  around_perform do |job, block|
    t = Pecorino::Throttle.new(key: "#{job.class}_job_throttle", over_time: 1.minute, capacity: 1)
    t.request!
    block.call
  rescue Pecorino::Throttled => e
    job.set(wait: e.retry_after).perform_later
  end

  # ...
end

/cc @Marcovecchio
Given #1270, can this be closed?
Suggested on Reddit.
I think this could be done as a before_perform hook in which GoodJob queries the previous time-period for a count, and if the value is met/exceeded, then retries the job with an incremental backoff, similar to how the Concurrency extension does it:

good_job/lib/good_job/active_job_extensions/concurrency.rb, lines 44 to 46 in d365e3e
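A rough sketch of that before_perform check plus retry-with-incremental-backoff idea might look like the following; the error class, the limit, the window, and the wait formula are all placeholders rather than GoodJob's actual Concurrency extension code:

```ruby
# Hypothetical sketch of "check in before_perform, retry with incremental backoff".
module Throttleable
  extend ActiveSupport::Concern

  ThrottleExceededError = Class.new(StandardError)

  included do
    # Incremental backoff: each throttled attempt waits a little longer.
    retry_on ThrottleExceededError,
             attempts: Float::INFINITY,
             wait: ->(executions) { (executions * 15).seconds }

    before_perform do
      # Placeholder limit/window; a real version would scope by job class or key.
      recent = GoodJob::Execution.where("created_at > ?", 1.minute.ago).count
      raise ThrottleExceededError if recent >= 10
    end
  end
end
```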
I think this maybe should be part of the Concurrency extension so that they would re-use the same key. So maybe it would be: