Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unique constraint violation messages in postgres error logs #1536

Open
betmenslido opened this issue Nov 7, 2024 · 6 comments
Open

Unique constraint violation messages in postgres error logs #1536

betmenslido opened this issue Nov 7, 2024 · 6 comments

Comments

@betmenslido
Copy link

betmenslido commented Nov 7, 2024

Hello and thanks for your great work!

We are using latest GJ (multiple workers) and Googles cloudSQL postgres with it, and we got steady stream of unique constraint violation errors:

log: cloudsql.googleapis.com/postgres.log
error: duplicate key value violates unique constraint "index_good_jobs_on_cron_key_and_cron_at_cond"

We get around 15 of those every 10 minutes. Nothing seems out of order when looking at GJ logs.

Would like to confirm that this is by design and nothing to worry about.
Thank you!

@Earlopain
Copy link
Collaborator

Earlopain commented Nov 7, 2024

Hi! This is expected. Here is a previous issue about this: #1230

good_job relies on database constraints to ensure that cron jobs are only executed once. When you have multiple workers running, they will all attempt to enqueue the cron job.
Only one will succeed, the others will fail because of the constraints. good_job rescues the exception from ActiveRecord but postgres will log it regardless.

Hope that makes sense. This is also briefly documented here: https://github.com/bensheldon/good_job?tab=readme-ov-file#cron-style-repeatingrecurring-jobs

We can keep this issue open to better document this in the readme.

@ccouton
Copy link
Contributor

ccouton commented Nov 14, 2024

Hi, I have the same issue even for one process. After deep diving I think I found the potential reason. Here https://github.com/bensheldon/good_job/blob/main/lib/good_job/cron_manager.rb#L88 the delay can return a too short time and so the task execution is just before the next_at and so the next delay calculation can return something close to 0.01s ?

@ccouton
Copy link
Contributor

ccouton commented Nov 14, 2024

FYI with this simple test :

require 'concurrent-ruby'
require 'time'
require 'fugit'

class TestFugit

  def initialize
    @fugit = Fugit.parse('*/1 * * * *')
  end

  def create_task
    cron_at = @fugit.next_time.to_t
    puts Time.now.to_f
    delay = [(cron_at - Time.now).to_f, 0].max
    puts "delay = #{delay}"

    future = Concurrent::ScheduledTask.new(delay, args: [self]) do |tf|
      tf.create_task
      puts "Exec task #{Time.now.to_f}"
    end

    future.execute
  end
end
TestFugit.new.create_task

sleep 1000

I reproduce on my fly.io server

1731617520.0011902
delay = 59.998755028
Exec task 1731617520.001527
1731617579.9996374
delay = 0.000323485
Exec task 1731617579.9999614
1731617580.0002928
delay = 59.999671595
Exec task 1731617580.0005002

delay is 0

Didn't reproduced it on my mac, in docker on my mac, in docker on aws ec2, so it's linked to the fly.io virtualisation I think.

@bensheldon
Copy link
Owner

@ccouton oh! I think I know what's happening in your example 🤝

Concurrent Ruby has a magic number (0.01) such that if you try to schedule a task with a delay less than 0.01 seconds, it will just execute the task immediately:

https://github.com/ruby-concurrency/concurrent-ruby/blob/56227a4c3ebdd53b8b0976eb8296ceb7a093496f/lib/concurrent-ruby/concurrent/executor/timer_set.rb#L97

And also maybe there is a little bit of clock drift or something too.

We should probably check that the time both actually >= to the expected cron_at before trying to execute the task and also ensure that we don't delay for less-than 0.01s.

Would you want to try a PR for that? Otherwise I can get to it.

@ccouton
Copy link
Contributor

ccouton commented Nov 15, 2024

will do it today

@ccouton
Copy link
Contributor

ccouton commented Nov 15, 2024

⬆️ is it something acceptable ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Inbox
Development

No branches or pull requests

4 participants