Fix cron double-enqueue because delay close to 0.01 and possibly clock-drift #1543

ccouton · 2024-11-15T15:31:11Z

Fixes one problem in #1536

Case of clock drift and delay calculation result close to 0.001 so task is executed immediately which causes an unnecessary duplicate key value violates unique constraint-error even on single-process installations.

ccouton · 2024-11-15T16:43:47Z

@bensheldon thanks for your help.
If we have 0.02 it will be the same no? it will execute immediately after ?

bensheldon · 2024-11-15T16:47:20Z

@ccouton doublecheck my math, but I believe this line should prevent Concurrent Ruby from ever immediately running the task:

https://github.com/bensheldon/good_job/pull/1543/files#diff-66b4fe9e81a6b387c63fb9fbea11ffd29405422be569fee3e4e125e647b820e1R92-R94

ccouton · 2024-11-15T16:49:30Z

Yes but it will be scheludled 0.02s after so a risk to be before the next at and so at the same of the previous at

bensheldon · 2024-11-15T16:55:16Z

oh right! we do want to run it immediately if it is in fact in the past. I'll tweak that.

ccouton · 2024-11-15T17:09:09Z

Sorry i'm not clear enough, we have the case when clock drift cause an execution just before the real next scheduled at. Here i think this issue is not fixed. And maybe not easy to fix. We have ti execute it immeditaly if it's the first execution but if it's an other if cron_at is leather than current for me there is no reason to execute it?

bensheldon · 2024-11-15T17:25:57Z

@ccouton this should (theoretically, assuming I did it right) prevent the duplicate key value violates unique constraint error you shared in the previous example. The duplicate key value/column uses the scheduled/intended time (so on 60.0) rather than on the actual time (maybe 60.0000023). So if it is scheduled to run at 60.0 but it actually runs at 61.000035 and the next run is at 61.0 you will not get a duplicate key error.

I don't have a strategy for preventing the case where some delay causes a previously scheduled time to run at about the same time as the next scheduled time. I would say that if your schedule is at such a cadence that you frequently run into that (like every second), cron is maybe the wrong tool for that and it would be better to manage your own sleep loop.

ccouton · 2024-11-15T21:21:26Z

maybe we can compare at the entry of the scheduled task that the thr_cron_at and previously_at are not equal ? and if it's the case we don't enqueue and reschedule the task again? wdyt?

ccouton · 2024-11-15T21:52:03Z

I tested it locally in my test.rb and it solves the issue

  def create_task(previous_cron_at = nil)
    cron_at = @fugit.next_time.to_t
    puts Time.now.to_f
    delay = [(cron_at - Time.now).to_f, 0].max
    puts "delay = #{delay}"

    future = Concurrent::ScheduledTask.new(delay, args: [self, cron_at, previous_cron_at]) do |tf, cron_at, previous_cron_at|
      puts "previous_cron_at / cron_at (#{previous_cron_at} / #{cron_at})"
      if previous_cron_at == cron_at
        puts "!!!! previous_cron_at == cron_at !!!! (#{previous_cron_at} #{cron_at})"
        tf.create_task(previous_cron_at)
      else
        tf.create_task(cron_at)
        puts "Exec task #{Time.now.to_f}"
      end
    end

    future.execute
  end

bensheldon · 2024-11-16T14:39:53Z

I don't like guarding against previous_cron_at == cron_at because that's not the same thing as accounting for Concurrent Ruby's < 0.1 second execution behavior or for clock drift. And it's a behavior change: it's possible for someone to use a Proc instead of Fugit and while I think it's wild to provide the same time twice, I don't want to change the contract around what runs.

I wanted to ask, do you not like what I've proposed:

Because it doesn't work?
Or because it's too complex/ugly/confusing?

This is the version of your script with my strategy (though I have still haven't been able to reproduce the behavior you're seeing when I run the script):

require 'concurrent-ruby'
require 'time'
require 'fugit'

class TestFugit
  def initialize
    @fugit = Fugit.parse('*/1 * * * *')
  end

  def create_task(at: nil)
    cron_at = at || @fugit.next_time.to_t

    puts Time.now.to_f
    delay = cron_at <= Time.now ? 0.0 : [(cron_at - Time.now).to_f, 0.02].max
    puts "delay = #{delay}"

    future = Concurrent::ScheduledTask.new(delay, args: [self, cron_at]) do |tf, thr_cron_at|
      if thr_cron_at > Time.now
        puts "Too soon, rescheduling"
        tf.create_task(at: thr_cron_at)
      else
        tf.create_task
        puts "Exec task #{Time.now.to_f}"
      end
    end

    future.execute
  end
end
TestFugit.new.create_task

sleep 1000

ccouton · 2024-11-18T20:11:26Z

seems working ok for that solution

ccouton · 2024-11-20T07:09:01Z

when you will merge it?

ccouton and others added 2 commits November 15, 2024 16:30

fix(good_job-1536): case of clock drift and delay close to 0.001

59b43aa

Refactor to pass state through create_task instead of via ivar

ba9c407

bensheldon changed the title ~~fix(good_job-1536): case of clock drift and delay close to 0.001~~ Fix cron double-enqueue because of clock drift and delay close to 0.001 Nov 15, 2024

Fix yard typo

248e60b

bensheldon changed the title ~~Fix cron double-enqueue because of clock drift and delay close to 0.001~~ Fix cron double-enqueue because of clock drift and delay close to 0.01 Nov 15, 2024

bensheldon changed the title ~~Fix cron double-enqueue because of clock drift and delay close to 0.01~~ Fix cron double-enqueue because delay close to 0.01 and possibly clock-drift Nov 15, 2024

When should have run in the past, run immediately

fdd0f4e

bensheldon added the bug Something isn't working label Nov 21, 2024

bensheldon merged commit c102011 into bensheldon:main Nov 21, 2024
25 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix cron double-enqueue because delay close to 0.01 and possibly clock-drift #1543

Fix cron double-enqueue because delay close to 0.01 and possibly clock-drift #1543

ccouton commented Nov 15, 2024 •

edited by bensheldon

Loading

ccouton commented Nov 15, 2024

bensheldon commented Nov 15, 2024

ccouton commented Nov 15, 2024

bensheldon commented Nov 15, 2024

ccouton commented Nov 15, 2024

bensheldon commented Nov 15, 2024 •

edited

Loading

ccouton commented Nov 15, 2024

ccouton commented Nov 15, 2024 •

edited

Loading

bensheldon commented Nov 16, 2024

ccouton commented Nov 18, 2024

ccouton commented Nov 20, 2024

Fix cron double-enqueue because delay close to 0.01 and possibly clock-drift #1543

Fix cron double-enqueue because delay close to 0.01 and possibly clock-drift #1543

Conversation

ccouton commented Nov 15, 2024 • edited by bensheldon Loading

ccouton commented Nov 15, 2024

bensheldon commented Nov 15, 2024

ccouton commented Nov 15, 2024

bensheldon commented Nov 15, 2024

ccouton commented Nov 15, 2024

bensheldon commented Nov 15, 2024 • edited Loading

ccouton commented Nov 15, 2024

ccouton commented Nov 15, 2024 • edited Loading

bensheldon commented Nov 16, 2024

ccouton commented Nov 18, 2024

ccouton commented Nov 20, 2024

ccouton commented Nov 15, 2024 •

edited by bensheldon

Loading

bensheldon commented Nov 15, 2024 •

edited

Loading

ccouton commented Nov 15, 2024 •

edited

Loading