`RunTask` does not properly remove / update the executed task #51

acoulton · 2021-08-30T14:09:49Z

Per documentation for RunTask
https://cloud.google.com/tasks/docs/reference/rpc/google.cloud.tasks.v2#google.cloud.tasks.v2.CloudTasks.RunTask

If Cloud Tasks receives a successful response from the task's target, then the task will be deleted; otherwise the task's
schedule_time will be reset to the time that RunTask was called plus the retry delay specified in the queue's RetryConfig.

In other words, it essentially internally works the same as if you'd set ScheduleTime to now (albeit that it also bypasses rate limits / queue pause state / etc).

However the emulator is treating RunTask as a separate execution - the task will still run at the original scheduled time. Unfortunately, it is immediately removed from ListTasks / GetTasks / etc, so it's not possible to see this is happening or cancel
the task manually. This causes odd race conditions in testing (especially if coupled with delayed schedule times) as the queue looks empty but you get an unexpected duplicate delivery some time later.

acoulton · 2021-08-30T14:57:47Z

@aertje this is the last thing on my list for the moment, which we found after tracing some odd race conditions in our end-to-end test suites (our app code queues deferred tasks, then we had our tests calling RunTasks to allow us to move on to assertions, but that was causing a duplicate execution a while later during an unrelated test).

Obviously in practice this really only happens coupled with a delayed ScheduleTime because in other cases the task fires faster than you can call RunTask anyway. In theory it could also happen if you were using RunTask to force a task that was delayed by queue rate limits, but I suspect that's unlikely when using the emulator.

As with the others this is based on the earlier work so the diff for this branch alone is https://github.com/aertje/cloud-tasks-emulator/pull/51/files/0519806882ebabbd9e7da18fedb5862f20863ad0..9762c38728f6fd8679e207854e984f58c2c7c64d and I'll update / rebase as required.

Per documentation for RunTask https://cloud.google.com/tasks/docs/reference/rpc/google.cloud.tasks.v2#google.cloud.tasks.v2.CloudTasks.RunTask > If Cloud Tasks receives a successful response from the task's > target, then the task will be deleted; otherwise the task's > schedule_time will be reset to the time that RunTask was called > plus the retry delay specified in the queue's RetryConfig. In other words, it essentially internally works the same as if you'd set `ScheduleTime` to `now` (albeit that it also bypasses rate limits / queue pause state / etc). However the emulator is treating RunTask as a separate execution - the task will still run at the original scheduled time, albeit it immediately disappears from ListTasks / GetTasks. And additionally, the emulator is not retrying if a task triggered by RunTasks fails.

Makes the behaviour work the same as in production. I considered writing to the `cancel` channel instead of the boolean flag, but that would have triggered a second `task.onDone()` call which might have produced unexpected side-effects.

This was previously only used to trigger the no-retry behaviour for `RunTask`, and is therefore now always true. Tasks will always be considered for retry after a failure response. They are only not retried if the queue retry config prevents it e.g. if they have reached the max retries.

acoulton · 2021-09-13T08:45:36Z

@aertje and this is also now good to go, there's no dependency / conflict with #50 so both should be mergeable.

aertje

Hi @acoulton; I'm not quite sure how this resolves the underlying problem, apart from hiding the symptoms a bit more. Isn't a lock around the whole of Run and Attempt needed to ensure only the scheduled or runtask run is allowed to execute?

aertje · 2021-10-10T04:21:31Z

task.go

+	task.stateMutex.Lock()
+	defer task.stateMutex.Unlock()
+	task.isDone = true
+	task.onDone(task)


Why does this need to be included inside the lock?

aertje · 2021-10-10T04:52:56Z

task.go

-			task.queue.fire <- task
+			// It's possible the task might already be `done` if e.g. it was called by RunTask before the originally
+			// scheduled execution time. In that case this is a no-op, otherwise, run the task.
+			if !task.isDone {


I don't quite get how this check avoids that problem - what if runtask is called at exactly this point?

acoulton · 2021-10-10T07:43:38Z

Hi @acoulton; I'm not quite sure how this resolves the underlying problem, apart from hiding the symptoms a bit more. Isn't a lock around the whole of Run and Attempt needed to ensure only the scheduled or runtask run is allowed to execute?

That's a fair question, however I think that the production behaviour if the calls are actually concurrent is undefined / at-least-once. The docs say (my emphasis):

Forces a task to run now.
When this method is called, Cloud Tasks will dispatch the task, even if the task is already running, the queue has reached its RateLimits or is PAUSED.

The docs don't specify what result is stored if there are concurrent executions, my hunch is it will be removed if either is successful, or the last to complete if failure.

Equally, the major issue at the moment is that tasks run several minutes later (despite not being visible in ListTasks). This PR reliably solves that AFAIK.

There is still a small risk of duplicate execution at the exact time of the RunTask call, but IMO that is probably acceptable, since CloudTasks doesn't guarantee exactly-once delivery anyway so apps should already be able to cope with a duplicate despatch. The issue we have in CI is that by the time the originally scheduled task fires our suite has moved on so the database is no longer in valid state to handle that duplicate.

Let me see if I can do a bit more testing of exactly what production does for concurrent cases, and then I'll come back to you.

acoulton marked this pull request as ready for review August 30, 2021 14:51

acoulton added 3 commits September 13, 2021 09:39

acoulton force-pushed the bug-run-task-not-clearing-tasks branch from 9762c38 to 0453b91 Compare September 13, 2021 08:42

aertje reviewed Oct 10, 2021

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`RunTask` does not properly remove / update the executed task #51

`RunTask` does not properly remove / update the executed task #51

acoulton commented Aug 30, 2021

acoulton commented Aug 30, 2021

acoulton commented Sep 13, 2021

aertje left a comment

aertje Oct 10, 2021

aertje Oct 10, 2021

acoulton commented Oct 10, 2021

RunTask does not properly remove / update the executed task #51

Are you sure you want to change the base?

RunTask does not properly remove / update the executed task #51

Conversation

acoulton commented Aug 30, 2021

acoulton commented Aug 30, 2021

acoulton commented Sep 13, 2021

aertje left a comment

Choose a reason for hiding this comment

aertje Oct 10, 2021

Choose a reason for hiding this comment

aertje Oct 10, 2021

Choose a reason for hiding this comment

acoulton commented Oct 10, 2021

`RunTask` does not properly remove / update the executed task #51

`RunTask` does not properly remove / update the executed task #51