Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Scheduled Actions V2] WIP: top-level Scheduler state machine #6904

Draft
wants to merge 1 commit into
base: sched2_common
Choose a base branch
from

Conversation

lina-temporal
Copy link
Contributor

Based on #6903

What changed?

  • Added the WIP top-level Scheduler state machine

Why?

  • Explained in the code :)

How did you test it?

  • I didn't, needs testing (draft PR)

Potential risks

Documentation

Is hotfix candidate?

)

type (
// The top-level scheduler state machine is compromised of 3 substate machines:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
// The top-level scheduler state machine is compromised of 3 substate machines:
// The top-level scheduler state machine is compromised of 3 sub state machines:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will fix here/elsewhere

// - Executor: executes buffered actions
// - Backfiller: buffers actions according to requested backfills
//
// A running scheduler will always have exactly one of each of the above substate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is this space intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope, will fix that.

}
s2, ok := a.(Scheduler)
if !ok {
return 0, fmt.Errorf("%w: expected state1 to be a Scheduler instance, got %v", hsm.ErrIncompatibleType, s2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return 0, fmt.Errorf("%w: expected state1 to be a Scheduler instance, got %v", hsm.ErrIncompatibleType, s2)
return 0, fmt.Errorf("%w: expected state2 to be a Scheduler instance, got %v", hsm.ErrIncompatibleType, s2)

Comment on lines +107 to +111
if s1.State() > s2.State() {
return 1, nil
} else if s1.State() < s2.State() {
return -1, nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is used to determine whether to sync a state machine across clusters in the soon-to-be-deprecated replication stack.
The way you've implemented this will prevent updates to the scheduler from being synced. You may be able to use the conflict token instead but that has some unwanted implications when there's a split brain.

Hopefully state based replication will be ready before the scheduler work and you won't have to implement this at all.
For now, I'd start with an implementation that panics and put a TODO here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I'd start with an implementation that panics and put a TODO here.

Yep, will do.

//
// When decrement is true, the schedule's state's `RemainingActions` counter
// is decremented and the conflict token is bumped.
func (s Scheduler) CanTakeScheduledAction(decrement bool) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rename this to UseScheduledAction instead of having a side effect in a method named "Can...".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 will rename

if s.Schedule.State.RemainingActions > 0 {
if decrement {
s.Schedule.State.RemainingActions--
s.ConflictToken++
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to increment the conflict token here, it can fail a user update request for no good reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, will remove.


func (s Scheduler) CompiledSpec(specBuilder *scheduler.SpecBuilder) (*scheduler.CompiledSpec, error) {
// cache compiled spec
if s.compiledSpec == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the conflict token number to know if your cache may be invalid instead of worrying about manual invalidation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't for manual invalidation, it's to support lazily loading the compiled spec after deserializing the state machine. Ideally, I'd fill this in at Deserialize time, but that would require I have access to the SpecBuilder (which itself caches time zone files from disk).. I suppose I could wire in SpecBuilder onto the machineDefinition, but I think I remember you mentioning that CHASM automates the Deserialize/Serialize bits away from components, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the spec is updated later, don't you need to recompile it?

CHASM automates the Deserialize/Serialize bits away from components

Yes, which means you're right about not wanting to put this in deserialize, but also for the reason I gave above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the spec is updated later, don't you need to recompile it?

Yeah, you do; I see your point, will update.

@@ -0,0 +1,171 @@
package core
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't create this extra package, just put everything in components/scheduler, it'll help you avoid circular dependencies later and allow you to not expose anything that's not strictly necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason it's separated out is because the substate machines would have a lot of conflicts both between boilerplate (generator.MachineKey/backfiller.MachineKey -> GeneratorMachineKey/BackfillerMachineKey) as well as conflict between states/events (generator.TransitionSleep/executor.TransitionSleep -> TransitionGeneratorSleep/TransitionExecutorSleep). It also helps avoid creating circular dependencies and tight coupling between components.

I can be convinced the other way if it's going to be the standard for HSM projects, to keep all within a single package, but if there is a component that were to be fitting for multiple component packages, I think it'd be this one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... will leave it up to you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants