-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add nonsigning rate threshold control to ejector #601
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
@@ -146,6 +146,13 @@ var ( | |||
Value: 6 * time.Minute, | |||
EnvVar: common.PrefixEnvVar(envVarPrefix, "TRANSACTION_TIMEOUT"), | |||
} | |||
NonsigningRateThresholdFlag = cli.IntFlag{ | |||
Name: common.PrefixFlag(FlagPrefix, "nonsigning-rate-threshold"), | |||
Usage: "only operators with nonsigning rate >= this threshold are eligible for ejection, this value must be in range [10, 100], any value not in this range means disabling this flag", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
disabling the flag means, eject based on regular SLA?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, no additional conditions to affect the standard SLA and ejection flow.
@@ -82,6 +84,11 @@ func (e *Ejector) Eject(ctx context.Context, nonsigningRate *OperatorsNonsigning | |||
|
|||
nonsigners := make([]*OperatorNonsigningPercentageMetrics, 0) | |||
for _, metric := range nonsigningRate.Data { | |||
// If nonsigningRateThreshold is set and valid, we will only eject operators with | |||
// nonsigning rate >= nonsigningRateThreshold. | |||
if e.nonsigningRateThreshold >= 10 && e.nonsigningRateThreshold <= 100 && metric.Percentage < float64(e.nonsigningRateThreshold) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if e.nonsigningRateThreshold is 9, then regular SLA is enforced.
then if e.nonsigningRateThreshold is 50, then we enable the relaxed ejection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In first case: e.nonsigningRateThreshold will have no effect, it will just perform regular ejection
In second case: it'll eject an operator if 1) the operator violates SLA; AND 2) the operator's nonsigning rate >= 50%
Note: when it's regular ejection, only condition 1) is used
Why are these changes needed?
Allow gradual rollout of ejection.
Tested: locally
Checks