Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Tolerations to Build and BuildRun objects #1711

Merged
merged 3 commits into from
Feb 5, 2025

Conversation

dorzel
Copy link
Contributor

@dorzel dorzel commented Oct 30, 2024

Changes

Fixes #1636

Submitter Checklist

  • Includes tests if functionality changed/was added
  • Includes docs if changes are user-facing
  • Set a kind label on this PR
  • Release notes block has been filled in, or marked NONE

See the contributor guide
for details on coding conventions, github and prow interactions, and the code review process.

Release Notes

Add Tolerations to Build and BuildRun objects

@openshift-ci openshift-ci bot added the release-note Label for when a PR has specified a release note label Oct 30, 2024
@pull-request-size pull-request-size bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 30, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 30, 2024
@dorzel dorzel force-pushed the MULTIARCH-5036 branch 8 times, most recently from 872db31 to 462d9bb Compare November 6, 2024 20:22
@dorzel dorzel force-pushed the MULTIARCH-5036 branch 3 times, most recently from 4ecfc21 to dfe25d5 Compare November 13, 2024 18:59
@dorzel dorzel force-pushed the MULTIARCH-5036 branch 3 times, most recently from e449fbd to 03b3b21 Compare November 19, 2024 18:13
@pull-request-size pull-request-size bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 19, 2024
@dorzel dorzel force-pushed the MULTIARCH-5036 branch 8 times, most recently from 3e66b55 to 43382a6 Compare November 20, 2024 22:06
@dorzel dorzel marked this pull request as ready for review November 21, 2024 17:28
@dorzel dorzel force-pushed the MULTIARCH-5036 branch 10 times, most recently from 090570c to b8e487a Compare January 29, 2025 23:05
Copy link
Member

@SaschaSchwarze0 SaschaSchwarze0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice that tests now succeed. I still have one more homework for you. Sorry that I missed that earlier.

Comment on lines 43 to 46
// In this case, fields set only on the BuildRun object do not get validated as they are not copied to the transient Build resource.
// explicitly setting them here is required for validation to happen.
build.Spec.NodeSelector = buildRun.Spec.NodeSelector
build.Spec.Tolerations = buildRun.Spec.Tolerations
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not correct and reminds me of something that I missed earlier (also for nodeSelector). Users can define these things both in the inline build spec as well as the overrides on the buildrun spec.

Two things:

  1. We should validate the buildrun spec fields somewhere near here (https://github.com/shipwright-io/build/blob/v0.14.0/pkg/reconciler/buildrun/buildrun.go#L276-L292) and not copy it over to a build spec to perform a validation.
  2. When an inline build spec is used, we validate that certain fields are not set on the buildrun spec because it does not make sense to have them specified in two places. That would be suitable for both nodeSelector and tolerations as well. The code that validates this is here: https://github.com/shipwright-io/build/blob/v0.14.0/pkg/validate/validate.go#L106-L136. Basically, when somebody creates a standalone BuildRun (= with an inline Build spec), we expect that BuildRun to not also set those fields (params, volumes, timeout etc) on the BuildRun spec itself.

Can you adopt this please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so if I understand it seems that specifically in the inline build spec case we expect that certain fields are only specified on that build spec instead of having, for example, the "BuildRun Tolerations take preference over Build Tolerations" behavior (overriding). I'll implement these.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SaschaSchwarze0 Ok, let me know how that looks. Tests are good except for an unrelated flaky e2e test.

@dorzel dorzel force-pushed the MULTIARCH-5036 branch 2 times, most recently from fa80d86 to b0fee7d Compare February 3, 2025 22:06
@dorzel
Copy link
Contributor Author

dorzel commented Feb 4, 2025

/retest

Copy link
Contributor

openshift-ci bot commented Feb 4, 2025

@dorzel: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copy link
Member

@adambkaplan adambkaplan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

In general this looks really good! I found a few opportunities for improvement - either through refactoring or adding additional tests to improve code coverage. That said, I would not block merge/lgtm on these items, they can all be addressed in a follow-up pull request.

Comment on lines +32 to +58
for _, toleration := range b.Build.Spec.Tolerations {
// validate Key
if errs := validation.IsQualifiedName(toleration.Key); errs != nil {
b.Build.Status.Reason = ptr.To(build.TolerationNotValid)
b.Build.Status.Message = ptr.To(fmt.Sprintf("Toleration key not valid: %v", strings.Join(errs, ", ")))
}
// validate Operator
if !((toleration.Operator == v1.TolerationOpExists) || (toleration.Operator == v1.TolerationOpEqual)) {
b.Build.Status.Reason = ptr.To(build.TolerationNotValid)
b.Build.Status.Message = ptr.To(fmt.Sprintf("Toleration operator not valid. Must be one of: '%v', '%v'", v1.TolerationOpExists, v1.TolerationOpEqual))
}
// validate Value
if errs := validation.IsValidLabelValue(toleration.Value); errs != nil {
b.Build.Status.Reason = ptr.To(build.TolerationNotValid)
b.Build.Status.Message = ptr.To(fmt.Sprintf("Toleration value not valid: %v", strings.Join(errs, ", ")))
}
// validate Taint Effect, of which only "NoSchedule" is supported
if !((toleration.Effect) == "" || (toleration.Effect == v1.TaintEffectNoSchedule)) {
b.Build.Status.Reason = ptr.To(build.TolerationNotValid)
b.Build.Status.Message = ptr.To(fmt.Sprintf("Only the '%v' toleration effect is supported.", v1.TaintEffectNoSchedule))
}
// validate TolerationSeconds, which should not be specified
if toleration.TolerationSeconds != nil {
b.Build.Status.Reason = ptr.To(build.TolerationNotValid)
b.Build.Status.Message = ptr.To("Specifying TolerationSeconds is not supported.")
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we simplify this a bit and re-use the logic in BuildRunTolerations?

Suggested change
for _, toleration := range b.Build.Spec.Tolerations {
// validate Key
if errs := validation.IsQualifiedName(toleration.Key); errs != nil {
b.Build.Status.Reason = ptr.To(build.TolerationNotValid)
b.Build.Status.Message = ptr.To(fmt.Sprintf("Toleration key not valid: %v", strings.Join(errs, ", ")))
}
// validate Operator
if !((toleration.Operator == v1.TolerationOpExists) || (toleration.Operator == v1.TolerationOpEqual)) {
b.Build.Status.Reason = ptr.To(build.TolerationNotValid)
b.Build.Status.Message = ptr.To(fmt.Sprintf("Toleration operator not valid. Must be one of: '%v', '%v'", v1.TolerationOpExists, v1.TolerationOpEqual))
}
// validate Value
if errs := validation.IsValidLabelValue(toleration.Value); errs != nil {
b.Build.Status.Reason = ptr.To(build.TolerationNotValid)
b.Build.Status.Message = ptr.To(fmt.Sprintf("Toleration value not valid: %v", strings.Join(errs, ", ")))
}
// validate Taint Effect, of which only "NoSchedule" is supported
if !((toleration.Effect) == "" || (toleration.Effect == v1.TaintEffectNoSchedule)) {
b.Build.Status.Reason = ptr.To(build.TolerationNotValid)
b.Build.Status.Message = ptr.To(fmt.Sprintf("Only the '%v' toleration effect is supported.", v1.TaintEffectNoSchedule))
}
// validate TolerationSeconds, which should not be specified
if toleration.TolerationSeconds != nil {
b.Build.Status.Reason = ptr.To(build.TolerationNotValid)
b.Build.Status.Message = ptr.To("Specifying TolerationSeconds is not supported.")
}
}
ok, reason, msg := BuildRunTolerations(b.Build.Spec.Tolerations)
if !ok {
b.Build.Status.Reason = reason
b.Build.Status.Message = msg
}
return nil

@@ -15,6 +16,9 @@ import (

buildv1beta1 "github.com/shipwright-io/build/pkg/apis/build/v1beta1"
shpgit "github.com/shipwright-io/build/pkg/git"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

corev1 "k8s.io/api/core/v1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit (not blocking): keep k8s.io imports grouped together above.

@@ -75,7 +75,8 @@ The `BuildRun` definition supports the following fields:
- `spec.output.timestamp` - Overrides the output timestamp configuration of the referenced build to instruct the build to change the output image creation timestamp to the specified value. When omitted, the respective build strategy tool defines the output image timestamp.
- `spec.output.vulnerabilityScan` - Overrides the output vulnerabilityScan configuration of the referenced build to run the vulnerability scan for the generated image.
- `spec.env` - Specifies additional environment variables that should be passed to the build container. Overrides any environment variables that are specified in the `Build` resource. The available variables depend on the tool used by the chosen build strategy.
- `spec.nodeSelector` - Specifies a selector which must match a node's labels for the build pod to be scheduled on that node.
- `spec.nodeSelector` - Specifies a selector which must match a node's labels for the build pod to be scheduled on that node. If nodeSelectors are specified in both a `Build` and `BuildRun`, `BuildRun` values take precedence.
- `spec.tolerations` - Specifies the tolerations for the build pod. Only `key`, `value`, and `operator` are supported. Only `NoSchedule` taint `effect` is supported. If tolerations are specified in both a `Build` and `BuildRun`, `BuildRun` values take precedence.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 on the reference doc!

}
}

return nil
}

// BuildRunNodeSelector is used to validate nodeSelectors in the BuildRun object
func BuildRunNodeSelector(nodeSelector map[string]string) (bool, string, string) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here - there is a refactoring opportunity to consolidate the logic into a single function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not blocking - if possible, I'd love to have unit-style Ginkgo tests that cover all the possible validation exceptions.

This can be done in a follow-up pull request, as we currently don't have similar levels of unit test coverage for the node selector validations. We have test coverage elsewhere in this PR.

Comment on lines +141 to +149
if len(buildRun.Spec.NodeSelector) > 0 {
return resources.BuildRunBuildFieldOverrideForbidden,
"cannot use 'nodeSelector' override and 'buildSpec' simultaneously"
}

if len(buildRun.Spec.Tolerations) > 0 {
return resources.BuildRunBuildFieldOverrideForbidden,
"cannot use 'tolerations' override and 'buildSpec' simultaneously"
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(not-blocking): would like to have unit test coverage here.

Copy link
Contributor

openshift-ci bot commented Feb 4, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adambkaplan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 4, 2025
@dorzel
Copy link
Contributor Author

dorzel commented Feb 5, 2025

@adambkaplan @SaschaSchwarze0 Is this ok for merge? I'll plan on addressing the comments above in #1770

Copy link
Member

@SaschaSchwarze0 SaschaSchwarze0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 5, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit 4a600ad into shipwright-io:main Feb 5, 2025
19 checks passed
@dorzel dorzel deleted the MULTIARCH-5036 branch February 5, 2025 19:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. release-note Label for when a PR has specified a release note size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

SHIP-0039: Allow tolerations on Build and BuildRun to be set
3 participants