Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RS-2297] Run dashboard-installer on operator update #3759

Closed

Conversation

Dean-Coakley
Copy link
Contributor

@Dean-Coakley Dean-Coakley commented Feb 5, 2025

Description

https://tigera.atlassian.net/browse/RS-2297 - Honeypods dashboard remains in Kibana on upgrade from v3.19

  • Make the operator generate a new job name per operator version. This will ensure a new dashboard-installer Job runs so that new Kibana dashboards are installed and deprecated ones are removed. Works in combination with https://github.com/tigera/calico-private/pull/8613

For PR author

  • Tests for change.
  • If changing pkg/apis/, run make gen-files
  • If changing versions, run make gen-versions

For PR reviewers

A note for code reviewers - all pull requests must have the following:

  • Milestone set according to targeted release.
  • Appropriate labels:
    • kind/bug if this is a bugfix.
    • kind/enhancement if this is a a new feature.
    • enterprise if this PR applies to Calico Enterprise only.

Copy link
Contributor

@gantony gantony left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -8,7 +8,7 @@
"spec": {
"order": 1,
"tier": "allow-tigera",
"selector": "job-name == 'dashboards-installer'",
"selector": "job-name contains 'dashboards-installer'",
Copy link
Member

@caseydavenport caseydavenport Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be using a different selector rather than the "contains" operator here. There's no reason we need to use the job-name label AFAIK - we could have role: dashboard-installer applied regardless of the actual name, for example.

My concern is just that contains is a pretty unusual operation, and given there's no concrete need for it here it makes this policy less intuitive.

@@ -49,6 +50,18 @@ var (
PolicyName = networkpolicy.TigeraComponentPolicyPrefix + Name
)

// GetJobName makes a unique job name per operator version.
// For dev images it takes the first 32 chars.
func GetJobName() string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're switching to this, we should stop exporting the Name variable so that nobody accidentally uses the wrong value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Name isn't actually used except for within this function, we should probably just scope it to this function to be extra sure (i.e., remove the package level variable)

func GetJobName() string {
operatorVersion := strings.ReplaceAll(version.VERSION, ".", "-")

if len(operatorVersion) >= 32 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is "32" in this context? Why do we need to truncate? I think we should have a comment explaining why we do this - it's not obvious!

Copy link
Contributor Author

@Dean-Coakley Dean-Coakley Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think without truncation the job name ended up being 80+ characters and I think Job names are limited to 63.

Good idea to explain in a comment.

Maybe I can also include an example in the comment, for example:
v1.36.10-1.dev-281-g9558bdac82be-2025-01-28-master-treachery is truncated to be v1.36.10-1.dev-281-g9558bdac82be

Copy link
Member

@rene-dekker rene-dekker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought changing the job would mean that the operator would delete the job object and recreate it, resulting in a new run. Am I missing something?
See

case *batchv1.Job:

I think we should find out why that logic was not sufficient in re-running the job instead.

@Dean-Coakley
Copy link
Contributor Author

Closing.

Opening: #3768 instead

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants