Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: new template counting and phase counting metrics #13275

Merged
merged 2 commits into from
Aug 23, 2024

Conversation

Joibel
Copy link
Member

@Joibel Joibel commented Jun 28, 2024

From #12589.

New metric total_count which is like the old count metric and the new gauge metric, but a counter, not a gauge. The gauge shows a snapshot of what is happening right now in the cluster, the counter can answer questions like how many Failed workflows have there been in the last 24 hours.

Two further metrics for counting uses of WorkflowTemplates via workflowTemplateRef only. These store the name of the WorkflowTemplate or ClusterWorkflowTemplate if the cluster_scope label is true, and the namespace where it is used. workflowtemplate_triggered_total counts the number of uses. workflowtemplate_runtime records how long
each phase the workflow running the template spent in seconds.

Note to reviewers: this is now a standlone commit

@agilgur5
Copy link

Note to reviewers: this is part of a stack of reviews for metrics changes. Please don't merge until the rest of the stack is also ready.

@Joibel since these ones aren't draft, does that mean they're ready to review but not ready to merge?

What we can do to indicate that is to add a "[DO NOT MERGE] " prefix to the title of each of these -- since that will fail the PR title check, it won't be mergeable and is a pretty big indicator too. Or "[Please Review but DO NOT MERGE] " or something

@Joibel
Copy link
Member Author

Joibel commented Jun 30, 2024

These are ready for review.

Merging isn't a disaster, it just makes any merged into PRs slightly harder to review. I was hoping anyone reviewing would do enough reading not to merge with the existing words I've written.

The whole thing is an experiment in highly stacked PRs from my point of view.

@Joibel Joibel force-pushed the opentelemetry-crontrigger branch from c5d2f77 to dc7479a Compare July 1, 2024 07:42
@Joibel Joibel force-pushed the opentelemetry-templatecount branch from 5fecb02 to 1d73fc2 Compare July 1, 2024 07:42
@Joibel Joibel added area/controller Controller issues, panics area/metrics labels Jul 1, 2024
@agilgur5
Copy link

agilgur5 commented Jul 1, 2024

I was hoping anyone reviewing would do enough reading not to merge with the existing words I've written.

Unfortunately, I'm not sure that's always the case 😕 But more importantly, there's also the muscle memory of the "Approve" -> "Merge" process that makes it more likely to happen unintentionally. As such, I'm gonna mark them as "[DNM]" and maybe put an emoji indicating reviewed or not for at a glance filtering. Also "[1/12]" would work too as a blocking prefix

@agilgur5 agilgur5 changed the title feat: new template counting and phase counting metrics 🪧 [DNM] feat: new template counting and phase counting metrics Jul 1, 2024
@Joibel Joibel force-pushed the opentelemetry-crontrigger branch from dc7479a to 38143f4 Compare July 5, 2024 10:28
@Joibel Joibel force-pushed the opentelemetry-templatecount branch from 1d73fc2 to 69d8583 Compare July 5, 2024 10:29
@Joibel Joibel force-pushed the opentelemetry-crontrigger branch from 38143f4 to 481c236 Compare July 8, 2024 10:07
@Joibel Joibel force-pushed the opentelemetry-templatecount branch from 69d8583 to ec5cc91 Compare July 8, 2024 10:08
@Joibel Joibel force-pushed the opentelemetry-crontrigger branch from 481c236 to 18bb3a9 Compare July 9, 2024 14:27
@Joibel Joibel force-pushed the opentelemetry-templatecount branch from ec5cc91 to f24afc8 Compare July 9, 2024 14:28
@Joibel Joibel force-pushed the opentelemetry-crontrigger branch from 18bb3a9 to 9def3b4 Compare July 12, 2024 08:49
@Joibel Joibel force-pushed the opentelemetry-templatecount branch from f24afc8 to b959898 Compare July 12, 2024 08:49
@Joibel Joibel force-pushed the opentelemetry-crontrigger branch from 9def3b4 to aaf38bd Compare August 12, 2024 11:13
@Joibel Joibel force-pushed the opentelemetry-templatecount branch from b959898 to 27b6d50 Compare August 12, 2024 11:13
@Joibel Joibel force-pushed the opentelemetry-crontrigger branch from aaf38bd to 98775d8 Compare August 12, 2024 11:20
@Joibel Joibel force-pushed the opentelemetry-templatecount branch from 27b6d50 to c7a6897 Compare August 12, 2024 11:20
@Joibel Joibel force-pushed the opentelemetry-crontrigger branch from 98775d8 to a43524a Compare August 14, 2024 10:42
@Joibel Joibel force-pushed the opentelemetry-templatecount branch from c7a6897 to f351e1b Compare August 14, 2024 10:42
@Joibel Joibel force-pushed the opentelemetry-crontrigger branch from a43524a to 45303a9 Compare August 15, 2024 08:04
@Joibel Joibel force-pushed the opentelemetry-templatecount branch from f351e1b to 02f7860 Compare August 15, 2024 08:04
@Joibel Joibel force-pushed the opentelemetry-crontrigger branch from 45303a9 to e2eecd2 Compare August 15, 2024 14:07
@Joibel Joibel force-pushed the opentelemetry-templatecount branch 2 times, most recently from 58498ce to b10d732 Compare August 16, 2024 09:09
@Joibel Joibel force-pushed the opentelemetry-crontrigger branch 2 times, most recently from 8f68609 to 075816f Compare August 16, 2024 09:28
@Joibel Joibel changed the title 🪧 [DNM] feat: new template counting and phase counting metrics feat: new template counting and phase counting metrics Aug 16, 2024
@Joibel Joibel force-pushed the opentelemetry-templatecount branch from b10d732 to e12e15c Compare August 16, 2024 09:29
@Joibel Joibel changed the base branch from opentelemetry-crontrigger to main August 16, 2024 09:31
Copy link
Member

@isubasinghe isubasinghe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small comment

workflow/metrics/counter_template.go Outdated Show resolved Hide resolved
@Joibel Joibel force-pushed the opentelemetry-templatecount branch 3 times, most recently from 4cca2a8 to 44cb0d6 Compare August 19, 2024 10:59
From #12589.

New metric `total_count` which is like the old `count` metric and the
new `gauge` metric, but a counter, not a gauge. The gauge shows a
snapshot of what is happening right now in the cluster, the counter
can answer questions like how many `Failed` workflows have there been
in the last 24 hours.

Two further metrics for counting uses of WorkflowTemplates via
workflowTemplateRef only. These store the name of the WorkflowTemplate
or ClusterWorkflowTemplate if the `cluster_scope` label is true, and
the namespace where it is used. `workflowtemplate_triggered_total`
counts the number of uses. `workflowtemplate_runtime` records how long
each phase the workflow running the template spent in seconds.

Note to reviewers: this is part of a stack of reviews for metrics
changes. Please don't merge until the rest of the stack is also ready.

Signed-off-by: Alan Clucas <[email protected]>
@Joibel Joibel force-pushed the opentelemetry-templatecount branch from 44cb0d6 to 15fabdb Compare August 19, 2024 11:18
@Joibel Joibel force-pushed the opentelemetry-templatecount branch from 15fabdb to 5175e6d Compare August 19, 2024 14:01
@isubasinghe isubasinghe self-requested a review August 22, 2024 23:07
@isubasinghe isubasinghe enabled auto-merge (squash) August 22, 2024 23:07
@isubasinghe isubasinghe disabled auto-merge August 22, 2024 23:07
@Joibel Joibel merged commit b8bf821 into main Aug 23, 2024
28 checks passed
@Joibel Joibel deleted the opentelemetry-templatecount branch August 23, 2024 07:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/controller Controller issues, panics area/metrics
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants