adjust grafana alert panels to match where o11y alerts landed
During the review of the alert rule definitions in redhat-appstudio/o11y, we learned that
'increase' works better for us than 'sum_over_time' because it copes better with metric controller restarts. Adjusting our appstudio panels accordingly. Live comparisons on the RHTAP clusters have been favorable.
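
To illustrate why a restart matters here, the following is a minimal Python sketch, not the Prometheus implementation. It assumes the metric is a cumulative counter that drops back toward zero when the controller restarts. The sample values and the helper name `simple_increase` are hypothetical, and the helper deliberately omits the window-boundary extrapolation that the real `increase` performs.

```python
from typing import Sequence


def sum_over_time(samples: Sequence[float]) -> float:
    """Add up every raw sample value in the window, like PromQL's sum_over_time."""
    return float(sum(samples))


def simple_increase(samples: Sequence[float]) -> float:
    """Counter growth across the window, treating any drop as a reset to zero.

    Simplified on purpose: no extrapolation to the window boundaries.
    """
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        if cur >= prev:
            total += cur - prev  # normal counter growth between scrapes
        else:
            total += cur  # counter restarted from zero and grew to `cur`
    return total


# Hypothetical scrapes of a cumulative counter; the controller restarts
# after the third sample, so the value drops from 120 to 5.
samples = [100.0, 110.0, 120.0, 5.0, 15.0]

print(sum_over_time(samples))    # 350.0: depends on the absolute counter level
print(simple_increase(samples))  # 35.0: 10 + 10 + 5 + 10 of actual growth
```

In this sketch the `sum_over_time` result changes sharply once the counter restarts because it adds absolute counter values, while the reset-aware growth only reflects what was actually added during the window.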
gabemontero committed Nov 16, 2023
1 parent 27d7ec1 commit c36e75c
Showing 1 changed file with 2 additions and 2 deletions.
@@ -146,7 +146,7 @@
{
"editorMode": "code",
"exemplar": true,
-"expr": "(sum(sum_over_time(pipelinerun_duration_scheduled_seconds_sum{status='succeded'}[30m])) / sum(sum_over_time(pipelinerun_duration_scheduled_seconds_count{status='succeded'}[30m]))) / (sum(sum_over_time(tekton_pipelines_controller_pipelinerun_duration_seconds_sum{status='success'}[30m])) / sum(sum_over_time(tekton_pipelines_controller_pipelinerun_duration_seconds_count{status='success'}[30m])))",
+"expr": "(sum(increase(pipelinerun_duration_scheduled_seconds_sum{status='succeded'}[30m])) / sum(increase(pipelinerun_duration_scheduled_seconds_count{status='succeded'}[30m]))) / (sum(increase(tekton_pipelines_controller_pipelinerun_duration_seconds_sum{status='success'}[30m])) / sum(increase(tekton_pipelines_controller_pipelinerun_duration_seconds_count{status='success'}[30m])))",
"format": "table",
"hide": false,
"instant": false,
@@ -212,7 +212,7 @@
"targets": [
{
"editorMode": "code",
-"expr": "(sum(sum_over_time(pipelinerun_gap_between_taskruns_milliseconds_sum{status='succeded'}[30m])/1000) / sum(sum_over_time(pipelinerun_gap_between_taskruns_milliseconds_count{status='succeded'}[30m]))) / (sum(sum_over_time(tekton_pipelines_controller_pipelinerun_duration_seconds_sum{status='success'}[30m])) / sum(sum_over_time(tekton_pipelines_controller_pipelinerun_duration_seconds_count{status='success'}[30m])))",
+"expr": "(sum(increase(pipelinerun_gap_between_taskruns_milliseconds_sum{status='succeded'}[30m])/1000) / sum(increase(pipelinerun_gap_between_taskruns_milliseconds_count{status='succeded'}[30m]))) / (sum(increase(tekton_pipelines_controller_pipelinerun_duration_seconds_sum{status='success'}[30m])) / sum(increase(tekton_pipelines_controller_pipelinerun_duration_seconds_count{status='success'}[30m])))",
"legendFormat": "__auto",
"range": true,
"refId": "A"