feat: new k8s request duration histogram metric
This is a new metric which is a superset of the k8s_request_total
metric. The old metric is retained for backwards compatibility.

A common issue with workflows is that they stress the Kubernetes
API. This metric shows how long each type of request takes, which
should enable optimizations that help alleviate problems in this
area.

The new metric carries the same attributes/labels (`kind`, `verb` and
`status_code`) and additionally records a duration for each type of
response.

Note to reviewers: this is part of a stack of reviews for metrics
changes. Please don't merge until the rest of the stack is also ready.

Signed-off-by: Alan Clucas <[email protected]>
Joibel committed Jun 28, 2024
1 parent 8cdc6d9 commit 6084e25
Showing 3 changed files with 28 additions and 1 deletion.
14 changes: 14 additions & 0 deletions docs/metrics.md
@@ -247,6 +247,20 @@ A counter of the number of API requests sent to the Kubernetes API.
| `verb` | The verb of the request, such as `Get` or `List` |
| `status_code` | The HTTP status code of the response |

This metric is calculable from `k8s_request_duration`, and it is suggested you just collect that metric instead.

#### `k8s_request_duration`

A histogram recording how long each type of request took.

| attribute | explanation |
|---------------|--------------------------------------------------------------------|
| `kind` | The kubernetes `kind` involved in the request such as `configmaps` |
| `verb` | The verb of the request, such as `Get` or `List` |
| `status_code` | The HTTP status code of the response |

This contains all the information in `k8s_request_total`, along with timings.

#### `leader`

This gauge indicates if this workflow controller is the leader in a leader elected controller setup, or is otherwise
1 change: 1 addition & 0 deletions docs/upgrading.md
@@ -26,6 +26,7 @@ These notes explain the differences in using the Prometheus `/metrics` endpoint
The following are new metrics:

* `controller_build_info`
* `k8s_request_duration`
* `leader`
* `queue_duration`
* `queue_longest_running`
14 changes: 13 additions & 1 deletion workflow/metrics/metrics_k8s_request.go
@@ -3,14 +3,16 @@ package metrics
import (
	"context"
	"net/http"
	"time"

	"k8s.io/client-go/rest"

	"github.com/argoproj/argo-workflows/v3/util/k8s"
)

const (
	nameK8sRequestTotal = `k8s_request_total`
	nameK8sRequestTotal    = `k8s_request_total`
	nameK8sRequestDuration = `k8s_request_duration`
)

func addK8sRequests(_ context.Context, m *Metrics) error {
@@ -23,6 +25,13 @@ func addK8sRequests(_ context.Context, m *Metrics) error {
	if err != nil {
		return err
	}
	err = m.createInstrument(float64Histogram,
		nameK8sRequestDuration,
		"Duration of kubernetes requests executed.",
		"s",
		withDefaultBuckets([]float64{0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 60.0, 180.0}),
		withAsBuiltIn(),
	)
	// Register this metrics with the global
	k8sMetrics.metrics = m
	return err
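
Reviewer note on the bucket boundaries: the following is a minimal, self-contained sketch of how observations would land in the ten boundaries registered in this hunk, and why the old counter can be recovered from the histogram (the sum of all bucket counts equals the number of requests). The `histogram` type, `newHistogram`, `record` and `total` are hypothetical names for illustration only, and upper-inclusive buckets with one extra overflow bucket are an assumption about the metrics backend, not something this commit states.

```go
package main

import (
	"fmt"
	"sort"
)

// bounds holds the bucket upper bounds in seconds, copied from the hunk above.
var bounds = []float64{0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 60.0, 180.0}

// histogram is a hypothetical stand-in for a single kind/verb/status_code
// combination of k8s_request_duration. It is not the controller's type.
type histogram struct {
	counts []uint64 // counts[i] covers values <= bounds[i]; the last slot is the overflow bucket
	sum    float64  // sum of all recorded durations, in seconds
}

func newHistogram() *histogram {
	return &histogram{counts: make([]uint64, len(bounds)+1)}
}

// record places one duration into the first bucket whose upper bound is
// greater than or equal to it (assumed upper-inclusive semantics).
func (h *histogram) record(seconds float64) {
	i := sort.SearchFloat64s(bounds, seconds) // returns len(bounds) when seconds exceeds every bound
	h.counts[i]++
	h.sum += seconds
}

// total returns the number of recorded requests, which is the value the
// k8s_request_total counter would report for the same attributes.
func (h *histogram) total() uint64 {
	var n uint64
	for _, c := range h.counts {
		n += c
	}
	return n
}

func main() {
	h := newHistogram()
	for _, d := range []float64{0.05, 0.1, 0.3, 7.5, 400} {
		h.record(d)
	}
	fmt.Println("per-bucket counts:", h.counts)
	fmt.Println("request count:", h.total())
}
```

Under these assumptions, any request slower than 180 seconds falls into the overflow bucket, so the largest boundary bounds the resolution available for very slow API calls.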
@@ -39,7 +48,9 @@ type metricsRoundTripper struct {
var k8sMetrics metricsRoundTripper

func (m metricsRoundTripper) RoundTrip(r *http.Request) (*http.Response, error) {
	startTime := time.Now()
	x, err := m.roundTripper.RoundTrip(r)
	duration := time.Since(startTime)
	if x != nil && m.metrics != nil {
		verb, kind := k8s.ParseRequest(r)
		attribs := instAttribs{
@@ -48,6 +59,7 @@ func (m metricsRoundTripper) RoundTrip(r *http.Request) (*http.Response, error)
			{name: labelRequestCode, value: x.StatusCode},
		}
		(*m.metrics).addInt(m.ctx, nameK8sRequestTotal, 1, attribs)
		(*m.metrics).record(m.ctx, nameK8sRequestDuration, duration.Seconds(), attribs)
	}
	return x, err
}
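
Reviewer note on the timing pattern: the change above wraps the inner `RoundTrip` call between `time.Now()` and `time.Since()`. A standalone sketch of that pattern, using only the standard library, might look like the following. `timingRoundTripper`, `roundTripFunc`, `observe` and `demo` are hypothetical names introduced for illustration, the URL is a placeholder, and the fake transport stands in for a real Kubernetes API server.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// roundTripFunc adapts a plain function to http.RoundTripper (hypothetical helper).
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// timingRoundTripper is a hypothetical wrapper, not the controller's
// metricsRoundTripper. It times the wrapped transport and reports the result.
type timingRoundTripper struct {
	next    http.RoundTripper
	observe func(statusCode int, d time.Duration)
}

func (t timingRoundTripper) RoundTrip(r *http.Request) (*http.Response, error) {
	start := time.Now()
	resp, err := t.next.RoundTrip(r)
	d := time.Since(start)
	// Only observe when a response exists, mirroring the nil check in the diff.
	if resp != nil {
		t.observe(resp.StatusCode, d)
	}
	return resp, err
}

// demo sends one request through a fake transport that takes about 10ms
// and returns the observed status code and duration.
func demo() (int, time.Duration, error) {
	var code int
	var took time.Duration
	rt := timingRoundTripper{
		next: roundTripFunc(func(*http.Request) (*http.Response, error) {
			time.Sleep(10 * time.Millisecond) // simulated API latency
			return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody}, nil
		}),
		observe: func(c int, d time.Duration) { code, took = c, d },
	}
	req, err := http.NewRequest(http.MethodGet, "https://kubernetes.invalid/api/v1/configmaps", nil)
	if err != nil {
		return 0, 0, err
	}
	resp, err := rt.RoundTrip(req)
	if err != nil {
		return 0, 0, err
	}
	resp.Body.Close()
	return code, took, nil
}

func main() {
	code, took, err := demo()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Printf("status=%d duration=%.3fs\n", code, took.Seconds())
}
```

One property worth keeping in mind when reading the new metric: `RoundTrip` returns once the response headers are available, so a duration measured this way does not include the time the caller later spends reading the response body.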