kubernetes-apiserver

Monitor Type: kubernetes-apiserver (Source)

Accepts Endpoints: Yes

Multiple Instances Allowed: Yes

Overview

This monitor queries the Kubernetes API server for kube-apiserver metrics in Prometheus format. When no path is configured, it queries /metrics by default. The monitor converts the Prometheus metric types to SignalFx metric types as described here.
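
As a rough illustration of that conversion, the metric lists later in this document suggest the pattern below. This YAML is only a summary sketch, not agent configuration, and the key names (prometheus_type, example, emitted) are made up for the illustration.

```yaml
# Illustrative summary only; this is NOT agent configuration.
# Inferred from the metric types shown in the lists in this document.
- prometheus_type: counter
  example: apiserver_request_total
  emitted:
    apiserver_request_total: cumulative
- prometheus_type: gauge
  example: apiserver_registered_watchers
  emitted:
    apiserver_registered_watchers: gauge
- prometheus_type: histogram
  example: apiserver_request_duration_seconds
  emitted:
    apiserver_request_duration_seconds: cumulative         # sum
    apiserver_request_duration_seconds_bucket: cumulative  # bucket
    apiserver_request_duration_seconds_count: cumulative   # count
- prometheus_type: summary
  example: apiserver_admission_step_admission_duration_seconds_summary
  emitted:
    apiserver_admission_step_admission_duration_seconds_summary: cumulative        # sum
    apiserver_admission_step_admission_duration_seconds_summary_count: cumulative  # count
    apiserver_admission_step_admission_duration_seconds_summary_quantile: gauge    # quantized
```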

Example YAML Configuration

monitors:
- type: kubernetes-apiserver
  discoveryRule: Get(container_labels, "component") == "kube-apiserver"
  extraDimensions:
    metric_source: kubernetes-apiserver

Configuration

To activate this monitor in the Smart Agent, add the following to your agent config:

monitors:  # All monitor config goes under this key
 - type: kubernetes-apiserver
   ...  # Additional config

For a list of monitor options that are common to all monitors, see Common Configuration.

| Config option | Required | Type | Description |
| --- | --- | --- | --- |
| httpTimeout | no | int64 | HTTP timeout duration for both reads and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (default: 10s) |
| username | no | string | Basic Auth username to use on each request, if any. |
| password | no | string | Basic Auth password to use on each request, if any. |
| useHTTPS | no | bool | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (default: false) |
| httpHeaders | no | map of strings | A map of HTTP header names to values. Multiple comma-separated values for the same message-header are supported. |
| skipVerify | no | bool | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (default: false) |
| caCertPath | no | string | Path to the CA cert that has signed the TLS cert. Unnecessary if skipVerify is set to true. |
| clientCertPath | no | string | Path to the client TLS cert to use for TLS-required connections. |
| clientKeyPath | no | string | Path to the client TLS key to use for TLS-required connections. |
| host | yes | string | Host of the exporter. |
| port | yes | integer | Port of the exporter. |
| useServiceAccount | no | bool | Use pod service account to authenticate. (default: false) |
| metricPath | no | string | Path to the metrics endpoint on the exporter server, usually /metrics. (default: /metrics) |
| sendAllMetrics | no | bool | Send all the metrics that come out of the Prometheus exporter without any filtering. This option has no effect when using the prometheus exporter monitor directly since there is no built-in filtering, only when embedding it in other monitors. (default: false) |
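
As a sketch of how several of these options can combine, the configuration below scrapes the API server over HTTPS and authenticates with the pod's service account. The host and port values are placeholders chosen for illustration (6443 is a commonly used secure port for kube-apiserver, but your cluster may differ), and the CA path assumes the agent runs in a pod with the standard service account volume mounted.

```yaml
monitors:
- type: kubernetes-apiserver
  host: 10.0.0.1            # placeholder; hypothetical API server address
  port: 6443                # assumption; use your API server's secure port
  useHTTPS: true
  useServiceAccount: true   # authenticate with the pod's service account
  skipVerify: false         # keep TLS verification on, so a CA cert is needed
  # assumption: standard in-pod location of the service account CA bundle
  caCertPath: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  httpTimeout: 10s          # same as the documented default
  metricPath: /metrics      # same as the documented default
```

When a discoveryRule is used instead, as in the example configuration earlier in this document, host and port are not written out by hand.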

Metrics

These are the metrics available for this monitor. Metrics that are categorized as container/host (default) are in bold and italics in the list below.

  • apiserver_current_inflight_requests (gauge)
    Maximum number of in-flight requests currently in use by this apiserver, per request kind, in the last second.
  • apiserver_init_events_total (cumulative)
    Counter of init events processed in watchcache, broken out by resource type.
  • apiserver_longrunning_gauge (gauge)
    Gauge of all active long-running apiserver requests broken out by verb, group, version, resource, scope and component. Not all requests are tracked this way.
  • apiserver_registered_watchers (gauge)
    Number of currently registered watchers for a given resource.
  • authenticated_user_requests (cumulative)
    Counter of authenticated requests broken out by username.
  • kubernetes_build_info (gauge)
    A metric with a constant '1' value labeled by major, minor, git version, git commit, git tree state, build date, Go version, and compiler from which Kubernetes was built, and platform on which it is running.

Group admission_quota_controller

All of the following metrics are part of the admission_quota_controller metric group. All of the non-default metrics below can be turned on by adding admission_quota_controller to the monitor config option extraGroups:

  • admission_quota_controller_adds (cumulative)
    (Deprecated) Total number of adds handled by workqueue: admission_quota_controller
  • admission_quota_controller_depth (gauge)
    (Deprecated) Current depth of workqueue: admission_quota_controller
  • admission_quota_controller_longest_running_processor_microseconds (gauge)
    (Deprecated) How many microseconds has the longest running processor for admission_quota_controller been running.
  • admission_quota_controller_queue_latency (cumulative)
    (Deprecated) How long an item stays in workqueue admission_quota_controller before being requested. (sum)
  • admission_quota_controller_queue_latency_count (cumulative)
    (Deprecated) How long an item stays in workqueue admission_quota_controller before being requested. (count)
  • admission_quota_controller_queue_latency_quantile (gauge)
    (Deprecated) How long an item stays in workqueue admission_quota_controller before being requested. (quantized)
  • admission_quota_controller_unfinished_work_seconds (gauge)
    (Deprecated) How many seconds of work admission_quota_controller has done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
  • admission_quota_controller_work_duration (cumulative)
    (Deprecated) How long processing an item from workqueue admission_quota_controller takes. (sum)
  • admission_quota_controller_work_duration_count (cumulative)
    (Deprecated) How long processing an item from workqueue admission_quota_controller takes. (count)
  • admission_quota_controller_work_duration_quantile (gauge)
    (Deprecated) How long processing an item from workqueue admission_quota_controller takes. (quantized)
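
For instance, turning on this group might look like the sketch below. It reuses the discovery rule from the example configuration earlier in this document; writing extraGroups as a YAML list of group names is an assumption about the option's shape.

```yaml
monitors:
- type: kubernetes-apiserver
  discoveryRule: Get(container_labels, "component") == "kube-apiserver"
  extraGroups:              # assumed to take a list of metric group names
  - admission_quota_controller
```

Other group names from this document can be added to the same list. If no filtering is wanted at all, the sendAllMetrics option in the configuration section is the documented alternative.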

Group api_service_open_api_aggregation_controller

All of the following metrics are part of the api_service_open_api_aggregation_controller metric group. All of the non-default metrics below can be turned on by adding api_service_open_api_aggregation_controller to the monitor config option extraGroups:

  • APIServiceOpenAPIAggregationControllerQueue1_adds (cumulative)
    (Deprecated) Total number of adds handled by workqueue: APIServiceOpenAPIAggregationControllerQueue1
  • APIServiceOpenAPIAggregationControllerQueue1_depth (gauge)
    (Deprecated) Current depth of workqueue: APIServiceOpenAPIAggregationControllerQueue1
  • APIServiceOpenAPIAggregationControllerQueue1_longest_running_processor_microseconds (gauge)
    (Deprecated) How many microseconds has the longest running processor for APIServiceOpenAPIAggregationControllerQueue1 been running.
  • APIServiceOpenAPIAggregationControllerQueue1_queue_latency (cumulative)
    (Deprecated) How long an item stays in workqueue APIServiceOpenAPIAggregationControllerQueue1 before being requested. (sum)
  • APIServiceOpenAPIAggregationControllerQueue1_queue_latency_count (cumulative)
    (Deprecated) How long an item stays in workqueue APIServiceOpenAPIAggregationControllerQueue1 before being requested. (count)
  • APIServiceOpenAPIAggregationControllerQueue1_queue_latency_quantile (gauge)
    (Deprecated) How long an item stays in workqueue APIServiceOpenAPIAggregationControllerQueue1 before being requested. (quantized)
  • APIServiceOpenAPIAggregationControllerQueue1_retries (cumulative)
    (Deprecated) Total number of retries handled by workqueue: APIServiceOpenAPIAggregationControllerQueue1
  • APIServiceOpenAPIAggregationControllerQueue1_unfinished_work_seconds (gauge)
    (Deprecated) How many seconds of work APIServiceOpenAPIAggregationControllerQueue1 has done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
  • APIServiceOpenAPIAggregationControllerQueue1_work_duration (cumulative)
    (Deprecated) How long processing an item from workqueue APIServiceOpenAPIAggregationControllerQueue1 takes. (sum)
  • APIServiceOpenAPIAggregationControllerQueue1_work_duration_count (cumulative)
    (Deprecated) How long processing an item from workqueue APIServiceOpenAPIAggregationControllerQueue1 takes. (count)
  • APIServiceOpenAPIAggregationControllerQueue1_work_duration_quantile (gauge)
    (Deprecated) How long processing an item from workqueue APIServiceOpenAPIAggregationControllerQueue1 takes. (quantized)

Group api_service_registration_controller

All of the following metrics are part of the api_service_registration_controller metric group. All of the non-default metrics below can be turned on by adding api_service_registration_controller to the monitor config option extraGroups:

  • APIServiceRegistrationController_adds (cumulative)
    (Deprecated) Total number of adds handled by workqueue: APIServiceRegistrationController
  • APIServiceRegistrationController_depth (gauge)
    (Deprecated) Current depth of workqueue: APIServiceRegistrationController
  • APIServiceRegistrationController_longest_running_processor_microseconds (gauge)
    (Deprecated) How many microseconds has the longest running processor for APIServiceRegistrationController been running.
  • APIServiceRegistrationController_queue_latency (cumulative)
    (Deprecated) How long an item stays in workqueue APIServiceRegistrationController before being requested. (sum)
  • APIServiceRegistrationController_queue_latency_count (cumulative)
    (Deprecated) How long an item stays in workqueue APIServiceRegistrationController before being requested. (count)
  • APIServiceRegistrationController_queue_latency_quantile (gauge)
    (Deprecated) How long an item stays in workqueue APIServiceRegistrationController before being requested. (quantized)
  • APIServiceRegistrationController_retries (cumulative)
    (Deprecated) Total number of retries handled by workqueue: APIServiceRegistrationController
  • APIServiceRegistrationController_unfinished_work_seconds (gauge)
    (Deprecated) How many seconds of work APIServiceRegistrationController has done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
  • APIServiceRegistrationController_work_duration (cumulative)
    (Deprecated) How long processing an item from workqueue APIServiceRegistrationController takes. (sum)
  • APIServiceRegistrationController_work_duration_count (cumulative)
    (Deprecated) How long processing an item from workqueue APIServiceRegistrationController takes. (count)
  • APIServiceRegistrationController_work_duration_quantile (gauge)
    (Deprecated) How long processing an item from workqueue APIServiceRegistrationController takes. (quantized)

Group apiserver_admission_controller

All of the following metrics are part of the apiserver_admission_controller metric group. All of the non-default metrics below can be turned on by adding apiserver_admission_controller to the monitor config option extraGroups:

  • apiserver_admission_controller_admission_duration_seconds (cumulative)
    Admission controller latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit). (sum)
  • apiserver_admission_controller_admission_duration_seconds_bucket (cumulative)
    Admission controller latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit). (bucket)
  • apiserver_admission_controller_admission_duration_seconds_count (cumulative)
    Admission controller latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit). (count)
  • apiserver_admission_controller_admission_latencies_milliseconds (cumulative)
    (Deprecated) Admission controller latency histogram in milliseconds, identified by name and broken out for each operation and API resource and type (validate or admit). (sum)
  • apiserver_admission_controller_admission_latencies_milliseconds_bucket (cumulative)
    (Deprecated) Admission controller latency histogram in milliseconds, identified by name and broken out for each operation and API resource and type (validate or admit). (bucket)
  • apiserver_admission_controller_admission_latencies_milliseconds_count (cumulative)
    (Deprecated) Admission controller latency histogram in milliseconds, identified by name and broken out for each operation and API resource and type (validate or admit). (count)

Group apiserver_admission_step_admission

All of the following metrics are part of the apiserver_admission_step_admission metric group. All of the non-default metrics below can be turned on by adding apiserver_admission_step_admission to the monitor config option extraGroups:

  • apiserver_admission_step_admission_duration_seconds (cumulative)
    Admission sub-step latency histogram in seconds, broken out for each operation and API resource and step type (validate or admit). (sum)
  • apiserver_admission_step_admission_duration_seconds_bucket (cumulative)
    Admission sub-step latency histogram in seconds, broken out for each operation and API resource and step type (validate or admit). (bucket)
  • apiserver_admission_step_admission_duration_seconds_count (cumulative)
    Admission sub-step latency histogram in seconds, broken out for each operation and API resource and step type (validate or admit). (count)
  • apiserver_admission_step_admission_duration_seconds_summary (cumulative)
    Admission sub-step latency summary in seconds, broken out for each operation and API resource and step type (validate or admit). (sum)
  • apiserver_admission_step_admission_duration_seconds_summary_count (cumulative)
    Admission sub-step latency summary in seconds, broken out for each operation and API resource and step type (validate or admit). (count)
  • apiserver_admission_step_admission_duration_seconds_summary_quantile (gauge)
    Admission sub-step latency summary in seconds, broken out for each operation and API resource and step type (validate or admit). (quantized)
  • apiserver_admission_step_admission_latencies_milliseconds (cumulative)
    (Deprecated) Admission sub-step latency histogram in milliseconds, broken out for each operation and API resource and step type (validate or admit). (sum)
  • apiserver_admission_step_admission_latencies_milliseconds_bucket (cumulative)
    (Deprecated) Admission sub-step latency histogram in milliseconds, broken out for each operation and API resource and step type (validate or admit). (bucket)
  • apiserver_admission_step_admission_latencies_milliseconds_count (cumulative)
    (Deprecated) Admission sub-step latency histogram in milliseconds, broken out for each operation and API resource and step type (validate or admit). (count)
  • apiserver_admission_step_admission_latencies_milliseconds_summary (cumulative)
    (Deprecated) Admission sub-step latency summary in milliseconds, broken out for each operation and API resource and step type (validate or admit). (sum)
  • apiserver_admission_step_admission_latencies_milliseconds_summary_count (cumulative)
    (Deprecated) Admission sub-step latency summary in milliseconds, broken out for each operation and API resource and step type (validate or admit). (count)
  • apiserver_admission_step_admission_latencies_milliseconds_summary_quantile (gauge)
    (Deprecated) Admission sub-step latency summary in milliseconds, broken out for each operation and API resource and step type (validate or admit). (quantized)

Group apiserver_audit

All of the following metrics are part of the apiserver_audit metric group. All of the non-default metrics below can be turned on by adding apiserver_audit to the monitor config option extraGroups:

  • apiserver_audit_event_total (cumulative)
    Counter of audit events generated and sent to the audit backend.
  • apiserver_audit_requests_rejected_total (cumulative)
    Counter of apiserver requests rejected due to an error in audit logging backend.

Group apiserver_client

All of the following metrics are part of the apiserver_client metric group. All of the non-default metrics below can be turned on by adding apiserver_client to the monitor config option extraGroups:

  • apiserver_client_certificate_expiration_seconds (cumulative)
    Distribution of the remaining lifetime on the certificate used to authenticate a request. (sum)
  • apiserver_client_certificate_expiration_seconds_bucket (cumulative)
    Distribution of the remaining lifetime on the certificate used to authenticate a request. (bucket)
  • apiserver_client_certificate_expiration_seconds_count (cumulative)
    Distribution of the remaining lifetime on the certificate used to authenticate a request. (count)

Group apiserver_request

All of the following metrics are part of the apiserver_request metric group. All of the non-default metrics below can be turned on by adding apiserver_request to the monitor config option extraGroups:

  • apiserver_request_count (cumulative)
    (Deprecated) Counter of apiserver requests broken out for each verb, group, version, resource, scope, component, client, and HTTP response contentType and code.
  • apiserver_request_duration_seconds (cumulative)
    Response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component. (sum)
  • apiserver_request_duration_seconds_bucket (cumulative)
    Response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component. (bucket)
  • apiserver_request_duration_seconds_count (cumulative)
    Response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component. (count)
  • apiserver_request_latencies (cumulative)
    (Deprecated) Response latency distribution in microseconds for each verb, group, version, resource, subresource, scope and component. (sum)
  • apiserver_request_latencies_bucket (cumulative)
    (Deprecated) Response latency distribution in microseconds for each verb, group, version, resource, subresource, scope and component. (bucket)
  • apiserver_request_latencies_count (cumulative)
    (Deprecated) Response latency distribution in microseconds for each verb, group, version, resource, subresource, scope and component. (count)
  • apiserver_request_latencies_summary (cumulative)
    (Deprecated) Response latency summary in microseconds for each verb, group, version, resource, subresource, scope and component. (sum)
  • apiserver_request_latencies_summary_count (cumulative)
    (Deprecated) Response latency summary in microseconds for each verb, group, version, resource, subresource, scope and component. (count)
  • apiserver_request_latencies_summary_quantile (gauge)
    (Deprecated) Response latency summary in microseconds for each verb, group, version, resource, subresource, scope and component. (quantized)
  • apiserver_request_total (cumulative)
    Counter of apiserver requests broken out for each verb, dry run value, group, version, resource, scope, component, client, and HTTP response contentType and code.

Group apiserver_response

All of the following metrics are part of the apiserver_response metric group. All of the non-default metrics below can be turned on by adding apiserver_response to the monitor config option extraGroups:

  • apiserver_response_sizes (cumulative)
    Response size distribution in bytes for each group, version, verb, resource, subresource, scope and component. (sum)
  • apiserver_response_sizes_bucket (cumulative)
    Response size distribution in bytes for each group, version, verb, resource, subresource, scope and component. (bucket)
  • apiserver_response_sizes_count (cumulative)
    Response size distribution in bytes for each group, version, verb, resource, subresource, scope and component. (count)

Group apiserver_storage

All of the following metrics are part of the apiserver_storage metric group. All of the non-default metrics below can be turned on by adding apiserver_storage to the monitor config option extraGroups:

  • apiserver_storage_data_key_generation_duration_seconds (cumulative)
    Latencies in seconds of data encryption key (DEK) generation operations. (sum)
  • apiserver_storage_data_key_generation_duration_seconds_bucket (cumulative)
    Latencies in seconds of data encryption key (DEK) generation operations. (bucket)
  • apiserver_storage_data_key_generation_duration_seconds_count (cumulative)
    Latencies in seconds of data encryption key (DEK) generation operations. (count)
  • apiserver_storage_data_key_generation_failures_total (cumulative)
    Total number of failed data encryption key (DEK) generation operations.
  • apiserver_storage_data_key_generation_latencies_microseconds (cumulative)
    (Deprecated) Latencies in microseconds of data encryption key (DEK) generation operations. (sum)
  • apiserver_storage_data_key_generation_latencies_microseconds_bucket (cumulative)
    (Deprecated) Latencies in microseconds of data encryption key (DEK) generation operations. (bucket)
  • apiserver_storage_data_key_generation_latencies_microseconds_count (cumulative)
    (Deprecated) Latencies in microseconds of data encryption key (DEK) generation operations. (count)
  • apiserver_storage_envelope_transformation_cache_misses_total (cumulative)
    Total number of cache misses while accessing key decryption key (KEK).

Group autoregister

All of the following metrics are part of the autoregister metric group. All of the non-default metrics below can be turned on by adding autoregister to the monitor config option extraGroups:

  • autoregister_adds (cumulative)
    (Deprecated) Total number of adds handled by workqueue: autoregister
  • autoregister_depth (gauge)
    (Deprecated) Current depth of workqueue: autoregister
  • autoregister_longest_running_processor_microseconds (gauge)
    (Deprecated) How many microseconds has the longest running processor for autoregister been running.
  • autoregister_queue_latency (cumulative)
    (Deprecated) How long an item stays in workqueue autoregister before being requested. (sum)
  • autoregister_queue_latency_count (cumulative)
    (Deprecated) How long an item stays in workqueue autoregister before being requested. (count)
  • autoregister_queue_latency_quantile (gauge)
    (Deprecated) How long an item stays in workqueue autoregister before being requested. (quantized)
  • autoregister_retries (cumulative)
    (Deprecated) Total number of retries handled by workqueue: autoregister
  • autoregister_unfinished_work_seconds (gauge)
    (Deprecated) How many seconds of work autoregister has done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
  • autoregister_work_duration (cumulative)
    (Deprecated) How long processing an item from workqueue autoregister takes. (sum)
  • autoregister_work_duration_count (cumulative)
    (Deprecated) How long processing an item from workqueue autoregister takes. (count)
  • autoregister_work_duration_quantile (gauge)
    (Deprecated) How long processing an item from workqueue autoregister takes. (quantized)

Group available_condition_controller

All of the following metrics are part of the available_condition_controller metric group. All of the non-default metrics below can be turned on by adding available_condition_controller to the monitor config option extraGroups:

  • AvailableConditionController_adds (cumulative)
    (Deprecated) Total number of adds handled by workqueue: AvailableConditionController
  • AvailableConditionController_depth (gauge)
    (Deprecated) Current depth of workqueue: AvailableConditionController
  • AvailableConditionController_longest_running_processor_microseconds (gauge)
    (Deprecated) How many microseconds has the longest running processor for AvailableConditionController been running.
  • AvailableConditionController_queue_latency (cumulative)
    (Deprecated) How long an item stays in workqueue AvailableConditionController before being requested. (sum)
  • AvailableConditionController_queue_latency_count (cumulative)
    (Deprecated) How long an item stays in workqueue AvailableConditionController before being requested. (count)
  • AvailableConditionController_queue_latency_quantile (gauge)
    (Deprecated) How long an item stays in workqueue AvailableConditionController before being requested. (quantized)
  • AvailableConditionController_retries (cumulative)
    (Deprecated) Total number of retries handled by workqueue: AvailableConditionController
  • AvailableConditionController_unfinished_work_seconds (gauge)
    (Deprecated) How many seconds of work AvailableConditionController has done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
  • AvailableConditionController_work_duration (cumulative)
    (Deprecated) How long processing an item from workqueue AvailableConditionController takes. (sum)
  • AvailableConditionController_work_duration_count (cumulative)
    (Deprecated) How long processing an item from workqueue AvailableConditionController takes. (count)
  • AvailableConditionController_work_duration_quantile (gauge)
    (Deprecated) How long processing an item from workqueue AvailableConditionController takes. (quantized)

Group crd_autoregistration_controller

All of the following metrics are part of the crd_autoregistration_controller metric group. All of the non-default metrics below can be turned on by adding crd_autoregistration_controller to the monitor config option extraGroups:

  • crd_autoregistration_controller_adds (cumulative)
    (Deprecated) Total number of adds handled by workqueue: crd_autoregistration_controller
  • crd_autoregistration_controller_depth (gauge)
    (Deprecated) Current depth of workqueue: crd_autoregistration_controller
  • crd_autoregistration_controller_longest_running_processor_microseconds (gauge)
    (Deprecated) How many microseconds has the longest running processor for crd_autoregistration_controller been running.
  • crd_autoregistration_controller_queue_latency (cumulative)
    (Deprecated) How long an item stays in workqueue crd_autoregistration_controller before being requested. (sum)
  • crd_autoregistration_controller_queue_latency_count (cumulative)
    (Deprecated) How long an item stays in workqueue crd_autoregistration_controller before being requested. (count)
  • crd_autoregistration_controller_queue_latency_quantile (gauge)
    (Deprecated) How long an item stays in workqueue crd_autoregistration_controller before being requested. (quantized)
  • crd_autoregistration_controller_retries (cumulative)
    (Deprecated) Total number of retries handled by workqueue: crd_autoregistration_controller
  • crd_autoregistration_controller_unfinished_work_seconds (gauge)
    (Deprecated) How many seconds of work crd_autoregistration_controller has done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
  • crd_autoregistration_controller_work_duration (cumulative)
    (Deprecated) How long processing an item from workqueue crd_autoregistration_controller takes. (sum)
  • crd_autoregistration_controller_work_duration_count (cumulative)
    (Deprecated) How long processing an item from workqueue crd_autoregistration_controller takes. (count)
  • crd_autoregistration_controller_work_duration_quantile (gauge)
    (Deprecated) How long processing an item from workqueue crd_autoregistration_controller takes. (quantized)

Group crd_establishing

All of the following metrics are part of the crd_establishing metric group. All of the non-default metrics below can be turned on by adding crd_establishing to the monitor config option extraGroups:

  • crdEstablishing_adds (cumulative)
    (Deprecated) Total number of adds handled by workqueue: crdEstablishing
  • crdEstablishing_depth (gauge)
    (Deprecated) Current depth of workqueue: crdEstablishing
  • crdEstablishing_longest_running_processor_microseconds (gauge)
    (Deprecated) How many microseconds has the longest running processor for crdEstablishing been running.
  • crdEstablishing_queue_latency (cumulative)
    (Deprecated) How long an item stays in workqueue crdEstablishing before being requested. (sum)
  • crdEstablishing_queue_latency_count (cumulative)
    (Deprecated) How long an item stays in workqueue crdEstablishing before being requested. (count)
  • crdEstablishing_queue_latency_quantile (gauge)
    (Deprecated) How long an item stays in workqueue crdEstablishing before being requested. (quantized)
  • crdEstablishing_retries (cumulative)
    (Deprecated) Total number of retries handled by workqueue: crdEstablishing
  • crdEstablishing_unfinished_work_seconds (gauge)
    (Deprecated) How many seconds of work crdEstablishing has done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
  • crdEstablishing_work_duration (cumulative)
    (Deprecated) How long processing an item from workqueue crdEstablishing takes. (sum)
  • crdEstablishing_work_duration_count (cumulative)
    (Deprecated) How long processing an item from workqueue crdEstablishing takes. (count)
  • crdEstablishing_work_duration_quantile (gauge)
    (Deprecated) How long processing an item from workqueue crdEstablishing takes. (quantized)

Group crd_finalizer

All of the following metrics are part of the crd_finalizer metric group. All of the non-default metrics below can be turned on by adding crd_finalizer to the monitor config option extraGroups:

  • crd_finalizer_adds (cumulative)
    (Deprecated) Total number of adds handled by workqueue: crd_finalizer
  • crd_finalizer_depth (gauge)
    (Deprecated) Current depth of workqueue: crd_finalizer
  • crd_finalizer_longest_running_processor_microseconds (gauge)
    (Deprecated) How many microseconds has the longest running processor for crd_finalizer been running.
  • crd_finalizer_queue_latency (cumulative)
    (Deprecated) How long an item stays in workqueue crd_finalizer before being requested. (sum)
  • crd_finalizer_queue_latency_count (cumulative)
    (Deprecated) How long an item stays in workqueue crd_finalizer before being requested. (count)
  • crd_finalizer_queue_latency_quantile (gauge)
    (Deprecated) How long an item stays in workqueue crd_finalizer before being requested. (quantized)
  • crd_finalizer_retries (cumulative)
    (Deprecated) Total number of retries handled by workqueue: crd_finalizer
  • crd_finalizer_unfinished_work_seconds (gauge)
    (Deprecated) How many seconds of work crd_finalizer has done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
  • crd_finalizer_work_duration (cumulative)
    (Deprecated) How long processing an item from workqueue crd_finalizer takes. (sum)
  • crd_finalizer_work_duration_count (cumulative)
    (Deprecated) How long processing an item from workqueue crd_finalizer takes. (count)
  • crd_finalizer_work_duration_quantile (gauge)
    (Deprecated) How long processing an item from workqueue crd_finalizer takes. (quantized)

Group crd_naming_condition_controller

All of the following metrics are part of the crd_naming_condition_controller metric group. All of the non-default metrics below can be turned on by adding crd_naming_condition_controller to the monitor config option extraGroups:

  • crd_naming_condition_controller_adds (cumulative)
    (Deprecated) Total number of adds handled by workqueue: crd_naming_condition_controller
  • crd_naming_condition_controller_depth (gauge)
    (Deprecated) Current depth of workqueue: crd_naming_condition_controller
  • crd_naming_condition_controller_longest_running_processor_microseconds (gauge)
    (Deprecated) How many microseconds has the longest running processor for crd_naming_condition_controller been running.
  • crd_naming_condition_controller_queue_latency (cumulative)
    (Deprecated) How long an item stays in workqueue crd_naming_condition_controller before being requested. (sum)
  • crd_naming_condition_controller_queue_latency_count (cumulative)
    (Deprecated) How long an item stays in workqueue crd_naming_condition_controller before being requested. (count)
  • crd_naming_condition_controller_queue_latency_quantile (gauge)
    (Deprecated) How long an item stays in workqueue crd_naming_condition_controller before being requested. (quantized)
  • crd_naming_condition_controller_retries (cumulative)
    (Deprecated) Total number of retries handled by workqueue: crd_naming_condition_controller
  • crd_naming_condition_controller_unfinished_work_seconds (gauge)
    (Deprecated) How many seconds of work crd_naming_condition_controller has done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
  • crd_naming_condition_controller_work_duration (cumulative)
    (Deprecated) How long processing an item from workqueue crd_naming_condition_controller takes. (sum)
  • crd_naming_condition_controller_work_duration_count (cumulative)
    (Deprecated) How long processing an item from workqueue crd_naming_condition_controller takes. (count)
  • crd_naming_condition_controller_work_duration_quantile (gauge)
    (Deprecated) How long processing an item from workqueue crd_naming_condition_controller takes. (quantized)

Group discovery_controller

All of the following metrics are part of the discovery_controller metric group. All of the non-default metrics below can be turned on by adding discovery_controller to the monitor config option extraGroups:

  • DiscoveryController_adds (cumulative)
    (Deprecated) Total number of adds handled by workqueue: DiscoveryController
  • DiscoveryController_depth (gauge)
    (Deprecated) Current depth of workqueue: DiscoveryController
  • DiscoveryController_longest_running_processor_microseconds (gauge)
    (Deprecated) How many microseconds has the longest running processor for DiscoveryController been running.
  • DiscoveryController_queue_latency (cumulative)
    (Deprecated) How long an item stays in workqueue DiscoveryController before being requested. (sum)
  • DiscoveryController_queue_latency_count (cumulative)
    (Deprecated) How long an item stays in workqueue DiscoveryController before being requested. (count)
  • DiscoveryController_queue_latency_quantile (gauge)
    (Deprecated) How long an item stays in workqueue DiscoveryController before being requested. (quantized)
  • DiscoveryController_retries (cumulative)
    (Deprecated) Total number of retries handled by workqueue: DiscoveryController
  • DiscoveryController_unfinished_work_seconds (gauge)
    (Deprecated) How many seconds of work DiscoveryController has done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
  • DiscoveryController_work_duration (cumulative)
    (Deprecated) How long processing an item from workqueue DiscoveryController takes. (sum)
  • DiscoveryController_work_duration_count (cumulative)
    (Deprecated) How long processing an item from workqueue DiscoveryController takes. (count)
  • DiscoveryController_work_duration_quantile (gauge)
    (Deprecated) How long processing an item from workqueue DiscoveryController takes. (quantized)

Group etcd

All of the following metrics are part of the etcd metric group. All of the non-default metrics below can be turned on by adding etcd to the monitor config option extraGroups:

  • etcd_helper_cache_entry (cumulative)
    Counter of etcd helper cache entries. This can be different from etcd_helper_cache_miss_count because two concurrent threads can miss the cache and generate the same entry twice.
  • etcd_helper_cache_entry_count (cumulative)
    (Deprecated) Counter of etcd helper cache entries. This can be different from etcd_helper_cache_miss_count because two concurrent threads can miss the cache and generate the same entry twice.
  • etcd_helper_cache_hit (cumulative)
    Counter of etcd helper cache hits.
  • etcd_helper_cache_hit_count (cumulative)
    (Deprecated) Counter of etcd helper cache hits.
  • etcd_helper_cache_miss (cumulative)
    Counter of etcd helper cache misses.
  • etcd_helper_cache_miss_count (cumulative)
    (Deprecated) Counter of etcd helper cache misses.
  • etcd_object_counts (gauge)
    Number of stored objects at the time of last check split by kind.
  • etcd_request_cache_add_duration_seconds (cumulative)
    Latency in seconds of adding an object to etcd cache (sum)
  • etcd_request_cache_add_duration_seconds_bucket (cumulative)
    Latency in seconds of adding an object to etcd cache (bucket)
  • etcd_request_cache_add_duration_seconds_count (cumulative)
    Latency in seconds of adding an object to etcd cache (count)
  • etcd_request_cache_add_latencies_summary (cumulative)
    (Deprecated) Latency in microseconds of adding an object to etcd cache (sum)
  • etcd_request_cache_add_latencies_summary_count (cumulative)
    (Deprecated) Latency in microseconds of adding an object to etcd cache (count)
  • etcd_request_cache_add_latencies_summary_quantile (gauge)
    (Deprecated) Latency in microseconds of adding an object to etcd cache (quantized)
  • etcd_request_cache_get_duration_seconds (cumulative)
    Latency in seconds of getting an object from etcd cache (sum)
  • etcd_request_cache_get_duration_seconds_bucket (cumulative)
    Latency in seconds of getting an object from etcd cache (bucket)
  • etcd_request_cache_get_duration_seconds_count (cumulative)
    Latency in seconds of getting an object from etcd cache (count)
  • etcd_request_cache_get_latencies_summary (cumulative)
    (Deprecated) Latency in microseconds of getting an object from etcd cache (sum)
  • etcd_request_cache_get_latencies_summary_count (cumulative)
    (Deprecated) Latency in microseconds of getting an object from etcd cache (count)
  • etcd_request_cache_get_latencies_summary_quantile (gauge)
    (Deprecated) Latency in microseconds of getting an object from etcd cache (quantized)

Group grpc_client

All of the following metrics are part of the grpc_client metric group. All of the non-default metrics below can be turned on by adding grpc_client to the monitor config option extraGroups:

  • grpc_client_handled_total (cumulative)
    Total number of RPCs completed by the client, regardless of success or failure.
  • grpc_client_msg_received_total (cumulative)
    Total number of RPC stream messages received by the client.
  • grpc_client_msg_sent_total (cumulative)
    Total number of gRPC stream messages sent by the client.
  • grpc_client_started_total (cumulative)
    Total number of RPCs started on the client.

Group http_request

All of the following metrics are part of the http_request metric group. All of the non-default metrics below can be turned on by adding http_request to the monitor config option extraGroups:

  • http_request_duration_microseconds (cumulative)
    The HTTP request latencies in microseconds. (sum)
  • http_request_duration_microseconds_count (cumulative)
    The HTTP request latencies in microseconds. (count)
  • http_request_duration_microseconds_quantile (gauge)
    The HTTP request latencies in microseconds. (quantized)
  • http_request_size_bytes (cumulative)
    The HTTP request sizes in bytes. (sum)
  • http_request_size_bytes_count (cumulative)
    The HTTP request sizes in bytes. (count)
  • http_request_size_bytes_quantile (gauge)
    The HTTP request sizes in bytes. (quantized)
  • http_requests (cumulative)
    Total number of HTTP requests made.

Group http_response

All of the following metrics are part of the http_response metric group. All of the non-default metrics below can be turned on by adding http_response to the monitor config option extraGroups:

  • http_response_size_bytes (cumulative)
    The HTTP response sizes in bytes. (sum)
  • http_response_size_bytes_count (cumulative)
    The HTTP response sizes in bytes. (count)
  • http_response_size_bytes_quantile (gauge)
    The HTTP response sizes in bytes. (quantized)

Group prometheus_go

All of the following metrics are part of the prometheus_go metric group. All of the non-default metrics below can be turned on by adding prometheus_go to the monitor config option extraGroups:

  • go_gc_duration_seconds (cumulative)
    A summary of the GC invocation durations. (sum)
  • go_gc_duration_seconds_count (cumulative)
    A summary of the GC invocation durations. (count)
  • go_gc_duration_seconds_quantile (gauge)
    A summary of the GC invocation durations. (quantized)
  • go_goroutines (gauge)
    Number of goroutines that currently exist.
  • go_info (gauge)
    Information about the Go environment.
  • go_memstats_alloc_bytes_total (cumulative)
    Total number of bytes allocated, even if freed.
  • go_memstats_buck_hash_sys_bytes (gauge)
    Number of bytes used by the profiling bucket hash table.
  • go_memstats_frees_total (cumulative)
    Total number of frees.
  • go_memstats_gc_cpu_fraction (gauge)
    The fraction of this program's available CPU time used by the GC since the program started.
  • go_memstats_gc_sys_bytes (gauge)
    Number of bytes used for garbage collection system metadata.
  • go_memstats_heap_alloc_bytes (gauge)
    Number of heap bytes allocated and still in use.
  • go_memstats_heap_idle_bytes (gauge)
    Number of heap bytes waiting to be used.
  • go_memstats_heap_inuse_bytes (gauge)
    Number of heap bytes that are in use.
  • go_memstats_heap_objects (gauge)
    Number of allocated objects.
  • go_memstats_heap_released_bytes (gauge)
    Number of heap bytes released to OS.
  • go_memstats_heap_sys_bytes (gauge)
    Number of heap bytes obtained from system.
  • go_memstats_last_gc_time_seconds (gauge)
    Number of seconds since 1970 of last garbage collection.
  • go_memstats_lookups_total (cumulative)
    Total number of pointer lookups.
  • go_memstats_mallocs_total (cumulative)
    Total number of mallocs.
  • go_memstats_mcache_inuse_bytes (gauge)
    Number of bytes in use by mcache structures.
  • go_memstats_mcache_sys_bytes (gauge)
    Number of bytes used for mcache structures obtained from system.
  • go_memstats_mspan_inuse_bytes (gauge)
    Number of bytes in use by mspan structures.
  • go_memstats_mspan_sys_bytes (gauge)
    Number of bytes used for mspan structures obtained from system.
  • go_memstats_next_gc_bytes (gauge)
    Number of heap bytes when next garbage collection will take place.
  • go_memstats_other_sys_bytes (gauge)
    Number of bytes used for other system allocations.
  • go_memstats_stack_inuse_bytes (gauge)
    Number of bytes in use by the stack allocator.
  • go_memstats_stack_sys_bytes (gauge)
    Number of bytes obtained from system for stack allocator.
  • go_memstats_sys_bytes (gauge)
    Number of bytes obtained from system.
  • go_threads (gauge)
    Number of OS threads created.

Group prometheus_process

All of the following metrics are part of the prometheus_process metric group. All of the non-default metrics below can be turned on by adding prometheus_process to the monitor config option extraGroups:

  • process_cpu_seconds_total (cumulative)
    Total user and system CPU time spent in seconds.
  • process_max_fds (gauge)
    Maximum number of open file descriptors.
  • process_open_fds (gauge)
    Number of open file descriptors.
  • process_resident_memory_bytes (gauge)
    Resident memory size in bytes.
  • process_start_time_seconds (gauge)
    Start time of the process since unix epoch in seconds.
  • process_virtual_memory_bytes (gauge)
    Virtual memory size in bytes.
  • process_virtual_memory_max_bytes (gauge)
    Maximum amount of virtual memory available in bytes.

Group rest_client

All of the following metrics are part of the rest_client metric group. All of the non-default metrics below can be turned on by adding rest_client to the monitor config option extraGroups:

  • rest_client_request_duration_seconds (cumulative)
    Request latency in seconds. Broken down by verb and URL. (sum)
  • rest_client_request_duration_seconds_bucket (cumulative)
    Request latency in seconds. Broken down by verb and URL. (bucket)
  • rest_client_request_duration_seconds_count (cumulative)
    Request latency in seconds. Broken down by verb and URL. (count)
  • rest_client_request_latency_seconds (cumulative)
    (Deprecated) Request latency in seconds. Broken down by verb and URL. (sum)
  • rest_client_request_latency_seconds_bucket (cumulative)
    (Deprecated) Request latency in seconds. Broken down by verb and URL. (bucket)
  • rest_client_request_latency_seconds_count (cumulative)
    (Deprecated) Request latency in seconds. Broken down by verb and URL. (count)
  • rest_client_requests_total (cumulative)
    Number of HTTP requests, partitioned by status code, method, and host.

Group ssh_tunnel

All of the following metrics are part of the ssh_tunnel metric group. All of the non-default metrics below can be turned on by adding ssh_tunnel to the monitor config option extraGroups:

  • ssh_tunnel_open_count (cumulative)
    Counter of ssh tunnel total open attempts
  • ssh_tunnel_open_fail_count (cumulative)
    Counter of ssh tunnel failed open attempts

Group token

All of the following metrics are part of the token metric group. All of the non-default metrics below can be turned on by adding token to the monitor config option extraGroups:

  • get_token_count (cumulative)
    Counter of total Token() requests to the alternate token source
  • get_token_fail_count (cumulative)
    Counter of failed Token() requests to the alternate token source

Group workqueue

All of the following metrics are part of the workqueue metric group. All of the non-default metrics below can be turned on by adding workqueue to the monitor config option extraGroups:

  • workqueue_adds_total (cumulative)
    Total number of adds handled by workqueue
  • workqueue_depth (gauge)
    Current depth of workqueue
  • workqueue_longest_running_processor_seconds (gauge)
    How many seconds has the longest running processor for workqueue been running.
  • workqueue_queue_duration_seconds (cumulative)
    How long in seconds an item stays in workqueue before being requested. (sum)
  • workqueue_queue_duration_seconds_bucket (cumulative)
    How long in seconds an item stays in workqueue before being requested. (bucket)
  • workqueue_queue_duration_seconds_count (cumulative)
    How long in seconds an item stays in workqueue before being requested. (count)
  • workqueue_retries_total (cumulative)
    Total number of retries handled by workqueue
  • workqueue_unfinished_work_seconds (gauge)
    How many seconds of work the workqueue has done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
  • workqueue_work_duration_seconds (cumulative)
    How long in seconds processing an item from workqueue takes. (sum)
  • workqueue_work_duration_seconds_bucket (cumulative)
    How long in seconds processing an item from workqueue takes. (bucket)
  • workqueue_work_duration_seconds_count (cumulative)
    How long in seconds processing an item from workqueue takes. (count)
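
As a minimal sketch of the extraGroups option referenced in each group description above, the following config turns on the non-default metrics of two groups. The discoveryRule is copied from this document's example configuration, and the choice of the etcd and workqueue groups is only an illustrative assumption; substitute whichever group names you need.

```yaml
monitors:
- type: kubernetes-apiserver
  # Discovery rule taken from the example configuration; adjust for your cluster.
  discoveryRule: Get(container_labels, "component") == "kube-apiserver"
  # Each entry enables all non-default metrics in that metric group.
  extraGroups:
  - etcd
  - workqueue
```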

Non-default metrics (version 4.7.0+)

To emit metrics that are not emitted by default, add them to the generic monitor-level extraMetrics config option. Metrics that are derived from specific configuration options and do not appear in the above list of metrics do not need to be added to extraMetrics.
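
A short sketch of what this might look like follows. The two metric names are picked from the list above purely for illustration, on the assumption that they are not already emitted by default in your agent version; the discoveryRule is copied from this document's example configuration.

```yaml
monitors:
- type: kubernetes-apiserver
  # Discovery rule taken from the example configuration; adjust for your cluster.
  discoveryRule: Get(container_labels, "component") == "kube-apiserver"
  # Individual non-default metrics to emit, listed by name (illustrative choices).
  extraMetrics:
  - etcd_object_counts
  - process_open_fds
```

Listing individual metrics this way keeps the emitted set smaller than enabling a whole group through extraGroups.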

To see a list of metrics that will be emitted you can run agent-status monitors after configuring this monitor in a running agent instance.