Add the total ranges changefeed metric to lagging ranges section (#19196)
kathancox authored Dec 11, 2024
1 parent f7a6b25 commit 2a56ebd
Showing 11 changed files with 27 additions and 11 deletions.
3 changes: 2 additions & 1 deletion src/current/_includes/releases/v23.2/v23.2.13.md
Original file line number Diff line number Diff line change
@@ -16,7 +16,7 @@ Release Date: October 17, 2024

[#130664][#130664]

-- The new [metric]({% link v23.2/metrics.md %}) `changefeed.total_ranges` allows observation of the number of changes that are watched by a changefeed aggregator. It uses the same polling interval as `changefeed.lagging_ranges`, which is controlled by the changefeed option `lagging_ranges_polling_interval`. [#130984][#130984]
+- The new [metric]({% link v23.2/metrics.md %}) `changefeed.total_ranges` allows observation of the number of ranges that are watched by a changefeed aggregator. It uses the same polling interval as `changefeed.lagging_ranges`, which is controlled by the changefeed option `lagging_ranges_polling_interval`. [#130984][#130984]
- The following groups of [metrics]({% link v23.2/metrics.md %}) and [logs]({% link v23.2/logging.md %}) have been renamed to include the buffer they are associated with. The previous metrics are still maintained for backward compatibility.
- `changefeed.buffer_entries.*`
- `changefeed.buffer_entries_mem.*`
@@ -81,6 +81,7 @@ Release Date: October 17, 2024
[#130664]: https://github.com/cockroachdb/cockroach/pull/130664
[#130790]: https://github.com/cockroachdb/cockroach/pull/130790
[#130919]: https://github.com/cockroachdb/cockroach/pull/130919
+[#130984]: https://github.com/cockroachdb/cockroach/pull/130984
[#130988]: https://github.com/cockroachdb/cockroach/pull/130988
[#131065]: https://github.com/cockroachdb/cockroach/pull/131065
[#131128]: https://github.com/cockroachdb/cockroach/pull/131128
6 changes: 4 additions & 2 deletions src/current/_includes/v23.1/cdc/lagging-ranges.md
@@ -1,10 +1,12 @@
-{% include_cached new-in.html version="v23.1.12" %} Use the `changefeed.lagging_ranges` metric to track the number of ranges that are behind in a changefeed. This is calculated based on the [cluster settings]({% link {{ page.version.version }}/cluster-settings.md %}):
+{% include_cached new-in.html version="v23.1.12" %} Use the `changefeed.lagging_ranges` metric to track the number of [ranges]({% link {{ page.version.version }}/architecture/overview.md %}#architecture-range) that are behind in a changefeed. This is calculated based on the [cluster settings]({% link {{ page.version.version }}/cluster-settings.md %}):

- `changefeed.lagging_ranges_threshold` sets a duration from the present that determines the length of time a range is considered to be lagging behind, which will then track in the [`lagging_ranges`]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#using-changefeed-metrics-labels) metric. Note that ranges undergoing an [initial scan]({% link {{ page.version.version }}/create-changefeed.md %}#initial-scan) for longer than the threshold duration are considered to be lagging. Starting a changefeed with an initial scan on a large table will likely increment the metric for each range in the table. As ranges complete the initial scan, the number of ranges lagging behind will decrease.
- **Default:** `3m`
- `changefeed.lagging_ranges_polling_interval` sets the interval rate for when lagging ranges are checked and the `lagging_ranges` metric is updated. Polling adds latency to the `lagging_ranges` metric being updated. For example, if a range falls behind by 3 minutes, the metric may not update until an additional minute afterward.
- **Default:** `1m`

+{% include_cached new-in.html version="v23.1.29" %} Use the `changefeed.total_ranges` metric to monitor the number of ranges that are watched by [aggregator processors]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}) participating in the changefeed job. If you're experiencing lagging ranges, `changefeed.total_ranges` may indicate that the number of ranges watched by aggregator processors in the job is unbalanced. You may want to try [pausing]({% link {{ page.version.version }}/pause-job.md %}) the changefeed and then [resuming]({% link {{ page.version.version }}/resume-job.md %}) it, so that the changefeed replans the work in the cluster. `changefeed.total_ranges` shares the same polling interval as the `changefeed.lagging_ranges` metric, which is controlled by the `changefeed.lagging_ranges_polling_interval` cluster setting.

{{site.data.alerts.callout_success}}
-You can use the [`metrics_label`]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#using-changefeed-metrics-labels) option to track the `lagging_ranges` metric per changefeed.
+You can use the [`metrics_label`]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#using-changefeed-metrics-labels) option to track the `lagging_ranges` and `total_ranges` metric per changefeed.
{{site.data.alerts.end}}
6 changes: 4 additions & 2 deletions src/current/_includes/v23.2/cdc/lagging-ranges.md
@@ -1,10 +1,12 @@
-{% include_cached new-in.html version="v23.2" %} Use the `changefeed.lagging_ranges` metric to track the number of ranges that are behind in a changefeed. This is calculated based on the [changefeed options]({% link {{ page.version.version }}/create-changefeed.md %}#options):
+{% include_cached new-in.html version="v23.2" %} Use the `changefeed.lagging_ranges` metric to track the number of [ranges]({% link {{ page.version.version }}/architecture/overview.md %}#architecture-range) that are behind in a changefeed. This is calculated based on the [changefeed options]({% link {{ page.version.version }}/create-changefeed.md %}#options):

- `lagging_ranges_threshold` sets a duration from the present that determines the length of time a range is considered to be lagging behind, which will then track in the [`lagging_ranges`]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#lagging-ranges-metric) metric. Note that ranges undergoing an [initial scan]({% link {{ page.version.version }}/create-changefeed.md %}#initial-scan) for longer than the threshold duration are considered to be lagging. Starting a changefeed with an initial scan on a large table will likely increment the metric for each range in the table. As ranges complete the initial scan, the number of ranges lagging behind will decrease.
- **Default:** `3m`
- `lagging_ranges_polling_interval` sets the interval rate for when lagging ranges are checked and the `lagging_ranges` metric is updated. Polling adds latency to the `lagging_ranges` metric being updated. For example, if a range falls behind by 3 minutes, the metric may not update until an additional minute afterward.
- **Default:** `1m`

+{% include_cached new-in.html version="v23.2.13" %} Use the `changefeed.total_ranges` metric to monitor the number of ranges that are watched by [aggregator processors]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}) participating in the changefeed job. If you're experiencing lagging ranges, `changefeed.total_ranges` may indicate that the number of ranges watched by aggregator processors in the job is unbalanced. You may want to try [pausing]({% link {{ page.version.version }}/pause-job.md %}) the changefeed and then [resuming]({% link {{ page.version.version }}/resume-job.md %}) it, so that the changefeed replans the work in the cluster. `changefeed.total_ranges` shares the same polling interval as the `changefeed.lagging_ranges` metric, which is controlled by the `lagging_ranges_polling_interval` option.

{{site.data.alerts.callout_success}}
-You can use the [`metrics_label`]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#using-changefeed-metrics-labels) option to track the `lagging_ranges` metric per changefeed.
+You can use the [`metrics_label`]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#using-changefeed-metrics-labels) option to track the `lagging_ranges` and `total_ranges` metric per changefeed.
{{site.data.alerts.end}}
6 changes: 4 additions & 2 deletions src/current/_includes/v24.1/cdc/lagging-ranges.md
@@ -1,10 +1,12 @@
-Use the `changefeed.lagging_ranges` metric to track the number of ranges that are behind in a changefeed. This is calculated based on the [changefeed options]({% link {{ page.version.version }}/create-changefeed.md %}#options):
+Use the `changefeed.lagging_ranges` metric to track the number of [ranges]({% link {{ page.version.version }}/architecture/overview.md %}#range) that are behind in a changefeed. This is calculated based on the [changefeed options]({% link {{ page.version.version }}/create-changefeed.md %}#options):

- `lagging_ranges_threshold` sets a duration from the present that determines the length of time a range is considered to be lagging behind, which will then track in the [`lagging_ranges`]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#lagging-ranges-metric) metric. Note that ranges undergoing an [initial scan]({% link {{ page.version.version }}/create-changefeed.md %}#initial-scan) for longer than the threshold duration are considered to be lagging. Starting a changefeed with an initial scan on a large table will likely increment the metric for each range in the table. As ranges complete the initial scan, the number of ranges lagging behind will decrease.
- **Default:** `3m`
- `lagging_ranges_polling_interval` sets the interval rate for when lagging ranges are checked and the `lagging_ranges` metric is updated. Polling adds latency to the `lagging_ranges` metric being updated. For example, if a range falls behind by 3 minutes, the metric may not update until an additional minute afterward.
- **Default:** `1m`

+{% include_cached new-in.html version="v24.1.6" %} Use the `changefeed.total_ranges` metric to monitor the number of ranges that are watched by [aggregator processors]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}) participating in the changefeed job. If you're experiencing lagging ranges, `changefeed.total_ranges` may indicate that the number of ranges watched by aggregator processors in the job is unbalanced. You may want to try [pausing]({% link {{ page.version.version }}/pause-job.md %}) the changefeed and then [resuming]({% link {{ page.version.version }}/resume-job.md %}) it, so that the changefeed replans the work in the cluster. `changefeed.total_ranges` shares the same polling interval as the `changefeed.lagging_ranges` metric, which is controlled by the `lagging_ranges_polling_interval` option.

{{site.data.alerts.callout_success}}
-You can use the [`metrics_label`]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#using-changefeed-metrics-labels) option to track the `lagging_ranges` metric per changefeed.
+You can use the [`metrics_label`]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#using-changefeed-metrics-labels) option to track the `lagging_ranges` and `total_ranges` metric per changefeed.
{{site.data.alerts.end}}
6 changes: 4 additions & 2 deletions src/current/_includes/v24.2/cdc/lagging-ranges.md
@@ -1,10 +1,12 @@
-Use the `changefeed.lagging_ranges` metric to track the number of ranges that are behind in a changefeed. This is calculated based on the [changefeed options]({% link {{ page.version.version }}/create-changefeed.md %}#options):
+Use the `changefeed.lagging_ranges` metric to track the number of [ranges]({% link {{ page.version.version }}/architecture/overview.md %}#architecture-range) that are behind in a changefeed. This is calculated based on the [changefeed options]({% link {{ page.version.version }}/create-changefeed.md %}#options):

- `lagging_ranges_threshold` sets a duration from the present that determines the length of time a range is considered to be lagging behind, which will then track in the [`lagging_ranges`]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#lagging-ranges-metric) metric. Note that ranges undergoing an [initial scan]({% link {{ page.version.version }}/create-changefeed.md %}#initial-scan) for longer than the threshold duration are considered to be lagging. Starting a changefeed with an initial scan on a large table will likely increment the metric for each range in the table. As ranges complete the initial scan, the number of ranges lagging behind will decrease.
- **Default:** `3m`
- `lagging_ranges_polling_interval` sets the interval rate for when lagging ranges are checked and the `lagging_ranges` metric is updated. Polling adds latency to the `lagging_ranges` metric being updated. For example, if a range falls behind by 3 minutes, the metric may not update until an additional minute afterward.
- **Default:** `1m`

+{% include_cached new-in.html version="v24.2.4" %} Use the `changefeed.total_ranges` metric to monitor the number of ranges that are watched by [aggregator processors]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}) participating in the changefeed job. If you're experiencing lagging ranges, `changefeed.total_ranges` may indicate that the number of ranges watched by aggregator processors in the job is unbalanced. You may want to try [pausing]({% link {{ page.version.version }}/pause-job.md %}) the changefeed and then [resuming]({% link {{ page.version.version }}/resume-job.md %}) it, so that the changefeed replans the work in the cluster. `changefeed.total_ranges` shares the same polling interval as the `changefeed.lagging_ranges` metric, which is controlled by the `lagging_ranges_polling_interval` option.

{{site.data.alerts.callout_success}}
-You can use the [`metrics_label`]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#using-changefeed-metrics-labels) option to track the `lagging_ranges` metric per changefeed.
+You can use the [`metrics_label`]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#using-changefeed-metrics-labels) option to track the `lagging_ranges` and `total_ranges` metric per changefeed.
{{site.data.alerts.end}}
6 changes: 4 additions & 2 deletions src/current/_includes/v24.3/cdc/lagging-ranges.md
@@ -1,10 +1,12 @@
-Use the `changefeed.lagging_ranges` metric to track the number of ranges that are behind in a changefeed. This is calculated based on the [changefeed options]({% link {{ page.version.version }}/create-changefeed.md %}#options):
+Use the `changefeed.lagging_ranges` metric to track the number of [ranges]({% link {{ page.version.version }}/architecture/overview.md %}#range) that are behind in a changefeed. This is calculated based on the [changefeed options]({% link {{ page.version.version }}/create-changefeed.md %}#options):

- `lagging_ranges_threshold` sets a duration from the present that determines the length of time a range is considered to be lagging behind, which will then track in the [`lagging_ranges`]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#lagging-ranges-metric) metric. Note that ranges undergoing an [initial scan]({% link {{ page.version.version }}/create-changefeed.md %}#initial-scan) for longer than the threshold duration are considered to be lagging. Starting a changefeed with an initial scan on a large table will likely increment the metric for each range in the table. As ranges complete the initial scan, the number of ranges lagging behind will decrease.
- **Default:** `3m`
- `lagging_ranges_polling_interval` sets the interval rate for when lagging ranges are checked and the `lagging_ranges` metric is updated. Polling adds latency to the `lagging_ranges` metric being updated. For example, if a range falls behind by 3 minutes, the metric may not update until an additional minute afterward.
- **Default:** `1m`

+{% include_cached new-in.html version="v24.3" %} Use the `changefeed.total_ranges` metric to monitor the number of ranges that are watched by [aggregator processors]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}) participating in the changefeed job. If you're experiencing lagging ranges, `changefeed.total_ranges` may indicate that the number of ranges watched by aggregator processors in the job is unbalanced. You may want to try [pausing]({% link {{ page.version.version }}/pause-job.md %}) the changefeed and then [resuming]({% link {{ page.version.version }}/resume-job.md %}) it, so that the changefeed replans the work in the cluster. `changefeed.total_ranges` shares the same polling interval as the `changefeed.lagging_ranges` metric, which is controlled by the `lagging_ranges_polling_interval` option.

{{site.data.alerts.callout_success}}
-You can use the [`metrics_label`]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#using-changefeed-metrics-labels) option to track the `lagging_ranges` metric per changefeed.
+You can use the [`metrics_label`]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#using-changefeed-metrics-labels) option to track the `lagging_ranges` and `total_ranges` metric per changefeed.
{{site.data.alerts.end}}
1 change: 1 addition & 0 deletions src/current/v23.1/monitor-and-debug-changefeeds.md
@@ -156,6 +156,7 @@ changefeed_emitted_bytes{scope="vehicles"} 183557
`error_retries` | Total retryable errors encountered by changefeeds. | Errors
`backfill_pending_ranges` | Number of [ranges]({% link {{ page.version.version }}/architecture/overview.md %}#architecture-range) in an ongoing backfill that are yet to be fully emitted. | Ranges
`message_size_hist` | Distribution in the size of emitted messages. | Bytes
+<span class="version-tag">New in v23.1.29:</span> `total_ranges` | Total number of ranges that are watched by [aggregator processors]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}) participating in the changefeed job. `changefeed.total_ranges` shares the same polling interval as the [`changefeed.lagging_ranges`](#lagging-ranges-metric) metric, which is controlled by the `changefeed.lagging_ranges_polling_interval` [cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}). For more details, refer to [Lagging ranges](#lagging-ranges).

### Monitoring and measuring changefeed latency

1 change: 1 addition & 0 deletions src/current/v23.2/monitor-and-debug-changefeeds.md
@@ -156,6 +156,7 @@ changefeed_emitted_bytes{scope="vehicles"} 183557
`error_retries` | Total retryable errors encountered by changefeeds. | Errors
`backfill_pending_ranges` | Number of [ranges]({% link {{ page.version.version }}/architecture/overview.md %}#architecture-range) in an ongoing backfill that are yet to be fully emitted. | Ranges
`message_size_hist` | Distribution in the size of emitted messages. | Bytes
+<span class="version-tag">New in v23.2.13:</span> `total_ranges` | Total number of ranges that are watched by [aggregator processors]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}) participating in the changefeed job. `changefeed.total_ranges` shares the same polling interval as the [`changefeed.lagging_ranges`](#lagging-ranges-metric) metric, which is controlled by the `lagging_ranges_polling_interval` option. For more details, refer to [Lagging ranges](#lagging-ranges).

### Monitoring and measuring changefeed latency

1 change: 1 addition & 0 deletions src/current/v24.1/monitor-and-debug-changefeeds.md
@@ -153,6 +153,7 @@ changefeed_emitted_bytes{scope="vehicles"} 183557
`changefeed.message_size_hist` | Distribution in the size of emitted messages. | Bytes | Histogram
`changefeed.running` | Number of currently running changefeeds, including sinkless changefeeds. | Changefeeds | Gauge
`changefeed.sink_batch_hist_nanos` | Time messages spend batched in the sink buffer before being flushed and acknowledged. | Nanoseconds | Histogram
+<span class="version-tag">New in v24.1.6:</span> `changefeed.total_ranges` | Total number of ranges that are watched by [aggregator processors]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}) participating in the changefeed job. `changefeed.total_ranges` shares the same polling interval as the [`changefeed.lagging_ranges`](#lagging-ranges-metric) metric, which is controlled by the `lagging_ranges_polling_interval` option. For more details, refer to [Lagging ranges](#lagging-ranges).

### Monitoring and measuring changefeed latency
