
DOC-923 Update tasks to compute units for resource allocation #207

Draft: wants to merge 4 commits into main

Conversation

@asimms41 (Contributor) commented Feb 20, 2025

Description

Resolves https://github.com/redpanda-data/documentation-private/issues/
Review deadline: 24th February

Page previews

Manage Pipeline Resources
Quickstart

This pull request updates the documentation for managing pipeline resources, focusing on updated terminology and refined instructions to improve clarity and accuracy.

Key changes include:

Terminology updates:

  • Updated the term "tasks" to "compute units" throughout the documentation to better reflect the allocation of server resources. (modules/develop/pages/connect/configuration/resource-management.adoc)
  • Adjusted the maximum number of compute units from 18 to 15. (modules/develop/pages/connect/configuration/resource-management.adoc)

Instruction refinements:

  • Clarified the description and prerequisites for managing pipeline resources. (modules/develop/pages/connect/configuration/resource-management.adoc, L1-R21)
  • Updated the instructions for setting initial resource limits and scaling resources to reflect the new terminology and refined limits. (modules/develop/pages/connect/configuration/resource-management.adoc)

Cross-references and examples:

  • Updated cross-references and examples to align with the new terminology and improved instructions. (modules/develop/pages/connect/connect-quickstart.adoc)

Checks

  • New feature
  • Content gap
  • Support Follow-up
  • Small fix (typos, links, copyedits, etc)

netlify bot commented Feb 20, 2025

Deploy Preview for rp-cloud ready!

Name Link
🔨 Latest commit 0ea628d
🔍 Latest deploy log https://app.netlify.com/sites/rp-cloud/deploys/67c09f4d81499800080d068b
😎 Deploy Preview https://deploy-preview-207--rp-cloud.netlify.app

:description: Learn how to set an initial resource limit for a standard data pipeline (excluding Ollama AI components) and how to manually scale the pipeline’s resources to improve performance.
:page-aliases: develop:connect/configuration/scale-pipelines.adoc

{description}

== Prerequisites

- A running xref:get-started:cluster-types/byoc/index.adoc[BYOC] (not BYOVPC) or xref:get-started:cluster-types/dedicated/create-dedicated-cloud-cluster.adoc[Dedicated cluster]
- A running xref:get-started:cluster-types/byoc/index.adoc[BYOC] (not BYOVPC) or xref:get-started:cluster-types/dedicated/create-dedicated-cloud-cluster.adoc[Dedicated], xref:get-started:cluster-types/serverless-pro.adoc[Serverless Pro], or xref:get-started:cluster-types/serverless.adoc[Serverless Standard] cluster.
Contributor Author:

From our production env it looks like you can now scale pipeline resources for all cluster types.

@nicolaferraro (Member) commented Feb 25, 2025:

I think serverless still keeps the value at 1, but the constraint will likely be removed

Contributor Author:

I tried both Serverless types in our test environments and they both seemed to scale (in terms of tasks/compute units). I guess I just need to know what you are going to release.


@nicolaferraro any reason why we cannot remove this restriction and allow Serverless users to configure up to 16 compute units (i.e. 1.6 cores and one instance)?

Contributor Author:

I've assumed that Serverless (Standard and Pro) will allow pipeline scaling.

@@ -153,7 +155,7 @@ Cloud UI::
. Log in to https://cloud.redpanda.com[Redpanda Cloud^].
. Go to the cluster where the pipeline is set up.
. On the **Connect** page, select your pipeline and click **Edit**.
. In the **Tasks** box, update the number of tasks. One task provides a message throughput of approximately 1 MB/sec. For higher throughputs, you can allocate up to a maximum of 18 tasks per pipeline.
. In the **Compute units** box, update the number of compute units. One compute unit provides a message throughput of approximately 1 MB/sec. For higher throughputs, you can allocate up to a maximum of 18 compute units per pipeline.
Contributor Author:

I can't see the updates to the UI yet in our prod or integration.

Member:

See above for max. Will make sure the UI is up to date.

@asimms41 (Contributor, Author) commented Feb 27, 2025:

Updated max to 15 as in thread below until limits for larger pipelines are configurable. Although I could add up to 90 tasks through the UI?

A task is a unit of computation that allocates a specific amount of CPU and memory to a data pipeline to handle message throughput. By default, each pipeline is allocated one task, which includes 0.1 CPU (100 milliCPU or `100m`) and 400 MB (`400M`) of memory, and provides a message throughput of approximately 1 MB/sec. You can allocate up to a maximum of 18 tasks per pipeline.
A compute unit allocates a specific amount of server resources (CPU and memory) to a data pipeline to handle message throughput. By default, each pipeline is allocated one compute unit, which includes 0.1 CPU (100 milliCPU or `100m`) and 400 MB (`400M`) of memory, and provides a message throughput of up to 1 MB/sec.

Server resources are charged at an hourly rate in compute unit hours and you can allocate up to a maximum of 18 compute units per pipeline.
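The per-unit figures above imply simple linear resource math. A minimal sketch (hypothetical helper, assuming the documented 0.1 CPU and 400 MB per compute unit scale linearly, with the 15-unit maximum agreed later in this thread):

```python
# Per-compute-unit allocation, as documented: 0.1 CPU (100m) and 400 MB.
CPU_MILLICORES_PER_UNIT = 100
MEMORY_MB_PER_UNIT = 400

def pipeline_resources(compute_units: int) -> dict:
    """Total CPU and memory allocated to a pipeline (hypothetical helper)."""
    return {
        "cpu_millicores": compute_units * CPU_MILLICORES_PER_UNIT,
        "memory_mb": compute_units * MEMORY_MB_PER_UNIT,
    }

print(pipeline_resources(1))   # default pipeline: 100m CPU, 400 MB
print(pipeline_resources(15))  # 15-unit pipeline: 1.5 cores, 6000 MB
```

The `100m`/`400M` notation in the doc follows Kubernetes resource conventions (millicores and megabytes).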
Member:

Maybe we can say that the maximum number of compute units varies by provider; it's no longer 18 after we did some tests.
It's 17 on AWS, 15 on GCP, and 16 on Azure (I don't think we need to state that explicitly).

Contributor Author:

@jrkinley - should we state the compute unit limit per provider or remove all mention of an upper limit?


@nicolaferraro wouldn't it be easier to set the same limit across all providers and take the lowest one, 15? Then we can say 75% of the server resource is available for scheduling pipelines and 25% is reserved for system overhead (i.e. the Kubernetes tax).


@asimms41 I'd caveat "up to 1 MB/s" as every pipeline can and probably will be wildly different. See my description here: What-is-a-Compute-Unit?


We are currently increasing the maximum pipeline size, so that pipelines twice and four times as large will be able to run. The compute units for the larger pipelines are:

  • 36 and 76 for AWS
  • 33 and 72 for GCP
  • 35 and 74 for Azure

Maybe we can say roughly what the limit is and that it depends on the provider? Not sure whether 15 vs 17 makes so much difference for users. The UI is going to validate it anyway.


I find it strange for the limits to be different. What was so different in testing to warrant a 200 millicore difference between providers?

Will all pipelines this large run on the 8-core instance sizes? i.e. we're not planning to use 4-core instance sizes for the 33-36 compute unit pipelines?


Yes, we provision 4-core and 8-core instances in addition to the initial 2-core ones, so that we don't waste too much.


We agreed with Nicola to go with the unified numbers across providers, which are:

  • 15 compute units on a 2-core machine
  • 33 on a 4-core machine
  • 72 on an 8-core machine
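As a sanity check, those unified maximums line up with the 0.1 CPU per compute unit and the ~75%-schedulable figure mentioned earlier in the thread (a rough sketch; the percentages are my own arithmetic, not from the PR):

```python
# Unified compute-unit maximums per machine size (cores -> max units).
unified_limits = {2: 15, 4: 33, 8: 72}

for cores, max_units in unified_limits.items():
    pipeline_cores = max_units * 0.1  # 0.1 CPU per compute unit
    share = pipeline_cores / cores    # fraction of the machine schedulable
    print(f"{cores}-core machine: {pipeline_cores:.1f} cores for pipelines "
          f"({share:.1%} of the machine)")
```

The 2-core machine comes out at exactly 75% schedulable; the larger machines dedicate a proportionally greater share to pipelines, since the system overhead doesn't grow with machine size.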


I think that makes sense. Thanks @tomasz-sadura @nicolaferraro.

@asimms41 (Contributor, Author) commented Feb 27, 2025:

Updated max to 15 until limits for larger pipelines are configurable.

@@ -109,12 +111,12 @@ To set an initial resource limit:
. Select the **Redpanda Connect** tab.
. Click **Create pipeline**.
. Enter details for your pipeline, including a short name and description.
. In the **Tasks** box, leave the default **1** task to experiment with pipelines that create low message volumes. For higher throughputs, you can allocate up to a maximum of 18 tasks.
. In the **Compute units** box, leave the default **1** compute unit to experiment with pipelines that create low message volumes. For higher throughputs, you can allocate up to a maximum of 18 compute units.
Member:

See above for the max

Contributor Author:

Updated max to 15 until limits for larger pipelines are configurable.


@asimms41 asimms41 requested a review from jrkinley February 25, 2025 09:42
