DOC-923 Update tasks to compute units for resource allocation #207

Draft: wants to merge 4 commits into base: main
= Manage Pipeline Resources
:description: Learn how to set an initial resource limit for a standard data pipeline (excluding Ollama AI components) and how to manually scale the pipeline’s resources to improve performance.
:page-aliases: develop:connect/configuration/scale-pipelines.adoc

{description}

== Prerequisites

- A running xref:get-started:cluster-types/byoc/index.adoc[BYOC] (not BYOVPC), xref:get-started:cluster-types/dedicated/create-dedicated-cloud-cluster.adoc[Dedicated], xref:get-started:cluster-types/serverless-pro.adoc[Serverless Pro], or xref:get-started:cluster-types/serverless.adoc[Serverless Standard] cluster.
- An estimate of the throughput of your data pipeline. You can get some basic statistics by running your data pipeline locally using the xref:redpanda-connect:components:processors/benchmark.adoc[`benchmark` processor].

=== Understanding compute units

A compute unit allocates a specific amount of server resources (CPU and memory) to a data pipeline to handle message throughput. By default, each pipeline is allocated one compute unit, which includes 0.1 CPU (100 milliCPU or `100m`) and 400 MB (`400M`) of memory.

For sizing purposes, one compute unit supports an estimated message throughput of 1 MB/sec. However, actual performance depends on the complexity of a pipeline, including the components it contains and the processing it does.

Server resources are charged at an hourly rate in compute unit hours, and you can allocate up to a maximum of 15 compute units per pipeline.

|===
| Number of compute units | CPU | Memory

| 1
| 0.1 CPU (`100m`)
| 400 MB (`400M`)

| 2
| 0.2 CPU (`200m`)
| 800 MB (`800M`)

| 3
| 0.3 CPU (`300m`)
| 1.2 GB (`1200M`)

| 4
| 0.4 CPU (`400m`)
| 1.6 GB (`1600M`)

| 5
| 0.5 CPU (`500m`)
| 2.0 GB (`2000M`)

| 6
| 0.6 CPU (`600m`)
| 2.4 GB (`2400M`)

| 7
| 0.7 CPU (`700m`)
| 2.8 GB (`2800M`)

| 8
| 0.8 CPU (`800m`)
| 3.2 GB (`3200M`)

| 9
| 0.9 CPU (`900m`)
| 3.6 GB (`3600M`)

| 10
| 1.0 CPU (`1000m`)
| 4.0 GB (`4000M`)

| 11
| 1.1 CPU (`1100m`)
| 4.4 GB (`4400M`)

| 12
| 1.2 CPU (`1200m`)
| 4.8 GB (`4800M`)

| 13
| 1.3 CPU (`1300m`)
| 5.2 GB (`5200M`)

| 14
| 1.4 CPU (`1400m`)
| 5.6 GB (`5600M`)

| 15
| 1.5 CPU (`1500m`)
| 6.0 GB (`6000M`)

|===

NOTE: For pipelines with embedded Ollama AI components, one GPU is automatically allocated to the pipeline, which is equivalent to 30 compute units, or 3.0 CPU (`3000m`) and 12 GB of memory (`12000M`).
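Because the allocation scales linearly with the compute-unit count, every row of the table follows from the one-unit baseline. A minimal sketch of that arithmetic (the helper is hypothetical, not part of any Redpanda tooling):

```python
def resources_for(units: int) -> tuple[str, str]:
    """Return (CPU in milliCPU, memory in MB) for a compute-unit count.

    One compute unit is 0.1 CPU (100m) and 400 MB (400M); a pipeline
    can be allocated at most 15 compute units.
    """
    if not 1 <= units <= 15:
        raise ValueError("a pipeline supports 1 to 15 compute units")
    return f"{units * 100}m", f"{units * 400}M"

print(resources_for(1))   # one unit: ('100m', '400M')
print(resources_for(15))  # maximum:  ('1500m', '6000M')
```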

=== Set an initial resource limit

When you create a data pipeline, you can allocate a fixed amount of server resources to it using compute units.

To set an initial resource limit:
. Select the **Redpanda Connect** tab.
. Click **Create pipeline**.
. Enter details for your pipeline, including a short name and description.
. In the **Compute units** box, leave the default **1** compute unit to experiment with pipelines that create low message volumes. For higher throughputs, you can allocate up to a maximum of 15 compute units.
. Add your pipeline configuration and click **Create** to run it.
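When choosing a value for the **Compute units** box, the 1 MB/sec rule of thumb gives a starting point. A rough sizing sketch (the helper is hypothetical, and actual needs depend on pipeline complexity):

```python
import math

def units_for_throughput(mb_per_sec: float, max_units: int = 15) -> int:
    """Estimate compute units for a target throughput.

    Rule of thumb from this page: one compute unit supports roughly
    1 MB/sec. Treat the result as a starting point, not a guarantee.
    """
    units = max(1, math.ceil(mb_per_sec))
    if units > max_units:
        raise ValueError(f"throughput exceeds {max_units} compute units")
    return units

print(units_for_throughput(0.5))  # low-volume pipeline: 1
print(units_for_throughput(3.2))  # ~3.2 MB/sec: 4
```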

=== Scale resources

View the server resources allocated to a data pipeline, and manually scale those resources to improve performance or decrease resource consumption.

To view resources already allocated to a data pipeline:

[tabs]
=====
Cloud UI::
+
--
. Go to the cluster where the pipeline is set up.
. On the **Connect** page, select your pipeline and look at the value for **Resources**.
+
* CPU resources are displayed first, in milliCPU. For example, `1` compute unit is `100m` or 0.1 CPU.
* Memory is displayed next, in megabytes. For example, `1` compute unit is `400M` or 400 MB.

--
Data Plane API::
+
--
. xref:manage:api/cloud-api-quickstart.adoc#try-the-cloud-api[Authenticate and get the base URL] for the Data Plane API.
. Make a request to xref:api:ROOT:cloud-dataplane-api.adoc#get-/v1alpha2/redpanda-connect/pipelines[`GET /v1alpha2/redpanda-connect/pipelines`], which lists details of all pipelines on your cluster by ID.
+
* Memory (`memory_shares`) is displayed in megabytes. For example, `1` compute unit is `400M` or 400 MB.
* CPU resources (`cpu_shares`) are displayed in milliCPU. For example, `1` compute unit is `100m` or 0.1 CPU.

--
=====
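The `cpu_shares` and `memory_shares` strings returned by the Data Plane API can be converted to numbers in a short script. A minimal sketch; only the field names and units come from this page, and the exact payload shape is an assumption:

```python
def parse_resources(pipeline: dict) -> tuple[float, int]:
    """Convert a pipeline's resource strings to (CPU cores, memory in MB).

    The Data Plane API reports cpu_shares in milliCPU (for example "100m")
    and memory_shares in megabytes (for example "400M").
    """
    res = pipeline["resources"]
    cpu_cores = int(res["cpu_shares"].rstrip("m")) / 1000
    memory_mb = int(res["memory_shares"].rstrip("M"))
    return cpu_cores, memory_mb

# Illustrative entry shaped like one item from
# GET /v1alpha2/redpanda-connect/pipelines (layout assumed):
pipeline = {"id": "p1", "resources": {"cpu_shares": "100m", "memory_shares": "400M"}}
print(parse_resources(pipeline))  # (0.1, 400)
```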
To manually scale the resources allocated to a pipeline:

[tabs]
=====
Cloud UI::
+
--
. Log in to https://cloud.redpanda.com[Redpanda Cloud^].
. Go to the cluster where the pipeline is set up.
. On the **Connect** page, select your pipeline and click **Edit**.
. In the **Compute units** box, update the number of compute units. One compute unit provides a message throughput of approximately 1 MB/sec. For higher throughputs, you can allocate up to a maximum of 15 compute units per pipeline.
. Click **Update** to apply your changes. The specified resources are available immediately.

--
=====
.modules/develop/pages/connect/connect-quickstart.adoc

All Redpanda Connect configurations use a YAML file split into three sections:

. Go to the **Connect** page on your cluster and click **Create pipeline**.
. In **Pipeline name**, enter **emailprocessor-pipeline** and add a short description. For example, **Transforms email data using a mutation processor**.
. In the **Compute units** box, leave the default value of **1**. Compute units are used to allocate server resources to a pipeline. One compute unit is equivalent to 0.1 CPU and 400 MB of memory.
. In the **Configuration** box, paste the following configuration.

When you've finished experimenting with your data pipeline, you can delete the pipeline.
* Choose xref:develop:connect/components/catalog.adoc[connectors for your use case].
* Learn how to xref:develop:connect/configuration/secret-management.adoc[add secrets to your pipeline].
* Learn how to xref:develop:connect/configuration/monitor-connect.adoc[monitor a data pipeline on a BYOC or Dedicated cluster].
* Learn how to xref:develop:connect/configuration/scale-pipelines.adoc[manually scale resources for a pipeline].
* Learn how to xref:redpanda-connect:guides:getting_started.adoc[configure, test, and run a data pipeline locally].