RudderStack is a customer data pipeline tool for collecting, routing and processing data from your websites, apps, cloud tools, and data warehouse.
More information on RudderStack can be found here.
$ git clone git@github.com:rudderlabs/rudderstack-helm.git
$ cd rudderstack-helm/charts/rudderstack
$ helm dependency build
$ helm install my-release ./ --set rudderWorkspaceToken="<workspace token from the dashboard>"
The RudderStack Helm chart creates a RudderStack deployment on a Kubernetes cluster using the Helm package manager.
- kubectl installed and connected to your Kubernetes cluster
- Helm installed
- Workspace token from the RudderStack dashboard. Set up your account and copy your workspace token from the top of the home page.
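The prerequisites above can be checked from a shell before installing. The following is a minimal sketch and is not part of the chart: the function name `missing_tools` is hypothetical, and it only verifies that the binaries are on the PATH, not that kubectl can actually reach a cluster.

```shell
# missing_tools: print the name of every given command that is not on PATH.
# Hypothetical helper for illustration; it is not shipped with the chart.
missing_tools() {
  for tool in "$@"; do
    # `command -v` is the POSIX way to test whether a command is available.
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}
```

Running `missing_tools kubectl helm` prints nothing when both tools are installed, and otherwise lists the ones that still need to be set up.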
To install the chart with the release name my-release, from the root directory of this repo:
$ helm install my-release ./ --set rudderWorkspaceToken="<workspace token from the dashboard>"
The command deploys RudderStack on the default Kubernetes cluster configured with kubectl.
The configuration section lists the most significant parameters that can be configured during deployment.
To update the configuration or the version of the images used, change the configuration and run:
$ helm upgrade my-release ./ --set rudderWorkspaceToken="<workspace token from the dashboard>"
To uninstall/delete the my-release deployment:
$ helm uninstall my-release
This removes all the components created by this chart.
To do a dry run and check that the proposed changes would apply cleanly, run:
helm template ./ | kubectl apply --dry-run=client -f -
The chart supports three options for providing Postgres as a dependency:
- Deploying it as a Sidecar in the same stateful resource
- Deploying a new Statefulset with Postgres.
- Providing an external Postgres.
To enable the sidecar mode, specify:
postgresql:
  mode: sidecar
  statefulset_enabled: false
To enable the statefulset mode, specify:
postgresql:
  mode: statefulset
  statefulset_enabled: true
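As a sketch of how the two in-cluster modes above could be selected without editing values.yaml by hand, the helper below prints a small override file for a given mode. The function name `pg_mode_values` and the file name `pg-values.yaml` are hypothetical, and the nesting of both keys under `postgresql` is an assumption based on the snippets above. The external Postgres option is left out because its parameters are not described here.

```shell
# pg_mode_values MODE: print a values override for the given Postgres mode.
# Hypothetical helper; assumes both keys live under the `postgresql` key.
pg_mode_values() {
  case "$1" in
    sidecar)     enabled=false ;;
    statefulset) enabled=true ;;
    *)
      # Any other mode (including external Postgres) is not handled here.
      echo "unsupported mode: $1" >&2
      return 1
      ;;
  esac
  printf 'postgresql:\n  mode: %s\n  statefulset_enabled: %s\n' "$1" "$enabled"
}
```

For example, `pg_mode_values statefulset > pg-values.yaml` writes the override, which can then be passed to `helm install` or `helm upgrade` with the `-f pg-values.yaml` flag.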
Horizontal Pod Autoscaling (HPA) is available for cases where resource efficiency is required.
It is only recommended when the postgresql sidecar mode is enabled.
Currently, it is only supported with backend.controlPlaneJSON:true, since the pre-stop hook reads from the local config to guarantee that all events have reached their destination, so no event is lost while scaling down.
backend:
terminationGracePeriodSeconds: xx
lifecycleSleepTime: xx
hpa:
enabled: true
Also, make sure you set lifecycleSleepTime and terminationGracePeriodSeconds to values greater than BatchRouter.uploadFreqInS; otherwise Kubernetes will kill the pods before they flush their data to the destinations.
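The timing rule above can be checked with simple arithmetic before deploying. This is a minimal sketch: `hpa_timing_ok` is a hypothetical helper, all three values are assumed to be whole seconds, and you have to supply the `BatchRouter.uploadFreqInS` value from your own backend configuration.

```shell
# hpa_timing_ok UPLOAD_FREQ SLEEP_TIME GRACE_PERIOD
# Succeeds only if both lifecycleSleepTime and terminationGracePeriodSeconds
# are strictly greater than BatchRouter.uploadFreqInS (all in seconds).
hpa_timing_ok() {
  upload_freq=$1
  sleep_time=$2
  grace_period=$3
  [ "$sleep_time" -gt "$upload_freq" ] && [ "$grace_period" -gt "$upload_freq" ]
}
```

With an upload frequency of 30 seconds, `hpa_timing_ok 30 45 60` succeeds, while `hpa_timing_ok 30 20 60` fails because the sleep time is shorter than the upload interval.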
If you are using the open-source config-generator UI, set the parameter controlPlaneJSON to true in the values.yaml file. Export the workspace config from the config-generator and paste its contents into the workspaceConfig.json file.
$ helm install my-release ./ --set backend.controlPlaneJSON=true
Since the Chart is published under the {{ TBC by the RudderStack team }} page, you can extend it by adding it as a dependency of your own Chart, so there is no need to git clone this repo to deploy RudderStack open-source into your infrastructure.
apiVersion: v2
name: rudderstack
description: Customer Data Pipeline tool for collecting, routing and processing data.
maintainers:
  - name: Data Platform
    email: [email protected]
version: 0.4.5
appVersion: 1.16.0
dependencies:
  # https://github.com/rudderlabs/rudderstack-helm
  - name: rudderstack
    version: 0.4.5
    repository: https://TBC.github.io/rudderstack-helm # To Be Confirmed by the RudderStack team
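When the chart is consumed as a dependency like this, Helm expects the subchart's values to be nested under the dependency name in the parent chart's values.yaml. The sketch below prints such a file; `parent_values` is a hypothetical helper, and the token argument is a placeholder you would replace with your own.

```shell
# parent_values TOKEN: print a parent-chart values.yaml that forwards the
# workspace token to the `rudderstack` dependency.
# Hypothetical helper for illustration only.
parent_values() {
  printf 'rudderstack:\n  rudderWorkspaceToken: "%s"\n' "$1"
}
```

For example, `parent_values "<workspace token from the dashboard>" > values.yaml` writes the file; running `helm dependency update` in your own chart directory then fetches the dependency before you install.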
If you are using Google Cloud Storage or Google BigQuery for any of the following cases, replace the contents of the file rudder-google-application-credentials.json with your service account credentials:
- GCS as a destination
- GCS for dumping jobs
- BigQuery as a warehouse destination.
The following table lists the configurable parameters of the RudderStack chart and their default values.
Parameter | Description | Default |
---|---|---|
rudderWorkspaceToken | Workspace token from the dashboard | - |
backend.image.repository | Container image repository for the backend | rudderlabs/rudder-server |
backend.image.version | Container image tag for the backend. Available versions | v0.1.6 |
backend.image.pullPolicy | Container image pull policy for the backend image | Always |
transformer.image.repository | Container image repository for the transformer | rudderlabs/transformer |
transformer.image.version | Container image tag for the transformer. Available versions | v0.1.2 |
transformer.image.pullPolicy | Container image pull policy for the transformer image | Always |
backend.extraEnvVars | Extra environment variables to be used by the backend in the deployments | Refer to the values.yaml file |
backend.controlPlaneJSON | If true, the backend will read config from the workspaceConfig.json file | false |
Each of these parameters can be changed in values.yaml. Alternatively, specify each parameter using the --set key=value[,key=value] argument to helm install. For example:
$ helm install my-release ./ \
  --set backend.image.version=v0.1.6
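Several overrides can also be combined into a single `--set` argument, as the `key=value[,key=value]` syntax suggests. The helper below is a small sketch that joins pairs with commas; `join_set_args` is a hypothetical name, and it assumes the values themselves contain no commas, which Helm would otherwise need escaped.

```shell
# join_set_args KEY=VALUE...: join the given pairs with commas so they can be
# passed to a single --set flag. Hypothetical helper for illustration.
join_set_args() {
  joined=""
  for pair in "$@"; do
    if [ -z "$joined" ]; then
      joined=$pair
    else
      joined="$joined,$pair"
    fi
  done
  printf '%s\n' "$joined"
}
```

For example, `helm install my-release ./ --set "$(join_set_args backend.image.version=v0.1.6 backend.controlPlaneJSON=true)"` applies both overrides in one flag.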
Note: Configuration specific to:
- Backend can be edited in rudder-config.yaml.
- PostgreSQL can be edited in pg_hba.conf and postgresql.conf.
Installing this Helm chart will deploy the following pods and containers in the configured cluster:
- rudderstack-backend
- rudderstack-telegraf-sidecar
- rudderstack-postgresql-sidecar
- transformer
For any queries related to using the RudderStack Helm Chart, feel free to start a conversation on our Slack channel.