ci: add CI to deploy stac data pipeline to k8s (#3459)
* feat: added stac data pipeline to be deployed to k8s
* feat: added ci to bump version
1 parent c4e0e1e commit a204b3e
Showing 7 changed files with 220 additions and 0 deletions.
@@ -0,0 +1,67 @@
name: Bump geo-undpstac-pipeline version
on:
  # This workflow is triggered when a new release tag is created on the geohub-data-pipeline repository.
  # https://github.com/UNDP-Data/geohub-data-pipeline/blob/main/.github/workflows/acr_docker_image.yml
  repository_dispatch:
    types: [bump-stacpipeline-version]
  workflow_dispatch:

jobs:
  bump-version:
    runs-on: ubuntu-latest
    env:
      OWNER: undp-data
      REPO: geo-undpstac-pipeline
    steps:
      - name: checkout
        uses: actions/checkout@v4
        with:
          ref: develop

      - name: get the latest version
        id: pipeline
        uses: pozetroninc/github-action-get-latest-release@master
        with:
          owner: ${{ env.OWNER }}
          repo: ${{ env.REPO }}
          excludes: prerelease, draft
          token: ${{ secrets.GITHUB_TOKEN }}

      - name: bump geo-undpstac-pipeline version
        working-directory: backends/k8s/stac-pipeline/yaml
        env:
          PIPELINE_VERSION: ${{ steps.pipeline.outputs.release }}
          YAML: deployment.yaml
        run: |
          # OWNER and REPO come from the job-level env, PIPELINE_VERSION and YAML from this step
          echo "Latest release version: ${PIPELINE_VERSION}"
          imagename="undpgeohub.azurecr.io/${OWNER}/${REPO}"
          pattern="${imagename}:[^ ]*"
          sed "s|${pattern}|${imagename}:${PIPELINE_VERSION}|g" "${YAML}" > temp.yaml
          # replace yaml file with new version
          mv temp.yaml "${YAML}"
          echo "tag version was replaced with ${PIPELINE_VERSION}"
      - name: Create Pull Request
        uses: peter-evans/create-pull-request@v6
        with:
          branch: release/bump-geo-undpstac-pipeline
          title: "[RELEASE] bump version of geo-undpstac-pipeline"
          delete-branch: true
          commit-message: "[RELEASE] bump version of geo-undpstac-pipeline"
          body: |
            ## Description
            This bumps the version of geo-undpstac-pipeline so that the new pipeline Docker image is applied to the Kubernetes cluster.

            ---
            - Auto-generated by [create-pull-request][1]

            [1]: https://github.com/peter-evans/create-pull-request
          labels: release
          reviewers: |
            iferencik
            Thuhaa
            JinIgarashi
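
The version bump step in this workflow can be tried locally before relying on it in CI. The following is a minimal sketch, not the workflow's own code: `bump_image_tag` is a hypothetical helper name, and `v0.0.2` is a made-up tag used only for illustration. It applies the same `sed` substitution to a throwaway file holding the image line from `deployment.yaml`.

```shell
# bump_image_tag FILE IMAGE NEW_TAG (hypothetical helper, for illustration only)
# Rewrites every "IMAGE:<tag>" occurrence in FILE to "IMAGE:NEW_TAG".
bump_image_tag() {
  file=$1
  image=$2
  new_tag=$3
  # [^ ]* matches the old tag up to the next space or the end of the line
  sed "s|${image}:[^ ]*|${image}:${new_tag}|g" "$file" > "${file}.tmp" &&
    mv "${file}.tmp" "$file"
}

# Try it on a temporary file containing the image line from deployment.yaml
demo="$(mktemp)"
echo '          image: undpgeohub.azurecr.io/undp-data/geo-undpstac-pipeline:v0.0.1' > "$demo"
bump_image_tag "$demo" "undpgeohub.azurecr.io/undp-data/geo-undpstac-pipeline" "v0.0.2"
result="$(cat "$demo")"
echo "$result"
rm -f "$demo"
```

Because the pattern stops at the first space, anything that follows the image reference on the same line, such as a trailing comment, is left untouched.
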
@@ -0,0 +1,42 @@
# undp stac data pipeline

[geo-undpstac-pipeline](https://github.com/UNDP-Data/geo-undpstac-pipeline) is a command line tool that ingests datasets and converts them into STAC items. The pipeline is deployed to Azure Kubernetes Service as a KEDA ScaledJob, which is triggered by messages arriving in an Azure Service Bus queue.

- [Namespace](#namespace)
- [Installation](#installation)
- [Uninstall](#uninstall)
- [Notes](#notes)

## Namespace

The pipeline lives in its own namespace, **stac**, and runs as a ScaledJob limited to one job at a time.

## Installation

A secret holding the Azure Storage and Azure Service Bus connection strings is required. `install.sh` applies `deployment.yaml` with kubectl and then creates the secret from the variables in `.env`.

```shell
cd scripts
cp .env.example .env
# set environmental variables in .env
./install.sh
```

The above commands create the following resources:

- namespace
- scaledjob
- secret
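
To confirm that the installation worked, the created resources can be listed with kubectl. This is a small sketch rather than part of the shipped scripts: `verify_install` is a hypothetical helper name, and it assumes kubectl is already configured for the target cluster and that KEDA is installed there. The resource names are taken from `deployment.yaml` and `install.sh`.

```shell
# Hypothetical helper: lists the resources that install.sh is expected to create.
# Stops at the first lookup that fails.
verify_install() {
  kubectl get namespace stac &&
    kubectl get scaledjob stac-scaledjob -n stac &&
    kubectl get secret stac-secrets -n stac
}
```

Run `verify_install` after `./install.sh`; a non-zero exit status means one of the three resources is missing.
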

## Uninstall

To uninstall, run the uninstall script. It deletes the secret and then the resources defined in `deployment.yaml`.

```shell
cd scripts
./uninstall.sh
```

## Notes

Processing the night time light data for 6 January 2024 took around 12 minutes on a pod with 20 GB of RAM allocated. `activeDeadlineSeconds` is set to 3600 seconds (1 hour), so a job that is still running after 1 hour is stopped automatically. If the job finishes within the hour, the container stops and the job is deleted automatically.

This scaled job is deployed to the `manual` node pool, which can autoscale up to 2 nodes. Currently, all pods for titiler and titiler-dev fit on a single node. When a message is added to the queue, that node does not have enough free resources to launch the scaled job, so k8s scales the pool up to 2 nodes. Once the pipeline job has finished, the second node is removed automatically after some time (probably around 15 to 20 minutes).
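
The scale-up and scale-down described in these notes can be observed from the command line. The sketch below is only an illustration: `show_pipeline_state` is a hypothetical helper name, it assumes kubectl is configured for the cluster, and it assumes the nodes of the `manual` pool carry the `type=manual` label that the `nodeSelector` in `deployment.yaml` selects on.

```shell
# Hypothetical helper: shows the jobs started by the ScaledJob and the nodes they can run on.
show_pipeline_state() {
  # Jobs created by the ScaledJob (none while the queue is empty)
  kubectl get jobs -n stac
  # Nodes in the pool selected by the job's nodeSelector
  kubectl get nodes -l type=manual
}
```

Calling `show_pipeline_state` shortly after a message is queued should show one job and, once the autoscaler has reacted, a second node.
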
@@ -0,0 +1,3 @@
AZURE_STORAGE_CONNECTION_STRING=
AZURE_SERVICE_BUS_CONNECTION_STRING=
AZURE_SERVICE_BUS_QUEUE_NAME=undp-stac-pipeline
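
Because `install.sh` reads this file with the shell's `.` command, every line is parsed as shell code. Azure connection strings contain `;`, which would end an unquoted assignment early, so the values most likely need to be wrapped in single quotes. The sketch below demonstrates this with a temporary file; `EXAMPLE_CONNECTION_STRING` and its value are placeholders invented for the demonstration, not real credentials or variable names from this repository.

```shell
# Write a throwaway env file with a single-quoted, placeholder connection string
tmp_env="$(mktemp)"
cat > "$tmp_env" <<'EOF'
EXAMPLE_CONNECTION_STRING='Endpoint=sb://example.invalid/;SharedAccessKeyName=demo;SharedAccessKey=not-a-real-key'
EOF

# Source it the same way install.sh sources .env
. "$tmp_env"

# The whole value survives, including the parts after each ';'
printf '%s\n' "$EXAMPLE_CONNECTION_STRING"
rm -f "$tmp_env"
```

Without the single quotes, the shell would keep only the text before the first `;` and treat the remainder as separate statements.
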
@@ -0,0 +1,17 @@
#!/bin/bash

NAMESPACE=stac
SECRET_NAME=stac-secrets

# Run from the directory that contains this script so the relative paths below resolve
cd "$(dirname "$0")" || exit 1
# Source the .env file located in the same directory as the script
. ./.env
# Create the namespace and the scaled job
kubectl apply -f ../yaml/deployment.yaml
# create secret with environmental variables (quoted so special characters survive)
kubectl create secret generic "$SECRET_NAME" \
  --from-literal=AZURE_STORAGE_CONNECTION_STRING="$AZURE_STORAGE_CONNECTION_STRING" \
  --from-literal=AZURE_SERVICE_BUS_CONNECTION_STRING="$AZURE_SERVICE_BUS_CONNECTION_STRING" \
  --from-literal=AZURE_SERVICE_BUS_QUEUE_NAME="$AZURE_SERVICE_BUS_QUEUE_NAME" \
  -n "$NAMESPACE"
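
`kubectl create secret` fails when the secret already exists, so running the install script a second time stops with an error at that step. One common way around this, shown here as a sketch and not as the script's actual behaviour, is to render the secret with `--dry-run=client -o yaml` and pipe it to `kubectl apply`. `apply_secret` is a hypothetical helper name; the secret name, namespace, and variable names are the ones used in `install.sh`.

```shell
# Hypothetical helper: create the secret, or update it in place if it already exists.
apply_secret() {
  # --dry-run=client renders the manifest locally instead of sending a create request
  kubectl create secret generic stac-secrets \
    --from-literal=AZURE_STORAGE_CONNECTION_STRING="$AZURE_STORAGE_CONNECTION_STRING" \
    --from-literal=AZURE_SERVICE_BUS_CONNECTION_STRING="$AZURE_SERVICE_BUS_CONNECTION_STRING" \
    --from-literal=AZURE_SERVICE_BUS_QUEUE_NAME="$AZURE_SERVICE_BUS_QUEUE_NAME" \
    -n stac \
    --dry-run=client -o yaml |
    kubectl apply -f -
}
```

After sourcing `.env`, calling `apply_secret` can be repeated safely, which also makes it a way to rotate the connection strings.
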
@@ -0,0 +1,3 @@
kubectl delete secret stac-secrets --ignore-not-found -n stac
kubectl delete -f ../yaml/deployment.yaml
@@ -0,0 +1,55 @@
---
apiVersion: v1
kind: Namespace
metadata:
  name: stac
---
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: stac-scaledjob
  namespace: stac
spec:
  jobTargetRef:
    parallelism: 1
    completions: 1
    activeDeadlineSeconds: 3600
    backoffLimit: 5
    template:
      spec:
        nodeSelector:
          type: "manual"
        containers:
          - name: stac
            image: undpgeohub.azurecr.io/undp-data/geo-undpstac-pipeline:v0.0.1
            imagePullPolicy: Always
            command: ["python3"]
            args: ["-m", "undpstac_pipeline.cli", "queue"]
            resources:
              limits:
                memory: "20G"
                cpu: "2000m"
            envFrom:
              - secretRef:
                  name: stac-secrets
                  optional: false
        restartPolicy: Never
  pollingInterval: 30 # Optional. Default: 30 seconds
  successfulJobsHistoryLimit: 0 # Optional. Default: 100. How many completed jobs should be kept.
  failedJobsHistoryLimit: 0 # Optional. Default: 100. How many failed jobs should be kept.
  envSourceContainerName: stac # Optional. Default: .spec.JobTargetRef.template.spec.containers[0]
  minReplicaCount: 0 # Optional. Default: 0
  maxReplicaCount: 1 # Optional. Default: 100
  rollout:
    strategy: gradual # Optional. Default: default. Which Rollout Strategy KEDA will use.
    propagationPolicy: foreground # Optional. Default: background. Kubernetes propagation policy for cleaning up existing jobs during rollout.
  scalingStrategy:
    strategy: default
  triggers:
    - type: azure-servicebus
      metadata:
        queueName: undp-stac-pipeline
        namespace: undpgeohub
        messageCount: "1" # default 5, scale/spin a pod for every message
        activationMessageCount: "0" # default 0, ensure no pods exist if no messages exist in the queue
        connectionFromEnv: AZURE_SERVICE_BUS_CONNECTION_STRING
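
The interplay of `messageCount`, `maxReplicaCount`, and the `default` scaling strategy can be worked through with a little arithmetic. The sketch below is a simplified model and an assumption about KEDA's behaviour, not its implementation: it takes the number of jobs to start on each poll to be min(`maxReplicaCount`, ceil(queue length / `messageCount`)) minus the jobs already running. `jobs_to_start` is a hypothetical function name.

```shell
# jobs_to_start QUEUE_LENGTH RUNNING_JOBS [MESSAGE_COUNT] [MAX_REPLICAS]
# Simplified model of the "default" strategy; defaults mirror this manifest (1 and 1).
jobs_to_start() {
  queue_length=$1
  running=$2
  message_count=${3:-1}
  max_replicas=${4:-1}
  # ceil(queue_length / message_count) using integer arithmetic
  wanted=$(( (queue_length + message_count - 1) / message_count ))
  # never exceed maxReplicaCount
  if [ "$wanted" -gt "$max_replicas" ]; then wanted=$max_replicas; fi
  to_start=$(( wanted - running ))
  if [ "$to_start" -lt 0 ]; then to_start=0; fi
  echo "$to_start"
}

jobs_to_start 3 0   # three queued messages, nothing running
jobs_to_start 3 1   # three queued messages, one job already running
jobs_to_start 0 0   # empty queue
```

Under this model, with `maxReplicaCount: 1` only one job runs at a time regardless of the queue length, and the remaining messages wait until that job has finished.
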