Create CronJob to run stats-pipeline on GKE (#52)
* Add CronJob to run the maptiles-runner

* Fix cloudbuild.yaml

* Rename runner.yaml

* Update cloudbuild.yaml

* Move concurrencyPolicy

* Fix env variables

* update deployment

* Move concurrencyPolicy to the right place.

* Set schedule from ENV variable.

* Add restartPolicy: Never

* Rename maptiles to stats-pipeline-runner

* Rename -runner to -cronjob

* Fix yaml name

* Update run-pipeline

* Remove extra \

* Rename _TAG to _DOCKER_TAG

* Rename PIPELINE_SCHEDULE to PIPELINE_CRON_SCHEDULE

* Rename to .template

* Add comments

* Added comment
robertodauria authored Feb 25, 2021
1 parent 4362b42 commit 4bafa31
Showing 4 changed files with 69 additions and 8 deletions.
42 changes: 38 additions & 4 deletions cloudbuild.yaml
@@ -1,11 +1,11 @@
 steps:
 - name: "gcr.io/cloud-builders/docker"
   id: "Build the docker container"
-  args: ["build", "-t", "gcr.io/$PROJECT_ID/stats-pipeline:$_TAG", "."]
+  args: ["build", "-t", "gcr.io/$PROJECT_ID/stats-pipeline:$_DOCKER_TAG", "."]

 - name: "gcr.io/cloud-builders/docker"
   id: "Push the docker container to gcr.io"
-  args: ["push", "gcr.io/$PROJECT_ID/stats-pipeline:$_TAG"]
+  args: ["push", "gcr.io/$PROJECT_ID/stats-pipeline:$_DOCKER_TAG"]

 - name: "gcr.io/cloud-builders/kubectl"
   id: "Create configmap manifest"
@@ -34,17 +34,20 @@ steps:
   - -c
   - |
     sed 's/{{GCLOUD_PROJECT}}/${PROJECT_ID}/g' \
-    k8s/data-processing/deployments/stats-pipeline.yaml > \
+    k8s/data-processing/deployments/stats-pipeline.yaml.template > \
     manifest.yaml

 - name: "gcr.io/cloud-builders/gke-deploy"
   id: "Create stats-pipeline deployment"
   args:
   - run
   - --filename=manifest.yaml
-  - --image=gcr.io/$PROJECT_ID/stats-pipeline:$_TAG
+  - --image=gcr.io/$PROJECT_ID/stats-pipeline:$_DOCKER_TAG
   - --location=$_COMPUTE_REGION
   - --cluster=$_CLUSTER
+  # gke-deploy will fail if the output folder is non-empty, thus we use
+  # different folders for the two executions of this tool.
+  - --output=pipeline/

 - name: "gcr.io/cloud-builders/kubectl"
   id: "Create stats-pipeline service"
@@ -55,3 +58,34 @@ steps:
   env:
   - CLOUDSDK_COMPUTE_REGION=$_COMPUTE_REGION
   - CLOUDSDK_CONTAINER_CLUSTER=$_CLUSTER
+
+- name: "gcr.io/cloud-builders/docker"
+  id: "Build the stats-pipeline-runner docker container"
+  args: ["build", "-t", "gcr.io/$PROJECT_ID/stats-pipeline-runner:$_DOCKER_TAG", "maptiles/"]
+
+- name: "gcr.io/cloud-builders/docker"
+  id: "Push the stats-pipeline-runner docker container to gcr.io"
+  args: ["push", "gcr.io/$PROJECT_ID/stats-pipeline-runner:$_DOCKER_TAG"]
+
+- name: "gcr.io/cloud-builders/gcloud"
+  id: "Generate manifest for the stats-pipeline-cronjob"
+  entrypoint: /bin/sh
+  args:
+  - -c
+  - |
+    sed -e 's/{{GCLOUD_PROJECT}}/${PROJECT_ID}/g' \
+    -e "s/{{PIPELINE_CRON_SCHEDULE}}/${_PIPELINE_CRON_SCHEDULE}/g" \
+    k8s/data-processing/jobs/stats-pipeline-cronjob.yaml.template > \
+    stats-pipeline-cronjob.yaml
+
+- name: "gcr.io/cloud-builders/gke-deploy"
+  id: "Create runner CronJob"
+  args:
+  - run
+  - --filename=stats-pipeline-cronjob.yaml
+  - --image=gcr.io/$PROJECT_ID/stats-pipeline-runner:$_DOCKER_TAG
+  - --location=$_COMPUTE_REGION
+  - --cluster=$_CLUSTER
+  # gke-deploy will fail if the output folder is non-empty, thus we use
+  # different folders for the two executions of this tool.
+  - --output=runner/
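
Review note on the "Generate manifest for the stats-pipeline-cronjob" step: the second sed expression uses `/` as its delimiter, so a schedule containing a slash (for example `*/30 * * * *`) would likely end the expression early and make sed fail. The following is a minimal sketch of the same substitution with `|` as the delimiter. The function name `render_template` and its arguments are hypothetical and not part of this commit, and the sketch assumes that neither value contains `|`, `&`, or a backslash.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): render a manifest template
# by replacing the two placeholders used in this change.
# Usage: render_template <template-file> <project-id> <cron-schedule>
render_template() {
    template=$1
    project=$2
    schedule=$3
    # '|' is the sed delimiter here, so a '/' inside the schedule is literal.
    # Assumption: neither value contains '|', '&' or a backslash.
    sed -e "s|{{GCLOUD_PROJECT}}|${project}|g" \
        -e "s|{{PIPELINE_CRON_SCHEDULE}}|${schedule}|g" \
        "$template"
}
```

In the build step this would correspond to switching only the delimiter in the existing `sed` call; the rendered output is still redirected to `stats-pipeline-cronjob.yaml`.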
24 changes: 24 additions & 0 deletions k8s/data-processing/jobs/stats-pipeline-cronjob.yaml.template
@@ -0,0 +1,24 @@
+# cronjob.yaml
+apiVersion: batch/v1beta1
+kind: CronJob
+metadata:
+  name: stats-pipeline-cronjob
+spec:
+  schedule: "{{PIPELINE_CRON_SCHEDULE}}"
+  concurrencyPolicy: Forbid
+  jobTemplate:
+    spec:
+      template:
+        spec:
+          restartPolicy: Never
+          containers:
+          - name: maptiles-runner
+            # The exact image to be deployed is replaced by gke-deploy, this is
+            # a placeholder
+            image: gcr.io/{{GCLOUD_PROJECT}}/stats-pipeline-runner
+            args:
+            - /bin/bash
+            - run-pipeline.sh
+            env:
+            - name: PROJECT
+              value: {{GCLOUD_PROJECT}}
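
Review note: because this file is a template, any `{{NAME}}` placeholder that the sed step misses would reach gke-deploy unrendered. A minimal sketch of a guard that could run between rendering and deploying is shown below. The function name `assert_rendered` is hypothetical, and the placeholder pattern (upper-case letters and underscores between double braces) is an assumption based on the two placeholders in this template.

```shell
#!/bin/sh
# Hypothetical check (not part of the commit): succeed only if the rendered
# manifest contains no leftover {{NAME}} placeholders.
# Usage: assert_rendered <manifest-file>
assert_rendered() {
    manifest=$1
    # Print offending lines (with line numbers) to stderr if any are found.
    if grep -n -E '\{\{[A-Z_]+\}\}' "$manifest" >&2; then
        echo "unrendered placeholder(s) in ${manifest}" >&2
        return 1
    fi
    return 0
}
```

For example, `assert_rendered stats-pipeline-cronjob.yaml` would return non-zero if `{{PIPELINE_CRON_SCHEDULE}}` were still present in the generated file.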
11 changes: 7 additions & 4 deletions maptiles/run-pipeline.sh
@@ -9,11 +9,14 @@ PROJECT=${PROJECT:?Please provide project}
 # Start stats-pipeline for the current year
 year=$(date +%Y)

-if ! curl -X POST "http://stats-pipeline-service:8080/v0/pipeline?year=${year}"; then
+if ! curl -X POST "http://stats-pipeline-service:8080/v0/pipeline?year=${year}&step=all"; then
     echo "Stats-pipeline failed, please check the container logs."
     exit 1
 fi

-echo "Stats-pipeline completed successfully, generating maptiles..."
-export GCS_BUCKET=maptiles-${PROJECT}
-make piecewise
+echo "Stats-pipeline completed successfully"
+# Note: this is disabled until the maptiles generation can run on multiple
+# years. Currently, 2020 is hardcoded and it would be pointless to regenerate
+# the maptiles every time the stats-pipeline runs.
+#export GCS_BUCKET=maptiles-${PROJECT}
+#make piecewise
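
Review note: without `--fail`, curl exits with status 0 when the server answers with an HTTP error such as 500, so the `if ! curl ...` check above only catches connection-level failures. The following is a minimal sketch of a stricter call. The function names `pipeline_url` and `trigger_pipeline` are hypothetical, and the sketch assumes the stats-pipeline service reports a failed run with a non-2xx status code, which this commit does not confirm.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the commit).

# Build the pipeline endpoint URL.
# Usage: pipeline_url <host:port> <year> <step>
pipeline_url() {
    printf 'http://%s/v0/pipeline?year=%s&step=%s' "$1" "$2" "$3"
}

# POST to the pipeline endpoint and return non-zero on HTTP errors.
# Usage: trigger_pipeline <host:port> <year> <step>
trigger_pipeline() {
    # --fail: exit non-zero on HTTP responses >= 400.
    # --silent --show-error: hide the progress meter but keep error messages.
    curl --fail --silent --show-error -X POST "$(pipeline_url "$@")"
}
```

In the script, the condition would then read `if ! trigger_pipeline stats-pipeline-service:8080 "${year}" all; then`, keeping the existing error message and `exit 1`.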
