feat(): add cost tag for cronjobs #149

Open
wants to merge 2 commits into
base: master

@dsfwoo8172 dsfwoo8172 commented Apr 26, 2024

helm diff upgrade google-feed-job ./cronjob \
  -f sr_script/helm/preview/cron-workflows-common.yaml \
  -f sr_script/helm/preview/cron-workflows/google-feed-job.yaml \
  --set-string image.tag=3.237.0 \
  --set-string exitNotifications.healthcheckIo.uuid=11b635d9-a389-4a76-a51a-f9d28c9daab3 \
  --namespace core-legacy-po-cron

core-legacy-po-cron, google-feed-job, CronWorkflow (argoproj.io) has changed:
  # Source: cronjob/templates/cronjob.yaml
  # The "apiVersion" and "kind" for "Argo cron workflow"
  apiVersion: argoproj.io/v1alpha1
  kind: CronWorkflow
  metadata:
    name: google-feed-job
    namespace: core-legacy-po-cron
+   labels:
+     businessid: ""
  spec:
    schedule: "0 1,7,13,19 * * *"
    # If "kind" equals "CronWorkflow", a timezone can be set. If "kind" equals "CronWorkflow" and no timezone is set, "Asia/Taipei" will be used
    timezone: Etc/UTC
    successfulJobsHistoryLimit: 1
    failedJobsHistoryLimit: 1
    concurrencyPolicy: Forbid
    suspend: false # Set to "true" to suspend scheduling
    # If "startingDeadlineSeconds" is set, the pod will be terminated after that number of seconds. Otherwise, the pod is terminated ONLY when the job is done
    # If "kind" equals "CronWorkflow", the Argo cron workflow template will be loaded; otherwise, the k8s cronjob template will be loaded
    workflowSpec:
      podMetadata:
        labels:
          name: google-feed-job
+         businessid: ""
        annotations:
          "instrumentation.opentelemetry.io/container-names" : "main"
          "instrumentation.opentelemetry.io/inject-sdk" : "true"
          "karpenter.sh/do-not-disrupt" : "true"
      workflowMetadata:
        labels:
          name: google-feed-job
+         businessid: ""
        annotations:
          "instrumentation.opentelemetry.io/container-names" : "main"
          "instrumentation.opentelemetry.io/inject-sdk" : "true"
          "karpenter.sh/do-not-disrupt" : "true"
      nodeSelector:
        karpenter.sh/capacity-type: on-demand
      # If .Values.job.timeout is null, the pod will be killed ONLY when the job is done. Otherwise, the pod will be killed after the timeout you set
      metrics:
        prometheus:
          # Metric name (will be prepended with "argo_workflows_")
          - name: cron_workflow_exec_duration_gauge
          # Labels are optional. Avoid cardinality explosion.
            labels:
              - key: name
                value: "{{workflow.labels.name}}"
              - key: namespace
                value: "{{workflow.namespace}}"
            # A help doc describing your metric. This is required.
            help: "Duration gauge by name"
            # The metric type. Available are "gauge", "histogram", and "counter".
            gauge:
            # The value of your metric. It could be an Argo variable (see variables doc) or a literal value
              value: "{{workflow.duration}}"
              realtime: true
          - name: cron_workflow_fail_count
            labels:
              - key: name
                value: "{{workflow.labels.name}}"
              - key: namespace
                value: "{{workflow.namespace}}"
            help: "Count of execution by fail status"
            # Emit the metric conditionally. Works the same as normal "when"
            when: "{{status}} != Succeeded"
            counter:
              # This increments the counter by 1
              value: "1"
          - name: cron_workflow_success_count
            labels:
              - key: name
                value: "{{workflow.labels.name}}"
              - key: namespace
                value: "{{workflow.namespace}}"
            help: "Count of execution by success status"
            # Emit the metric conditionally. Works the same as normal "when"
            when: "{{status}} == Succeeded"
            counter:
              # This increments the counter by 1
              value: "1"
      entrypoint: entry
      # If no exitNotifications config is set, the default exit-handler of the Argo server will be used
      onExit: exit-handler
      podDisruptionBudget:
        # Documentation: https://argoproj.github.io/argo-workflows/fields/#poddisruptionbudgetspec
        # Provide arbitrary big number if you don't know how many pods workflow creates
        minAvailable: 9999
      templates:
        - name: entry
          steps:
            - - name: step1
                template: template
        - name: template
          metadata:
            namespace: core-legacy-po-cron
          container:
            image: '332947256684.dkr.ecr.ap-southeast-1.amazonaws.com/sr_script:3.237.0'
            # The command to call the function of the image
            command:
              - bundle
              - exec
              - rails
              - runner
              - /srv/www/api/sr-script/argo-cronjobs/google-feed-job.rb
            # The resources will be applied if "resource" is set
            resources:
              limits:
                cpu: 1
                memory: 6Gi
              requests:
                cpu: 1
                memory: 6Gi
            env:
              - name: POD_NAME
                value: google-feed-job
              - name: ARGO_WORKFLOW_NAME
                value: "{{workflow.name}}"
              - name: ENABLE_RAKE_OTEL_TRACING
                value: "true"
              - name: ENABLE_SL_EVENT_THREAD_POOL
                value: "true"
              - name: JOB_NAME
                value: "google-feed-job"
              - name: NEW_RELIC_INSTRUMENTATION_CONCURRENT_RUBY
                value: "disabled"
              - name: OPENTELEMETRY_SERVICE_NAME
                value: "api.shoplineapp.com-ArgoCronWorkflow (Preview)"
              - name: RAILS_LOG_TO_STDOUT
                value: "true"
            # Apply .Values.envFrom if it is set
            envFrom:
              - configMapRef:
                  name: api-env
              - configMapRef:
                  name: api-custom-env
              - secretRef:
                  name: eso-api-env
        # The template of the exit-handler, used if any .Values.exitNotifications config is set
        - name: exit-handler
          steps:
          - - name: Success
              template: success-handler
              when: "{{workflow.status}} == Succeeded"
            - name: Failure
              template: failure-handler
              when: "{{workflow.status}} != Succeeded"
        # The template of steps to run if the job completes successfully
        - name: success-handler
          steps:
          -
            # If .Values.exitNotifications.slackApp is set, the Slack app will be notified when the job is done
            - name: Notice-SlackApp-Succeeded
              template: notice-slack-app-succeeded
            # If .Values.exitNotifications.healthcheckIo is set, Healthcheck IO will be notified when the job is done
            - name: Notice-HealthcheckIo-Succeeded
              template: notice-healthcheck-io-succeeded
        # The template of steps to run if the job fails
        - name: failure-handler
          steps:
          -
            # If .Values.exitNotifications.slackApp is set, the Slack app will be notified when the job fails
            - name: Notice-SlackApp-Failed
              template: notice-slack-app-failed
            # If .Values.exitNotifications.newRelic is set, New Relic will be notified when the job fails
            - name: Notice-NewRelic-Failed
              template: notice-newrelic-failed
            # If .Values.exitNotifications.healthcheckIo is set, Healthcheck IO will be notified when the job fails
            - name: Notice-HealthcheckIo-Failed
              template: notice-healthcheck-io-failed
        # If .Values.exitNotifications.slackApp is set, Slack app notification template will be loaded
        - name: notice-slack-app-succeeded
          container:
            image: 332947256684.dkr.ecr.ap-southeast-1.amazonaws.com/curlimages/curl:8.4.0
            command: [sh, -c]
            args: [
              "curl -X POST --tlsv1.2 --retry 3 --retry-all-errors --fail -H 'Content-type: application/json' --data '{\"attachments\": [
                {
                  \"fallback\": \"Workflow Succeeded - {{workflow.name}}\",
                  \"color\": \"#18be52\",
                  \"blocks\": [
                    {
                      \"type\": \"header\",
                      \"text\": {
                        \"type\": \"plain_text\",
                        \"text\": \"Workflow Succeeded - {{workflow.name}}\",
                        \"emoji\": true
                      }
                    },
                    {
                      \"type\": \"divider\"
                    },
                    {
                      \"type\": \"section\",
                      \"fields\": [
                        {
                          \"type\": \"mrkdwn\",
                          \"text\": \"*Cluster*\\nec-eks-db-preview\"
                        },
                        {
                          \"type\": \"mrkdwn\",
                          \"text\": \"*Namespace*\\n{{workflow.namespace}}\"
                        },
                        {
                          \"type\": \"mrkdwn\",
                          \"text\": \"*Scheduled Time*\\n{{workflow.scheduledTime}}\"
                        },
                        {
                          \"type\": \"mrkdwn\",
                          \"text\": \"*Duration*\\n{{workflow.duration}} sec\"
                        }
                      ]
                    }
                    ,
                    {
                      \"type\": \"actions\",
                      \"elements\": [
                        {
                          \"type\": \"button\",
                          \"text\": {
                            \"type\": \"plain_text\",
                            \"text\": \"Argo Dashboard\"
                          },
                          \"url\": \"https://argo-workflows-preview.shopline.io/workflows/{{workflow.namespace}}/{{workflow.name}}?tab=workflow\"
                        }
                        ,{
                          \"type\": \"button\",
                          \"text\": {
                            \"type\": \"plain_text\",
                            \"text\": \"App Logs\"
                          },
                          \"url\": \"https://grafana-preview.shopline.io/explore?orgId=1&left=%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22,%22expr%22:%22%7Bcluster%3D%5C%22ec-eks-db%5C%22,%20namespace%3D%5C%22{{workflow.namespace}}%5C%22,%20pod%3D~%5C%22{{workflow.name}}.%2A%5C%22%7D%20%7C%3D%20%60%60%22,%22queryType%22:%22range%22%7D%5D,%22range%22:%7B%22from%22:%22{{=sprig.mul(workflow.creationTimestamp.s,1000)}}%22,%22to%22:%22{{=sprig.add(sprig.mul(workflow.creationTimestamp.s,1000),sprig.mul(sprig.ceil(workflow.duration),1500))}}%22%7D%7D\"
                        }
                        ,{
                          \"type\": \"button\",
                          \"text\": {
                            \"type\": \"plain_text\",
                            \"text\": \"K8s Events\"
                          },
                          \"url\": \"https://grafana-preview.shopline.io/explore?orgId=1&left=%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22,%22expr%22:%22%7Bapp%3D%5C%22kubernetes-event-exporter%5C%22%7D%20%7C%3D%20%60{{workflow.name}}%60%22,%22queryType%22:%22range%22%7D%5D,%22range%22:%7B%22from%22:%22{{=sprig.mul(workflow.creationTimestamp.s,1000)}}%22,%22to%22:%22{{=sprig.add(sprig.mul(workflow.creationTimestamp.s,1000),sprig.mul(sprig.ceil(workflow.duration),1500))}}%22%7D%7D\"
                        }
                        ,{
                          \"type\": \"button\",
                          \"text\": {
                            \"type\": \"plain_text\",
                            \"text\": \"Cron jobs Dashboard\"
                          },
                          \"url\": \"https://grafana-preview.shopline.io/d/C2JiGi64k/api-cronjobs-memory?orgId=1&var-cluster=ec-eks-db&var-namespace={{workflow.namespace}}&var-job={{=sprig.join('-',sprig.initial(sprig.regexSplit('-',workflow.name,-1)))}}&var-container=main&from={{=sprig.mul(workflow.creationTimestamp.s,1000)}}&to={{=sprig.add(sprig.mul(workflow.creationTimestamp.s,1000),sprig.mul(sprig.ceil(workflow.duration),1500))}}\"
                        }
                        ,{
                          \"type\": \"button\",
                          \"text\": {
                            \"type\": \"plain_text\",
                            \"text\": \"OpenTelemetry Trace\"
                          },
                          \"url\": \"https://grafana-preview.shopline.io/explore?left=%7B%22datasource%22:%22tempo%22,%22queries%22:%5B%7B%22queryType%22:%22nativeSearch%22,%22refId%22:%22A%22,%22limit%22:20,%22serviceName%22:%22api.shoplineapp.com-ArgoCronWorkflow%20%28Production%29%22,%22search%22:%22workflow.name%3D%5C%22{{workflow.name}}%5C%22%22%7D%5D,%22range%22:%7B%22from%22:%22{{=sprig.mul(workflow.creationTimestamp.s,1000)}}%22,%22to%22:%22{{=sprig.add(sprig.mul(workflow.creationTimestamp.s,1000),sprig.mul(sprig.ceil(workflow.duration),1500))}}%22%7D%7D\"
                        }
                      ]
                    }
                  ]
                }
              ]}'
             https://hooks.slack.com/services/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"
            ]
        - name: notice-slack-app-failed
          container:
            image: 332947256684.dkr.ecr.ap-southeast-1.amazonaws.com/curlimages/curl:8.4.0
            command: [sh, -c]
            args: [
              "curl -X POST --tlsv1.2 --retry 3 --retry-all-errors --fail -H 'Content-type: application/json' --data '{\"attachments\": [
                {
                  \"fallback\": \"Workflow Failed - {{workflow.name}}\",
                  \"color\": \"#E01E5A\",
                  \"blocks\": [
                    {
                      \"type\": \"header\",
                      \"text\": {
                        \"type\": \"plain_text\",
                        \"text\": \"Workflow Failed - {{workflow.name}}\",
                        \"emoji\": true
                      }
                    },
                    {
                      \"type\": \"divider\"
                    },
                    {
                      \"type\": \"section\",
                      \"fields\": [
                        {
                          \"type\": \"mrkdwn\",
                          \"text\": \"*Cluster*\\nec-eks-db-preview\"
                        },
                        {
                          \"type\": \"mrkdwn\",
                          \"text\": \"*Namespace*\\n{{workflow.namespace}}\"
                        },
                        {
                          \"type\": \"mrkdwn\",
                          \"text\": \"*Scheduled Time*\\n{{workflow.scheduledTime}}\"
                        },
                        {
                          \"type\": \"mrkdwn\",
                          \"text\": \"*Duration*\\n{{workflow.duration}} sec\"
                        }
                      ]
                    }
                    ,{
                      \"type\": \"actions\",
                      \"elements\": [
                        {
                          \"type\": \"button\",
                          \"text\": {
                            \"type\": \"plain_text\",
                            \"text\": \"Argo Dashboard\"
                          },
                          \"url\": \"https://argo-workflows-preview.shopline.io/workflows/{{workflow.namespace}}/{{workflow.name}}?tab=workflow\"
                        }
                        ,{
                          \"type\": \"button\",
                          \"text\": {
                            \"type\": \"plain_text\",
                            \"text\": \"App Logs\"
                          },
                          \"url\": \"https://grafana-preview.shopline.io/explore?orgId=1&left=%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22,%22expr%22:%22%7Bcluster%3D%5C%22ec-eks-db%5C%22,%20namespace%3D%5C%22{{workflow.namespace}}%5C%22,%20pod%3D~%5C%22{{workflow.name}}.%2A%5C%22%7D%20%7C%3D%20%60%60%22,%22queryType%22:%22range%22%7D%5D,%22range%22:%7B%22from%22:%22{{=sprig.mul(workflow.creationTimestamp.s,1000)}}%22,%22to%22:%22{{=sprig.add(sprig.mul(workflow.creationTimestamp.s,1000),sprig.mul(sprig.ceil(workflow.duration),1500))}}%22%7D%7D\"
                        }
                        ,{
                          \"type\": \"button\",
                          \"text\": {
                            \"type\": \"plain_text\",
                            \"text\": \"K8s Events\"
                          },
                          \"url\": \"https://grafana-preview.shopline.io/explore?orgId=1&left=%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22,%22expr%22:%22%7Bapp%3D%5C%22kubernetes-event-exporter%5C%22%7D%20%7C%3D%20%60{{workflow.name}}%60%22,%22queryType%22:%22range%22%7D%5D,%22range%22:%7B%22from%22:%22{{=sprig.mul(workflow.creationTimestamp.s,1000)}}%22,%22to%22:%22{{=sprig.add(sprig.mul(workflow.creationTimestamp.s,1000),sprig.mul(sprig.ceil(workflow.duration),1500))}}%22%7D%7D\"
                        }
                        ,{
                          \"type\": \"button\",
                          \"text\": {
                            \"type\": \"plain_text\",
                            \"text\": \"Cron jobs Dashboard\"
                          },
                          \"url\": \"https://grafana-preview.shopline.io/d/C2JiGi64k/api-cronjobs-memory?orgId=1&var-cluster=ec-eks-db&var-namespace={{workflow.namespace}}&var-job={{=sprig.join('-',sprig.initial(sprig.regexSplit('-',workflow.name,-1)))}}&var-container=main&from={{=sprig.mul(workflow.creationTimestamp.s,1000)}}&to={{=sprig.add(sprig.mul(workflow.creationTimestamp.s,1000),sprig.mul(sprig.ceil(workflow.duration),1500))}}\"
                        }
                        ,{
                          \"type\": \"button\",
                          \"text\": {
                            \"type\": \"plain_text\",
                            \"text\": \"OpenTelemetry Trace\"
                          },
                          \"url\": \"https://grafana-preview.shopline.io/explore?left=%7B%22datasource%22:%22tempo%22,%22queries%22:%5B%7B%22queryType%22:%22nativeSearch%22,%22refId%22:%22A%22,%22limit%22:20,%22serviceName%22:%22api.shoplineapp.com-ArgoCronWorkflow%20%28Production%29%22,%22search%22:%22workflow.name%3D%5C%22{{workflow.name}}%5C%22%22%7D%5D,%22range%22:%7B%22from%22:%22{{=sprig.mul(workflow.creationTimestamp.s,1000)}}%22,%22to%22:%22{{=sprig.add(sprig.mul(workflow.creationTimestamp.s,1000),sprig.mul(sprig.ceil(workflow.duration),1500))}}%22%7D%7D\"
                        }
                      ]
                    }
                  ]
                }
              ]}'
              https://hooks.slack.com/services/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"
            ]
        # If .Values.exitNotifications.newRelic is set, New Relic notification template will be loaded
        - name: notice-newrelic-failed
          container:
            image: '332947256684.dkr.ecr.ap-southeast-1.amazonaws.com/newrelic-agent:1.0.2'
            env:
              - name: NEWRELIC_APP_NAME
                value: "Argo Cron Workflow (Preview)"
              - name: FUNCTION_NAME
                value: "google-feed-job"
              - name: NEWRELIC_LICENSE_KEY
                value: "8c458eaaf3702d01fea551c57d4f9d4a8a374f8e"
              - name: ARGO_WORKFLOW_ERROR
                value: "{{workflow.failures}}"
              - name: ARGO_WORKFLOW_NAME
                value: "{{workflow.name}}"
              - name: ARGO_WORKFLOW_STATUS
                value: "{{workflow.status}}"
              - name: ARGO_WORKFLOW_DURATION
                value: "{{workflow.duration}}"
        # If .Values.exitNotifications.healthcheckIo is set, Healthcheck IO notification template will be loaded
        - name: notice-healthcheck-io-succeeded # For the cronjob health check; schedules may differ, so each cronjob has a different uuid
          container:
            image: 332947256684.dkr.ecr.ap-southeast-1.amazonaws.com/curlimages/curl:8.4.0
            command: [ "sh", "-c" ]
            args:
              - curl https://hc-ping.com./11b635d9-a389-4a76-a51a-f9d28c9daab3 --tlsv1.2 --retry 3 --retry-all-errors --fail
        - name: notice-healthcheck-io-failed
          container:
            image: 332947256684.dkr.ecr.ap-southeast-1.amazonaws.com/curlimages/curl:8.4.0
            command: [ "sh", "-c" ]
            args:
              - curl https://hc-ping.com./11b635d9-a389-4a76-a51a-f9d28c9daab3/fail --tlsv1.2 --retry 3 --retry-all-errors --fail
      ttlStrategy:
        # The number of seconds the pod can stay alive after the job is done
        secondsAfterCompletion: 86400

      # The mechanism for garbage collecting completed pods. The default value is "OnPodCompletion"
      podGC:
        strategy: OnPodCompletion
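
For context, the three `businessid` labels in the diff above could be produced by template fragments along these lines. This is only a sketch: the `.Values.businessid` key and its empty default are assumptions inferred from the rendered output, not taken from the chart source.

```yaml
# templates/cronjob.yaml (sketch, not the chart's actual template)
# Assumes a hypothetical top-level value `.Values.businessid`.
metadata:
  labels:
    # `default ""` keeps the label an empty string when the value is unset
    businessid: {{ .Values.businessid | default "" | quote }}
spec:
  workflowSpec:
    podMetadata:
      labels:
        businessid: {{ .Values.businessid | default "" | quote }}
    workflowMetadata:
      labels:
        businessid: {{ .Values.businessid | default "" | quote }}
```

With no value supplied, each line renders as `businessid: ""`, which matches the preview diff.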

run26kimo previously approved these changes May 2, 2024
@@ -1,6 +1,6 @@
 apiVersion: v1
 description: Helm chart with simple cronjob template
 name: cronjob
-version: 0.7.5
+version: 0.7.6

This should be a minor version bump; this PR is a feature, not a patch.

@acgs771126 acgs771126 left a comment

The chart version should bump the minor version.
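
Assuming the chart follows semantic versioning, a backward-compatible feature bumps the minor component and resets the patch component. Starting from 0.7.5, the Chart.yaml would then look roughly like this (the 0.8.0 value illustrates the suggestion and is not taken from the PR):

```yaml
apiVersion: v1
description: Helm chart with simple cronjob template
name: cronjob
version: 0.8.0  # minor bump from 0.7.5 for a new feature; patch resets to 0
```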

@acgs771126 acgs771126 left a comment

LGTM
