Commit
Add medium, large job types to GX job (#154)
* upgrade to v2 of ecr login action

* Revert "upgrade to v2 of ecr login action"

This reverts commit bcfef0e.

* Add medium, large job types to GX job
philerooski authored Nov 19, 2024
1 parent 8901de2 commit 86d8f2c
Showing 1 changed file with 45 additions and 3 deletions.
48 changes: 45 additions & 3 deletions templates/glue-job-run-great-expectations-on-parquet.j2
@@ -43,16 +43,45 @@ Parameters:
   DefaultWorkerType:
     Type: String
     Description: >-
-      Which worker type to use for this job.
+      Which worker type to use for most data types
     Default: 'Standard'
 
+  MediumJobWorkerType:
+    Type: String
+    Description: >-
+      Which worker type to use for this job.
+      Medium data types include: HealthKitV2Samples, HealthKitV2Electrocardiogram,
+      FitbitDailyData, FitbitSleepLogs
+    Default: 'G.4X'
+
+  LargeJobWorkerType:
+    Type: String
+    Description: >-
+      Which worker type to use for this job.
+      Large data types include: FitbitIntradayCombined
+    Default: 'G.8X'
+
   DefaultNumberOfWorkers:
     Type: Number
     Description: >-
-      How many DPUs to allot to this job. This parameter is not used for types
-      FitbitIntradayCombined and HealthKitV2Samples.
+      How many DPUs to allot for most data types.
     Default: 1
 
+  MediumJobNumberOfWorkers:
+    Type: Number
+    Description: >-
+      How many DPUs to allot to this job.
+      Medium data types include: HealthKitV2Samples, HealthKitV2Electrocardiogram,
+      FitbitDailyData, FitbitSleepLogs
+    Default: 4
+
+  LargeJobNumberOfWorkers:
+    Type: Number
+    Description: >-
+      How many DPUs to allot to this job.
+      Large data types include: FitbitIntradayCombined
+    Default: 8
+
   ExpectationSuiteKey:
     Type: String
     Description: The S3 key of the GX expectation file.
@@ -115,8 +144,21 @@ Resources:
       GlueVersion: !Ref GlueVersion
       MaxRetries: !Ref MaxRetries
       Name: !Sub "${Namespace}-{{ dataset["stackname_prefix"] }}-GreatExpectationsParquetJob"
+      {% if dataset["type"] == "FitbitIntradayCombined" -%}
+      WorkerType: !Ref LargeJobWorkerType
+      NumberOfWorkers: !Ref LargeJobNumberOfWorkers
+      {% elif (
+        dataset["type"] == "HealthKitV2Samples"
+        or dataset["type"] == "HealthKitV2Electrocardiogram"
+        or dataset["type"] == "FitbitDailyData"
+        or dataset["type"] == "FitbitSleepLogs"
+      ) -%}
+      WorkerType: !Ref MediumJobWorkerType
+      NumberOfWorkers: !Ref MediumJobNumberOfWorkers
+      {% else -%}
       WorkerType: !Ref DefaultWorkerType
       NumberOfWorkers: !Ref DefaultNumberOfWorkers
+      {%- endif %}
       Role: !Ref JobRole
       Timeout: !Ref TimeoutInMinutes
   {% endfor %}
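
Note: the size tiers introduced by this change can be sanity-checked outside the template. The following is a minimal Python sketch, not part of the commit, that mirrors the if/elif/else branch and reports which pair of CloudFormation parameters a dataset type would resolve to. The helper name `worker_parameters` and the constants `LARGE_TYPES` and `MEDIUM_TYPES` are hypothetical; the dataset type names and parameter names are taken from the diff.

```python
from typing import Tuple

# Dataset types per tier, copied from the parameter descriptions in the diff.
# The constant names are hypothetical and exist only for this sketch.
LARGE_TYPES = frozenset({"FitbitIntradayCombined"})
MEDIUM_TYPES = frozenset({
    "HealthKitV2Samples",
    "HealthKitV2Electrocardiogram",
    "FitbitDailyData",
    "FitbitSleepLogs",
})


def worker_parameters(dataset_type: str) -> Tuple[str, str]:
    """Return the (WorkerType, NumberOfWorkers) parameter names for a type.

    Mirrors the branch order in the template: large first, then medium,
    then the default tier for every other dataset type.
    """
    if dataset_type in LARGE_TYPES:
        return ("LargeJobWorkerType", "LargeJobNumberOfWorkers")
    if dataset_type in MEDIUM_TYPES:
        return ("MediumJobWorkerType", "MediumJobNumberOfWorkers")
    return ("DefaultWorkerType", "DefaultNumberOfWorkers")


if __name__ == "__main__":
    # "AnyOtherType" is a placeholder standing in for any unlisted type.
    for name in ("FitbitIntradayCombined", "FitbitSleepLogs", "AnyOtherType"):
        print(name, worker_parameters(name))
```

Keeping the tier membership in sets makes it a one-line change to move a dataset type between tiers; the template itself could express the same idea with Jinja's `in` operator, though the commit uses chained `==` comparisons.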