Deploy new gardener and k8s etl parser to prod #305

gfr10598 · 2020-08-05T21:49:29Z

It looks like prod mostly runs in us-central rather than the east region, so the new k8s cluster should probably be there too.

There is some documentation in the README.md file from January.

Steps:

Create the data-processing cluster, with appropriate networking options.
Create node-pools for etl and gardener.
Add a Cloud Build trigger for etl prod- tags.

gfr10598 · 2020-08-05T21:53:32Z

Based on info in README.md, added create-cluster.sh in new branch, which has all the gcloud commands to set up the network, subnet, firewall rules, cluster, and node-pools.
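For readers without access to the branch, the sequence described above might look roughly like the following sketch. This is not the actual create-cluster.sh: the region, network name, subnet range, firewall rule, and node counts are all assumptions chosen for illustration, and the script defaults to a dry run that only prints the gcloud commands.

```shell
#!/bin/bash
# Hypothetical sketch; NOT the actual create-cluster.sh from the branch.
# Every default below except the project and cluster name is an assumption.
set -eu

PROJECT="${PROJECT:-mlab-oti}"
REGION="${REGION:-us-central1}"                # assumed region
CLUSTER="${CLUSTER:-data-processing}"
NETWORK="${NETWORK:-data-processing}"          # assumed network/subnet name
SUBNET_RANGE="${SUBNET_RANGE:-10.100.0.0/16}"  # assumed CIDR
DRY_RUN="${DRY_RUN:-1}"                        # 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Shorthand for gcloud against the target project.
gc() {
  run gcloud --project="${PROJECT}" "$@"
}

create_all() {
  local pool
  # Custom-mode network with a single subnet in the target region.
  gc compute networks create "${NETWORK}" --subnet-mode=custom
  gc compute networks subnets create "${NETWORK}" \
    --network="${NETWORK}" --range="${SUBNET_RANGE}" --region="${REGION}"
  # Allow traffic between nodes inside the subnet (assumed rule).
  gc compute firewall-rules create "${NETWORK}-allow-internal" \
    --network="${NETWORK}" --allow=tcp,udp,icmp \
    --source-ranges="${SUBNET_RANGE}"
  # The cluster itself, attached to the network created above.
  gc container clusters create "${CLUSTER}" \
    --region="${REGION}" --network="${NETWORK}" --subnetwork="${NETWORK}" \
    --num-nodes=1
  # One node-pool each for etl and gardener.
  for pool in etl gardener; do
    gc container node-pools create "${pool}" \
      --cluster="${CLUSTER}" --region="${REGION}" --num-nodes=1
  done
}

create_all
```

Running it with DRY_RUN=0 would execute the commands instead of printing them; the real script presumably sets machine types, scopes, and labels that are omitted here.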

gfr10598 · 2020-08-06T13:34:03Z

Manually added cloud build trigger. Note that gcloud beta builds now supports creating triggers, too.

gcloud beta builds triggers create github \
  --repo-name=[REPO_NAME] \
  --repo-owner=[REPO_OWNER] \
  --branch-pattern=".*" \
  --build-config=[BUILD_CONFIG_FILE]
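Since the goal in this issue is a trigger on prod- tags rather than on branches, a tag-based variant of the command might look like the sketch below. The repo owner, repo name, tag regex, and build config path are assumptions for illustration, not values taken from the trigger that was actually created. It defaults to a dry run that prints the command.

```shell
#!/bin/bash
# Hypothetical sketch of a tag-based Cloud Build trigger.
# All defaults below are assumptions, not the values used in prod.
set -eu

REPO_OWNER="${REPO_OWNER:-m-lab}"                # assumed GitHub owner
REPO_NAME="${REPO_NAME:-etl}"                    # assumed repo name
TAG_PATTERN="${TAG_PATTERN:-prod-.*}"            # assumed regex for prod- tags
BUILD_CONFIG="${BUILD_CONFIG:-cloudbuild.yaml}"  # assumed config path
DRY_RUN="${DRY_RUN:-1}"                          # 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

create_trigger() {
  # --tag-pattern replaces --branch-pattern so only matching tags build.
  run gcloud beta builds triggers create github \
    --repo-owner="${REPO_OWNER}" \
    --repo-name="${REPO_NAME}" \
    --tag-pattern="${TAG_PATTERN}" \
    --build-config="${BUILD_CONFIG}"
}

create_trigger
```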

gfr10598 · 2020-08-06T23:59:24Z

bq --project=mlab-oti mk tmp_ndt
bq --project=mlab-oti mk raw_ndt

Need to add the table creation and schema updates to etl-schema.

gfr10598 · 2020-08-07T13:07:39Z

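-- No rows should match date > CURRENT_DATE(), so this presumably creates an
-- empty table that carries over the source schema plus the metro column.
CREATE OR REPLACE TABLE `mlab-oti.raw_ndt.ndt7`
PARTITION BY date CLUSTER BY metro
AS
SELECT date, REGEXP_EXTRACT(parser.ArchiveURL, ".-mlab[1-4]-([a-z]{3})[0-9]{2}.") AS metro, id, * EXCEPT(date, id)
FROM `mlab-sandbox.tmp_ndt.ndt7`
WHERE date > CURRENT_DATE()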

gfr10598 · 2020-08-07T13:08:59Z

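-- No rows should match date > CURRENT_DATE(), so this presumably creates an
-- empty table that carries over the source schema plus the metro column.
CREATE OR REPLACE TABLE `mlab-oti.raw_ndt.annotation`
PARTITION BY date CLUSTER BY metro
AS
SELECT date, REGEXP_EXTRACT(parser.ArchiveURL, ".-mlab[1-4]-([a-z]{3})[0-9]{2}.") AS metro, id, * EXCEPT(date, id)
FROM `mlab-sandbox.tmp_ndt.annotation`
WHERE date > CURRENT_DATE()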
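The two statements above differ only in the table name, so they could be generated from one template until this moves into etl-schema. The sketch below is one possible way to do that; the function name make_ddl and the dry-run switch are hypothetical, and it assumes the same source and destination projects as the queries above.

```shell
#!/bin/bash
# Hypothetical sketch: generate the CREATE TABLE statement per table name.
set -eu

SRC_PROJECT="${SRC_PROJECT:-mlab-sandbox}"
DST_PROJECT="${DST_PROJECT:-mlab-oti}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the SQL only, 0 = run it with bq

# Emit the DDL for one table. Backticks are escaped so the heredoc keeps
# them literally; BigQuery needs them around names containing dashes.
make_ddl() {
  local table="$1"
  cat <<EOF
CREATE OR REPLACE TABLE \`${DST_PROJECT}.raw_ndt.${table}\`
PARTITION BY date CLUSTER BY metro
AS
SELECT date, REGEXP_EXTRACT(parser.ArchiveURL, ".-mlab[1-4]-([a-z]{3})[0-9]{2}.") AS metro, id, * EXCEPT(date, id)
FROM \`${SRC_PROJECT}.tmp_ndt.${table}\`
WHERE date > CURRENT_DATE()
EOF
}

for table in ndt7 annotation; do
  if [ "${DRY_RUN}" = "1" ]; then
    make_ddl "${table}"
  else
    # Standard SQL is required for DDL statements.
    bq --project_id="${DST_PROJECT}" query --use_legacy_sql=false \
      "$(make_ddl "${table}")"
  fi
done
```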

gfr10598 · 2020-08-07T13:28:11Z

NOTE: BigQuery does not store data in us-central. This may mean that we will incur network egress charges for the BQ loads.

We should probably specify data_location=US for the BQ datasets to make them multi-regional. See https://cloud.google.com/bigquery/docs/locations#multi-regional-locations

The documentation is not crystal clear, so we should probably just look for these charges in billing.
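If the datasets are recreated with an explicit location, the commands might look like the sketch below. It assumes the two dataset names used earlier in this issue and the US multi-region; it prints the bq commands by default rather than running them.

```shell
#!/bin/bash
# Hypothetical sketch: create the datasets with an explicit location.
set -eu

PROJECT="${PROJECT:-mlab-oti}"
LOCATION="${LOCATION:-US}"   # assumed: the US multi-region
DRY_RUN="${DRY_RUN:-1}"      # 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

make_datasets() {
  local ds
  for ds in tmp_ndt raw_ndt; do
    # --location is a global bq flag, so it goes before the mk command.
    run bq --location="${LOCATION}" mk --dataset "${PROJECT}:${ds}"
  done
}

make_datasets
```

A dataset's location cannot be changed after creation, so the location of an existing dataset can be checked first with bq show --format=prettyjson mlab-oti:raw_ndt.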

autolabel bot added the review/triage label (Team should review and assign priority) on Aug 5, 2020

laiyi-ohlsen removed the review/triage label (Team should review and assign priority) on Sep 28, 2020

gfr10598 self-assigned this on Oct 28, 2020
