Update for mono repo #111

Merged on Mar 8, 2024 (126 commits; changes shown from 116 commits)

Commits
b4d5995
Deploy AIRFLOW to GHGC UAH
Apr 19, 2023
764658a
Deploy AIRFLOW to GHGC UAH
Apr 19, 2023
36a306f
Deploy AIRFLOW to GHGC UAH
Apr 19, 2023
dedae97
Deploy AIRFLOW to GHGC UAH
Apr 19, 2023
99d04be
Deploy AIRFLOW to GHGC UAH
Apr 19, 2023
d236065
Deploy AIRFLOW to GHGC UAH
Apr 19, 2023
1ddf415
Deploy AIRFLOW to GHGC UAH
Apr 19, 2023
01bf7b8
Setup gitflow
Apr 19, 2023
a813dff
Provision RDS cluster
May 1, 2023
1bd28ea
Update github actions and update bucket permissions (#2)
slesaad May 9, 2023
9f200b1
Merge branch 'dev' of https://github.com/NASA-IMPACT/ghgc-data-airflo…
May 10, 2023
abbba97
Fix CICD
May 10, 2023
e8c9aeb
Fix CICD
May 10, 2023
504cb2e
Fix CICD
May 10, 2023
95fac5e
Fix CICD
May 10, 2023
6ea7a84
Fix CICD
May 10, 2023
2886922
Fix CICD
May 10, 2023
661f506
Fix CICD
May 10, 2023
50a8626
Fix CICD
May 10, 2023
bd34c37
Fix CICD
May 10, 2023
7a2ad14
Fix CICD
May 10, 2023
78a649c
Fix CICD and remove optional lines
May 10, 2023
6c4332f
Fix CICD and remove optional lines with permission
May 10, 2023
66736aa
Fix CICD and remove optional lines with vpc
May 10, 2023
c8e4696
Fix CICD and remove optional lines with vpc and subnet
May 10, 2023
fc5a40e
Fix CICD fix vpc and subnet
May 10, 2023
69a48ec
Fix CICD fix variables
May 10, 2023
c9e2afe
Fix optional var references in dags (#3)
slesaad May 25, 2023
e6c76ef
Fix containerOverrides expected datatype
slesaad May 25, 2023
e675b71
Uncomment subnets
slesaad May 25, 2023
ffe955b
Fix GDAL session for non assume role
slesaad May 25, 2023
9535729
Add some logging
slesaad May 25, 2023
cb3f2f5
Throw exception if stac build fails
slesaad May 25, 2023
bd176d8
Update requirements
slesaad May 25, 2023
7009c1a
Upgrading the TF base module to v1.1.4
May 30, 2023
37a1bb8
Add reference to ghc-deploy-dev-mcp branch
Caden-Helbling Jun 12, 2023
ec5c619
Update subnet tagname (#4)
slesaad Jun 14, 2023
ebb1309
Update env files (#5)
slesaad Jun 14, 2023
6b30c83
Update env files (#6)
slesaad Jun 14, 2023
6f1b34a
Update boundary policy name (#7)
slesaad Jun 14, 2023
a8fa230
Update boundary policy name (#8)
slesaad Jun 14, 2023
92f7aa5
Try deployment to staging
amarouane-ABDELHAK Jun 15, 2023
b7073bc
Try deployment to staging
amarouane-ABDELHAK Jun 15, 2023
ba3e7d2
Merge pull request #9 from NASA-IMPACT/main
amarouane-ABDELHAK Jun 15, 2023
5234648
Deploy MWAA
amarouane-ABDELHAK Jun 26, 2023
2e3032a
Deploy MWAA
amarouane-ABDELHAK Jun 26, 2023
f687e1f
Delete .envs
amarouane-ABDELHAK Jun 26, 2023
d108471
Add vector-secret-name
amarouane-ABDELHAK Jun 26, 2023
21673c5
Add MWAA variables
amarouane-ABDELHAK Jun 26, 2023
3785fcc
Add comma for airflow vars:
amarouane-ABDELHAK Jun 26, 2023
b1e90ca
Finalizing the deployment
amarouane-ABDELHAK Jul 5, 2023
67af4b3
Add ingestor url
amarouane-ABDELHAK Jul 5, 2023
8e5f360
Add deploy debugger
amarouane-ABDELHAK Jul 5, 2023
2737807
Add deploy debugger
amarouane-ABDELHAK Jul 5, 2023
086b9cb
deploy with correct ingestor
amarouane-ABDELHAK Jul 5, 2023
d7ca801
Change stack name
amarouane-ABDELHAK Jul 6, 2023
2778d97
Make permission boundaries name optional
amarouane-ABDELHAK Jul 17, 2023
f8f88ee
Make permission boundaries name optional
amarouane-ABDELHAK Jul 17, 2023
18a366f
Merge branch 'deploy_all_stack' into dev
amarouane-ABDELHAK Jul 26, 2023
1312d3f
Fix permission boundaries name
amarouane-ABDELHAK Jul 26, 2023
2512984
Remove public access for S3
amarouane-ABDELHAK Jul 26, 2023
f92c171
Merge changes from upstream (#11)
slesaad Aug 1, 2023
5ef139b
Update `action.yml`
slesaad Aug 1, 2023
3456a8e
Fix env vars
slesaad Aug 1, 2023
2efb4c8
Source env var
slesaad Aug 1, 2023
5ffa815
Fix prefix env
slesaad Aug 1, 2023
8351234
Update action to deploy
slesaad Aug 1, 2023
ae4a2d7
Update stac_ingestor_url env var
slesaad Aug 1, 2023
1f1778c
Update exported env vars
slesaad Aug 1, 2023
e173407
Update dummy dag
slesaad Aug 1, 2023
a7ad81c
Update dummy dag again
slesaad Aug 1, 2023
7ca5707
Delete submodule
slesaad Aug 1, 2023
8003455
Use mwaa_tf_module v1.1.5
slesaad Aug 1, 2023
8fb576b
Update rasterio version
slesaad Aug 1, 2023
c8524b7
Make assume_role optional in cogify/transfer
slesaad Aug 1, 2023
861cd91
Fix inconsistencies
slesaad Aug 2, 2023
e0bc3d2
Update build_stac readme
slesaad Aug 2, 2023
f214907
Fix rasterio open issue
slesaad Aug 2, 2023
a9a81e0
Fix iteration of dict items
slesaad Aug 2, 2023
af930d0
Update handler
slesaad Aug 2, 2023
e469267
Add newline
slesaad Aug 2, 2023
14ddd5c
Merge veda/dev into dev
slesaad Aug 4, 2023
7c92726
Merge pull request #12 from US-GHG-Center/dev
amarouane-ABDELHAK Aug 4, 2023
97a52a1
Fix sts session for non role assumption
slesaad Aug 4, 2023
e6b69a2
Update mwaa version
slesaad Aug 4, 2023
0cec9e1
Update `mwaa_tf_module` version, fix sts session bug (#13)
slesaad Aug 7, 2023
a9d7c73
Support for multi-asset items, upgrade mwaa_tf version
slesaad Aug 7, 2023
6d3a245
Make `ASSUME_ROLE_READ_ARN` optional
slesaad Aug 8, 2023
5b3b710
Update exception catching
slesaad Aug 8, 2023
6ac6bdf
`Str` param type for empty `CopySourceIfNoneMatch`
slesaad Aug 8, 2023
40c4e0f
Update permissions for specified buckets
slesaad Aug 8, 2023
245d0ef
Give access to all buckets starting with `ghgc-`
slesaad Aug 8, 2023
116c808
Fix data transfer copy error (#15)
slesaad Aug 9, 2023
1fcdd56
Merge branch 'production' into main
slesaad Aug 9, 2023
374cab0
Specify data asset roles, change active runs
slesaad Aug 23, 2023
41f88c6
Specify asset role, change dag max run for build_stac (#17)
slesaad Aug 23, 2023
fdfd528
Move proj and raster details into each asset (#19)
slesaad Sep 20, 2023
8d311b5
Update format for start and end datetime (#21)
slesaad Sep 20, 2023
537f43d
Merge branch 'main' into dev
slesaad Sep 20, 2023
83fc487
Add stac url output
amarouane-ABDELHAK Oct 30, 2023
3fda022
update lock
amarouane-ABDELHAK Oct 30, 2023
10491ca
update prefix
amarouane-ABDELHAK Oct 31, 2023
e3e294e
Add `VEDA_` prefix to some env vars
slesaad Feb 22, 2024
d55f818
Remove `ingestor_stack_name` reference
slesaad Feb 23, 2024
1df5bde
Update action versions
slesaad Feb 23, 2024
c62d2f7
Prepend `veda_` to var
slesaad Feb 23, 2024
0733999
Prepend missing `VEDA_`
slesaad Feb 23, 2024
427e419
Merge branch 'dev' into update-for-mono-repo
slesaad Feb 27, 2024
7911753
Merge branch 'dev' into update-for-mono-repo
slesaad Mar 5, 2024
27a9c6d
Create `jwks_url` from aws secrets, remove unused functions
slesaad Mar 5, 2024
70c0d92
Update env var names
slesaad Mar 6, 2024
916adc5
Fix multi-asset metadata generation for proj and raster bands
slesaad Mar 6, 2024
2707890
Update role session name
slesaad Mar 6, 2024
9f0a097
Remove remenant `veda_` prefix
slesaad Mar 6, 2024
518b0b5
Remove unnecessary branch from cicd
slesaad Mar 6, 2024
c386c91
`ghgc` -> `veda` in `sync-env.sh`
slesaad Mar 6, 2024
aa54162
Merge branch 'dev' into update-for-mono-repo
slesaad Mar 7, 2024
82b062d
Make vector_vpc optional, add permission boundary to lambda role
slesaad Mar 7, 2024
54d8ffd
Fix var name
slesaad Mar 7, 2024
554da6c
Make vector sg optional
slesaad Mar 7, 2024
d3f1b48
workflow api fixes, sts assume role
smohiudd Mar 8, 2024
e59d2a7
workflows and tf changes
smohiudd Mar 8, 2024
61447cc
fix task def name
smohiudd Mar 8, 2024
35c19d3
split programmatic and workflow auth clients in tf
smohiudd Mar 8, 2024
692eb27
Update env var name for cognito client secret
slesaad Mar 8, 2024
e749c4b
Update .env.example
slesaad Mar 8, 2024
8 changes: 6 additions & 2 deletions .env.example
@@ -9,5 +9,9 @@ STATE_BUCKET_NAME=<Fill Me>
 STATE_BUCKET_KEY=<Fill Me>
 STATE_DYNAMO_TABLE=<Fill Me>
 ASSUME_ROLE_ARNS='["<Read role>", "<Write role>"]'
-COGNITO_APP_SECRET=<Fill Me>
-STAC_INGESTOR_API_URL=<Fill Me>
+VEDA_WORKFLOWS_CLIENT_SECRET_ID=<Fill Me>
+VEDA_STAC_INGESTOR_API_URL=<Fill Me>
+VEDA_RASTER_URL=<Fill Me>
+VEDA_DATA_ACCESS_ROLE_ARN=<Fill Me>
+VEDA_STAC_URL=<Fill Me>
+WORKFLOW_ROOT_PATH=<Fill Me>
2 changes: 1 addition & 1 deletion .flake8
@@ -1,3 +1,3 @@
 [flake8]
 # taken from github actions ignore
-ignore = E1, E2, E3, E5, W1, W2, W3, W5
+ignore = E1, E2, E3, E5, W1, W2, W3, W5
64 changes: 64 additions & 0 deletions .github/actions/terraform-deploy/action.yml
@@ -0,0 +1,64 @@
+name: Deploy
+
+inputs:
+  env_aws_secret_name:
+    required: true
+    type: string
+  env-file:
+    type: string
+    default: ".env"
+  dir:
+    required: false
+    type: string
+    default: "."
+  script_path:
+    type: string
+  backend_stack_name:
+    type: string
+  auth_stack_name:
+    type: string
+
+runs:
+  using: "composite"
+
+  steps:
+    - name: Set up Python
+      uses: actions/setup-python@v5
+      with:
+        python-version: "3.10"
+        cache: "pip"
+
+    - name: Install python dependencies
+      shell: bash
+      working-directory: ${{ inputs.dir }}
+      run: pip install -r deploy_requirements.txt
+
+    - name: Get relevant environment configuration from aws secrets
+      shell: bash
+      working-directory: ${{ inputs.dir }}
+      env:
+        SECRET_SSM_NAME: ${{ inputs.env_aws_secret_name }}
+        AWS_DEFAULT_REGION: us-west-2
+      run: |
+        if [[ -z "${{ inputs.script_path }}" ]]; then
+          ./scripts/sync-env.sh ${{ inputs.env_aws_secret_name }}
+        else
+          echo ${{ inputs.auth_stack_name}}
+          echo ${{ inputs.backend_stack_name}}
+          python ${{ inputs.script_path }} --secret-id ${{ inputs.env_aws_secret_name }} --stack-names ${{ inputs.auth_stack_name}},${{ inputs.backend_stack_name}}
+          source .env
+          echo "PREFIX=data-pipeline-$STAGE" >> ${{ inputs.env-file }}
+          cat .env
+        fi
+
+    - name: Setup Terraform
+      uses: hashicorp/setup-terraform@v3
+      with:
+        terraform_version: 1.3.3
+
+    - name: Deploy
+      shell: bash
+      working-directory: ${{ inputs.dir }}
+      run: |
+        ./scripts/deploy.sh ${{ inputs.env-file }} <<< init
+        ./scripts/deploy.sh ${{ inputs.env-file }} <<< deploy
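The "Get relevant environment configuration from aws secrets" step in this action picks one of two commands depending on whether `script_path` is set. The sketch below restates that branch in Python purely as an illustration. The function names `build_env_sync_command` and `prefix_line` are hypothetical and do not exist in the repository; only the command shapes are taken from the action file.

```python
from typing import List, Optional


def build_env_sync_command(
    secret_name: str,
    script_path: Optional[str] = None,
    auth_stack_name: str = "",
    backend_stack_name: str = "",
) -> List[str]:
    """Return the argv that the env-sync step would run (illustrative only)."""
    if not script_path:
        # Mirrors: ./scripts/sync-env.sh <secret name>
        return ["./scripts/sync-env.sh", secret_name]
    # Mirrors: python <script> --secret-id <secret> --stack-names <auth>,<backend>
    stack_names = f"{auth_stack_name},{backend_stack_name}"
    return [
        "python",
        script_path,
        "--secret-id",
        secret_name,
        "--stack-names",
        stack_names,
    ]


def prefix_line(stage: str) -> str:
    """Line the script_path branch appends to the env file."""
    return f"PREFIX=data-pipeline-{stage}"
```

Note that the `PREFIX=...` line is only appended in the `script_path` branch; when `sync-env.sh` runs instead, the action relies on that script to produce the env file.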
136 changes: 61 additions & 75 deletions .github/workflows/cicd.yml
@@ -1,91 +1,77 @@
name: CI/CD
name: CICD 🚀

permissions:
id-token: write
contents: read

on:
push:
branches:
- main
- dev
- production
pull_request:
branches:
- main
- dev
- production
types: [ opened, reopened, edited, synchronize ]

jobs:
gitflow-enforcer:
runs-on: ubuntu-latest
steps:
- name: Check branch
run: |
if [[ $GITHUB_BASE_REF == "main" ]]; then
if [[ $GITHUB_HEAD_REF != "dev" && $GITHUB_HEAD_REF != "revert-"* ]]; then
echo "ERROR: You can only merge to 'main' from 'dev' or a 'revert-*' branch"
exit 1
fi
elif [[ $GITHUB_BASE_REF == "production" ]]; then
if [[ $GITHUB_HEAD_REF != "main" && $GITHUB_HEAD_REF != "revert-"* ]]; then
echo "ERROR: You can only merge to 'production' from 'main' or a 'revert-*' branch"
exit 1
fi
fi


run-linters:
name: Run linters
name: GitFlow Enforcer 👮‍
runs-on: ubuntu-latest
needs: gitflow-enforcer
steps:
- name: Check branch
if: github.base_ref == 'main' && github.head_ref != 'dev' || github.base_ref == 'production' && github.head_ref != 'main'
run: |
echo "ERROR: You can only merge to main from dev and to production from main"
exit 1

define-environment:
name: Set ✨ environment ✨
needs: gitflow-enforcer
runs-on: ubuntu-latest
steps:
- name: Check out Git repository
uses: actions/checkout@v2
- name: Set the environment based on the branch
id: define_environment
run: |
if [ "${{ github.ref }}" = "refs/heads/main" ]; then
echo "env_name=staging" >> $GITHUB_OUTPUT
elif [ "${{ github.ref }}" = "refs/heads/dev" ]; then
echo "env_name=development" >> $GITHUB_OUTPUT
elif [ "${{ github.ref }}" = "refs/heads/production" ]; then
echo "env_name=production" >> $GITHUB_OUTPUT
fi
- name: Print the environment
run: echo "The environment is ${{ steps.define_environment.outputs.env_name }}"

- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.10"
outputs:
env_name: ${{ steps.define_environment.outputs.env_name }}

- name: Install Python dependencies
run: pip install black flake8
deploy:
name: Deploy to ${{ needs.define-environment.outputs.env_name }} 🚀
runs-on: ubuntu-latest
if: ${{ needs.define-environment.outputs.env_name }}
needs: [gitflow-enforcer, define-environment]
environment: ${{ needs.define-environment.outputs.env_name }}
concurrency: ${{ needs.define-environment.outputs.env_name }}

- name: Run linters
uses: wearerequired/lint-action@v2
steps:
- name: Checkout
uses: actions/checkout@v3
with:
continue_on_error: false
black: true
flake8: true
flake8_args: "--ignore E1,E2,E3,E5,W1,W2,W3,W5" # black already handles formatting, this prevents conflicts

deploy-to-dev:
needs: run-linters
if: github.ref_name == 'dev'
concurrency: development
uses: "./.github/workflows/deploy.yml"
with:
environment: development
env-file: ".env_dev"
stage: "dev"
role-session-name: "veda-data-airflow-github-development-deployment"
aws-region: "us-west-2"

secrets: inherit

deploy-to-staging:
needs: run-linters
if: github.ref_name == 'main'
concurrency: staging
uses: "./.github/workflows/deploy.yml"
with:
environment: staging
env-file: ".env_staging"
stage: "staging"
role-session-name: "veda-data-airflow-github-staging-deployment"
aws-region: "us-west-2"
secrets: inherit

deploy-to-production:
needs: run-linters
if: github.ref_name == 'production'
concurrency: production
uses: "./.github/workflows/deploy.yml"
with:
environment: production
env-file: ".env_prod"
stage: "production"
role-session-name: "veda-data-airflow-github-production-deployment"
aws-region: "us-west-2"
lfs: "true"
submodules: "recursive"

- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v2
with:
role-to-assume: ${{ secrets.DEPLOYMENT_ROLE_ARN }}
role-session-name: "veda-airflow-github-${{ needs.define-environment.outputs.env_name }}-deployment"
aws-region: "us-west-2"

secrets: inherit
- name: Run deployment
uses: "./.github/actions/terraform-deploy"
with:
env_aws_secret_name: ${{ secrets.ENV_AWS_SECRET_NAME }}
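Two pieces of logic in this workflow are easy to misread in flattened form: the expression-based branch check on the "Check branch" step and the ref-to-environment mapping in `define-environment`. The following is a minimal Python restatement of both, offered as a sketch rather than anything the workflow runs. The names `violates_gitflow`, `environment_for`, and `REF_TO_ENV` are made up for illustration, and the grouping of the boolean expression is written out with explicit parentheses on the assumption that `&&` is evaluated before `||`.

```python
from typing import Optional

# Git ref -> deployment environment, as listed in the "define-environment" job.
REF_TO_ENV = {
    "refs/heads/main": "staging",
    "refs/heads/dev": "development",
    "refs/heads/production": "production",
}


def violates_gitflow(base_ref: str, head_ref: str) -> bool:
    """True when a pull request breaks the dev -> main -> production flow.

    Restates the `if:` expression on the "Check branch" step, with the
    grouping made explicit (assumption: `&&` is evaluated before `||`).
    """
    return (base_ref == "main" and head_ref != "dev") or (
        base_ref == "production" and head_ref != "main"
    )


def environment_for(ref: str) -> Optional[str]:
    """Environment name for a ref, or None when the ref is not deployable."""
    return REF_TO_ENV.get(ref)
```

On a plain push both refs are empty, so the check passes; a ref outside the three listed branches maps to no environment, which is what the `if:` guard on the `deploy` job tests for.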
89 changes: 0 additions & 89 deletions .github/workflows/deploy.yml

This file was deleted.

1 change: 0 additions & 1 deletion dags/requirements-constraints.txt
@@ -631,4 +631,3 @@ zipp==3.10.0
 zope.event==4.5.0
 zope.interface==5.5.1
 zstandard==0.19.0
-
2 changes: 1 addition & 1 deletion dags/veda_data_pipeline/groups/discover_group.py
@@ -20,7 +20,7 @@ def discover_from_s3_task(ti):
     config = ti.dag_run.conf
     # (event, chunk_size=2800, role_arn=None, bucket_output=None):
     MWAA_STAC_CONF = Variable.get("MWAA_STACK_CONF", deserialize_json=True)
-    read_assume_arn = Variable.get("ASSUME_ROLE_READ_ARN")
+    read_assume_arn = Variable.get("ASSUME_ROLE_READ_ARN", default_var=None)
     return s3_discovery_handler(
         event=config,
         role_arn=read_assume_arn,
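The point of adding `default_var` here is that a missing `ASSUME_ROLE_READ_ARN` no longer aborts the task. The sketch below imitates that lookup behaviour with a tiny stand-in class so it can run without Airflow installed. `FakeVariable` is hypothetical and is not the Airflow `Variable` class; it assumes, as in Airflow 2.x, that a lookup without a default raises `KeyError` when the key is absent.

```python
_NO_DEFAULT = object()  # sentinel: distinguishes "no default given" from None


class FakeVariable:
    """Tiny stand-in for a variable store; NOT the Airflow class."""

    _store = {"MWAA_STACK_CONF": "{}"}

    @classmethod
    def get(cls, key, default_var=_NO_DEFAULT):
        if key in cls._store:
            return cls._store[key]
        if default_var is _NO_DEFAULT:
            # Assumed behaviour: a missing key without a default is an error.
            raise KeyError(f"Variable {key} does not exist")
        return default_var


def read_role_arn():
    # With a default, a missing variable yields None instead of an exception.
    return FakeVariable.get("ASSUME_ROLE_READ_ARN", default_var=None)
```

A sentinel is used instead of `None` as the "no default" marker so that callers can still ask for `None` explicitly, which is exactly what the changed line does.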
2 changes: 1 addition & 1 deletion dags/veda_data_pipeline/groups/processing_group.py
@@ -59,7 +59,7 @@ def subdag_process():
     "environment": [
         {
             "name": "EXTERNAL_ROLE_ARN",
-            "value": Variable.get("ASSUME_ROLE_READ_ARN"),
+            "value": Variable.get("ASSUME_ROLE_READ_ARN", default_var=''),
         },
         {
             "name": "BUCKET",
6 changes: 4 additions & 2 deletions dags/veda_data_pipeline/groups/transfer_group.py
@@ -25,7 +25,7 @@ def cogify_choice(ti):
 def transfer_data(ti):
     """Transfer data from one S3 bucket to another; s3 copy, no need for docker"""
     config = ti.dag_run.conf
-    role_arn = Variable.get("ASSUME_ROLE_READ_ARN")
+    role_arn = Variable.get("ASSUME_ROLE_READ_ARN", default_var="")
     # (event, chunk_size=2800, role_arn=None, bucket_output=None):
     return data_transfer_handler(event=config, role_arn=role_arn)
 
@@ -66,7 +66,9 @@ def subdag_transfer():
     "environment": [
         {
             "name": "EXTERNAL_ROLE_ARN",
-            "value": Variable.get("ASSUME_ROLE_READ_ARN"),
+            "value": Variable.get(
+                "ASSUME_ROLE_READ_ARN", default_var=""
+            ),
         },
     ],
     "memory": 2048,
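The defaults for `ASSUME_ROLE_READ_ARN` differ across the group files in this diff: `None` in `discover_group.py` and an empty string in `processing_group.py` and `transfer_group.py`. A consumer that tests truthiness rather than `is None` treats both as "no role configured". The helper below is a hypothetical sketch of that pattern. `credentials_kwargs` and the injected `assume_role` callable are illustrative names, and the handlers' actual internals are not shown in this diff.

```python
from typing import Callable, Dict, Optional


def credentials_kwargs(
    role_arn: Optional[str],
    assume_role: Callable[[str], Dict[str, str]],
) -> Dict[str, str]:
    """Return client kwargs, assuming a role only when an ARN is configured."""
    if not role_arn:  # covers both None and ""
        return {}  # fall back to the ambient credentials
    return assume_role(role_arn)
```

Passing the role-assumption step in as a callable keeps the sketch free of any AWS dependency; in real code it would wrap whatever STS call the handler already makes.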
3 changes: 2 additions & 1 deletion dags/veda_data_pipeline/veda_process_raster_pipeline.py
@@ -20,7 +20,7 @@
     "discovery": "s3",
     "datetime_range": "month",
     "discovered": 33,
-    "payload": "s3://veda-uah-sit-mwaa-853558080719/events/geoglam/s3_discover_output_6c46b57a-7474-41fe-977a-19d164531cdc.json"
+    "payload": "s3://veda-uah-sit-mwaa-853558080719/events/geoglam/s3_discover_output_6c46b57a-7474-41fe-977a-.json"
 }
 ```
 - [Supports linking to external content](https://github.com/NASA-IMPACT/veda-data-pipelines)
@@ -36,6 +36,7 @@
     "payload": "<s3_uri_event_payload",
 }
 dag_args = {
+    "max_active_runs": 20,
     "start_date": pendulum.today("UTC").add(days=-1),
     "schedule_interval": None,
     "catchup": False,