Commit 909dc8a

Merge pull request #3187 from GSA-TTS/main
jadudm authored Jan 11, 2024
2 parents 68c5903 + e1352b4

Showing 25 changed files with 387 additions and 135 deletions.
101 changes: 0 additions & 101 deletions .github/workflows/database-backups.yml

This file was deleted.

@@ -1,5 +1,5 @@
---
-name: Historic Data Migration
+name: Historic Data Migration With Pagination
on:
  workflow_dispatch:
    inputs:
2 changes: 1 addition & 1 deletion .github/workflows/historic-data-migrator.yml
@@ -14,7 +14,7 @@ on:
      dbkeys:
        required: true
        type: string
-       description: Comma-separated list of report-IDs.
+       description: Comma-separated list of dbkeys.
      years:
        required: true
        type: string
15 changes: 15 additions & 0 deletions .github/workflows/scheduled-dev-snapshot.yml
@@ -0,0 +1,15 @@
---
name: Development Media Snapshot
on:
  schedule:
    # Invoke at 9 UTC every monday
    - cron: '0 9 * * 1'
  workflow_dispatch: null

jobs:
  trivy-scan:
    uses: ./.github/workflows/tar-s3-media.yml
    secrets: inherit
    with:
      environment: "dev"
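
Because `workflow_dispatch` is declared alongside the cron trigger, the snapshot can also be started by hand; for example, with the GitHub CLI (illustrative invocations, not part of this commit):

```bash
# Dispatch the dev snapshot wrapper manually instead of waiting for the Monday cron run.
gh workflow run scheduled-dev-snapshot.yml --ref main

# Or call the reusable workflow directly, supplying the environment input it requires.
gh workflow run tar-s3-media.yml -f environment=dev
```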

15 changes: 15 additions & 0 deletions .github/workflows/scheduled-dev-sync.yml
@@ -0,0 +1,15 @@
---
name: Sync Dev Media Files
on:
  schedule:
    # Invoke every 2 hours
    - cron: '0 */2 * * *'
  workflow_dispatch: null

jobs:
  trivy-scan:
    uses: ./.github/workflows/sync-s3-media.yml
    secrets: inherit
    with:
      environment: "dev"

15 changes: 15 additions & 0 deletions .github/workflows/scheduled-production-snapshot.yml
@@ -0,0 +1,15 @@
---
name: Production Media Snapshot
on:
  #schedule:
    # Invoke at 9 UTC every monday
    #- cron: '0 9 * * 1'
  workflow_dispatch: null

jobs:
  trivy-scan:
    uses: ./.github/workflows/tar-s3-media.yml
    secrets: inherit
    with:
      environment: "production"

15 changes: 15 additions & 0 deletions .github/workflows/scheduled-production-sync.yml
@@ -0,0 +1,15 @@
---
name: Sync Production Media Files
on:
  #schedule:
    # Invoke every 2 hours
    # - cron: '0 */2 * * *'
  workflow_dispatch: null

jobs:
  trivy-scan:
    uses: ./.github/workflows/sync-s3-media.yml
    secrets: inherit
    with:
      environment: "production"

64 changes: 64 additions & 0 deletions .github/workflows/sync-s3-media.yml
@@ -0,0 +1,64 @@
---
name: Perform Media and Database Backups
on:
  workflow_dispatch:
    inputs:
      environment:
        required: true
        type: string
  workflow_call:
    inputs:
      environment:
        required: true
        type: string

jobs:
  backup-media:
    name: Perform Media Backups
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    env:
      space: ${{ inputs.environment }}
    steps:
      - name: Backup media files
        uses: cloud-gov/cg-cli-tools@main
        with:
          cf_username: ${{ secrets.CF_USERNAME }}
          cf_password: ${{ secrets.CF_PASSWORD }}
          cf_org: gsa-tts-oros-fac
          cf_space: ${{ env.space }}
          command: cf run-task gsa-fac -k 2G -m 2G --name s3_sync --command "./s3-sync.sh"

  backup-dev-database:
    if: ${{ inputs.environment == 'dev' }}
    name: Perform Dev Database Backups
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    env:
      space: ${{ inputs.environment }}
    steps:
      - name: Backup Dev Database
        uses: cloud-gov/cg-cli-tools@main
        with:
          cf_username: ${{ secrets.CF_USERNAME }}
          cf_password: ${{ secrets.CF_PASSWORD }}
          cf_org: gsa-tts-oros-fac
          cf_space: ${{ env.space }}
          command: cf run-task gsa-fac -k 2G -m 2G --name pg_backup --command "./backup_database.sh ${{ env.space }}"

  backup-prod-database:
    if: ${{ inputs.environment == 'production' }}
    name: Perform Prod Database Backups
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    env:
      space: ${{ inputs.environment }}
    steps:
      - name: Backup the database (Prod Only)
        uses: cloud-gov/cg-cli-tools@main
        with:
          cf_username: ${{ secrets.CF_USERNAME }}
          cf_password: ${{ secrets.CF_PASSWORD }}
          cf_org: gsa-tts-oros-fac
          cf_space: ${{ env.space }}
          command: cf run-task gsa-fac -k 2G -m 2G --name pg_backup --command "./backup_database.sh ${{ env.space }}"
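
The jobs above only launch tasks inside the running gsa-fac application; the actual copy logic lives in scripts such as `./s3-sync.sh` and `./backup_database.sh`, which are not part of this diff. A rough sketch of what an s3-sync-style script could look like (the bucket lookup and names are assumptions, not taken from this commit):

```bash
#!/bin/bash
# Hypothetical sketch of an s3-sync.sh-style script: mirror the app's media
# bucket into a backup bucket with the AWS CLI. Not the script from this commit.
set -euo pipefail

# Assumed: bucket names come from the bound S3 services in VCAP_SERVICES.
SOURCE_BUCKET=$(echo "$VCAP_SERVICES" | jq -r '.s3[0].credentials.bucket')
BACKUP_BUCKET=$(echo "$VCAP_SERVICES" | jq -r '.s3[1].credentials.bucket')

# Copy every object that is new or changed since the last run.
aws s3 sync "s3://${SOURCE_BUCKET}" "s3://${BACKUP_BUCKET}" --no-progress
```
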
30 changes: 30 additions & 0 deletions .github/workflows/tar-s3-media.yml
@@ -0,0 +1,30 @@
---
name: Perform a tar snapshot of the media
on:
  workflow_dispatch:
    inputs:
      environment:
        required: true
        type: string
  workflow_call:
    inputs:
      environment:
        required: true
        type: string

jobs:
  backup-media:
    name: Perform Media Backups
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    env:
      space: ${{ inputs.environment }}
    steps:
      - name: Backup media files
        uses: cloud-gov/cg-cli-tools@main
        with:
          cf_username: ${{ secrets.CF_USERNAME }}
          cf_password: ${{ secrets.CF_PASSWORD }}
          cf_org: gsa-tts-oros-fac
          cf_space: ${{ env.space }}
          command: cf run-task gsa-fac -k 2G -m 2G --name s3_tar_snapshot --command "./s3-tar-snapshot.sh"
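
As with the sync workflow, `./s3-tar-snapshot.sh` is a script inside the gsa-fac app rather than part of this diff. A plausible shape for such a snapshot script (every name below is an assumption) is to pull the media down, tar it with a date stamp, and push the archive back to a backup bucket:

```bash
#!/bin/bash
# Hypothetical sketch of an s3-tar-snapshot.sh-style script. Not the script from this commit.
set -euo pipefail

SOURCE_BUCKET=$(echo "$VCAP_SERVICES" | jq -r '.s3[0].credentials.bucket')
BACKUP_BUCKET=$(echo "$VCAP_SERVICES" | jq -r '.s3[1].credentials.bucket')
STAMP=$(date +%Y%m%d)

# Download the media, archive it, and upload one dated tarball per run.
aws s3 sync "s3://${SOURCE_BUCKET}" ./media --no-progress
tar -czf "media-${STAMP}.tar.gz" ./media
aws s3 cp "media-${STAMP}.tar.gz" "s3://${BACKUP_BUCKET}/snapshots/media-${STAMP}.tar.gz"
```
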
8 changes: 4 additions & 4 deletions backend/census_historical_migration/README.md
@@ -20,7 +20,7 @@ This is implemented as a Django app to leverage existing management commands and
- fac_s3.py - Uploads folders or files to an S3 bucket.

```bash
-python manage.py fac_s3 fac-private-s3 --upload --src census_historical_migration/data
+python manage.py fac_s3 gsa-fac-private-s3 --upload --src census_historical_migration/data
```

- csv_to_postgres.py - Inserts data into Postgres tables using the contents of the CSV files in the S3 bucket. The first row of each file is assumed to have the column names (we convert them to lowercase). The name of the table is determined by examining the name of the file. The sample source files do not have delimiters for empty fields at the end of a line, so we assume these are nulls.
@@ -44,16 +44,16 @@ python manage.py csv_to_postgres --clean True
1. Download test Census data from https://drive.google.com/drive/folders/1TY-7yWsMd8DsVEXvwrEe_oWW1iR2sGoy into census_historical_migration/data folder.
NOTE: Never check in the census_historical_migration/data folder into GitHub.

-2. In the FAC/backend folder, run the following to load CSV files from census_historical_migration/data folder into fac-private-s3 bucket.
+2. In the FAC/backend folder, run the following to load CSV files from census_historical_migration/data folder into gsa-fac-private-s3 bucket.

```bash
docker compose run --rm web python manage.py fac_s3 \
-fac-private-s3 \
+gsa-fac-private-s3 \
--upload \
--src census_historical_migration/data
```

-3. In the FAC/backend folder, run the following to read the CSV files from fac-private-s3 bucket and load into Postgres.
+3. In the FAC/backend folder, run the following to read the CSV files from gsa-fac-private-s3 bucket and load into Postgres.

```bash
docker compose run --rm web python manage.py \
  # ...
```
@@ -19,7 +19,9 @@
from .report_id_generator import (
    xform_dbkey_to_report_id,
)

from ..transforms.xform_string_to_string import (
    string_to_string,
)

logger = logging.getLogger(__name__)

@@ -79,6 +81,15 @@ def setup_sac(user, audit_header):
    sac.auditee_certification = auditee_certification(audit_header)
    sac.auditor_certification = auditor_certification(audit_header)
    sac.data_source = settings.CENSUS_DATA_SOURCE
+
+    if general_info["user_provided_organization_type"] == "tribal":
+        suppression_code = string_to_string(audit_header.SUPPRESSION_CODE).upper()
+        sac.tribal_data_consent = {
+            "tribal_authorization_certifying_official_title": settings.GSA_MIGRATION,
+            "is_tribal_information_authorized_to_be_public": suppression_code != "IT",
+            "tribal_authorization_certifying_official_name": settings.GSA_MIGRATION,
+        }
+
    sac.save()
    logger.info("Created single audit checklist %s", sac)
