Sk ez pp/hist data load #2730

Merged 65 commits into main from sk_ez_pp/hist_data_load on Nov 9, 2023

gsa-suk commented Nov 3, 2023

#2718

Adds the census_to_gsafac application, which provides two management commands:

  1. fac_s3 uploads Census CSV files from a local data folder to the fac-census-to-gsafac-s3 bucket.
  2. csv_to_postgres loads data from the CSV files under fac-census-to-gsafac-s3/data into the Postgres database.

The README describes how to run the commands.
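
The commit history for this PR mentions chunking and a chunksize argument for the load. As a rough illustration of that idea, and not the code in this PR, here is a minimal sketch that reads a CSV in fixed-size chunks and inserts each chunk in its own transaction. It uses only the Python standard library, with an in-memory SQLite database standing in for Postgres. The names iter_chunks and load_csv, the table name, and the column names are all hypothetical, and the sample data is read from a string rather than from the S3 bucket.

```python
import csv
import io
import sqlite3
from itertools import islice


def iter_chunks(rows, chunksize):
    """Yield lists of at most `chunksize` items from any iterable."""
    rows = iter(rows)
    while True:
        chunk = list(islice(rows, chunksize))
        if not chunk:
            return
        yield chunk


def load_csv(stream, conn, table, chunksize=10_000):
    """Insert CSV rows from a text stream into `table`, one chunk per transaction.

    The first CSV row is treated as the header and used for column names.
    Returns the number of data rows inserted.
    """
    reader = csv.reader(stream)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    # SQLite stand-in: untyped columns are enough for this illustration.
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
    total = 0
    for chunk in iter_chunks(reader, chunksize):
        with conn:  # commits the chunk on success, rolls it back on error
            conn.executemany(
                f'INSERT INTO "{table}" ({columns}) VALUES ({placeholders})',
                chunk,
            )
        total += len(chunk)
    return total


# Hypothetical sample data; the real Census files have their own columns.
sample = io.StringIO("DBKEY,AUDITYEAR\n100,2022\n101,2022\n102,2021\n")
demo_conn = sqlite3.connect(":memory:")
print(load_csv(sample, demo_conn, "census_gen", chunksize=2))  # prints 3
```

Loading in chunks keeps memory use bounded by the chunk size instead of the file size, and committing per chunk means a failure partway through leaves earlier chunks in place. Whether that trade-off suits the historical load is a design choice for the command itself.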

When the commands finish, the console displays the following messages:
[Screenshot: console output, 2023-11-07 1:11:30 PM]

PR checklist: submitters

  • Link to an issue if possible. If there’s no issue, describe what your branch does. Even if there is an issue, a brief description in the PR is still useful.
  • List any special steps reviewers have to follow to test the PR. For example, adding a local environment variable, creating a local test file, etc.
  • For extra credit, submit a screen recording like this one.
  • Make sure you’ve merged main into your branch shortly before creating the PR. (You should also be merging main into your branch regularly during development.)
  • Make sure you’ve accounted for any migrations. When you’re about to create the PR, bring up the application locally and then run git status | grep migrations. If there are any results, you probably need to add them to the branch for the PR. Your PR should have only one new migration file for each of the component apps, except in rare circumstances; you may need to delete some and re-run python manage.py makemigrations to reduce the number to one. (Also, unless in exceptional circumstances, your PR should not delete any migration files.)
  • Make sure that whatever feature you’re adding has tests that cover the feature. This includes test coverage to make sure that the previous workflow still works, if applicable.
  • Make sure the full-submission.cy.js Cypress test passes, if applicable.
  • Do manual testing locally. Our tests are not good enough yet to allow us to skip this step. If that’s not applicable for some reason, check this box.
  • Verify that no Git surgery was necessary, or, if it was necessary at any point, repeat the testing after it’s finished.
  • Once a PR is merged, keep an eye on it until it’s deployed to dev, and do enough testing on dev to verify that it deployed successfully, the feature works as expected, and the happy path for the broad feature area (such as submission) still works.

PR checklist: reviewers

  • Pull the branch to your local environment and run make docker-clean; make docker-first-run && docker compose up; then run docker compose exec web /bin/bash -c "python manage.py test"
  • Manually test out the changes locally, or check this box to verify that it wasn’t applicable in this case.
  • Check that the PR has appropriate tests. Look out for changes in HTML/JS/JSON Schema logic that may need to be captured in Python tests even though the logic isn’t in Python.
  • Verify that no Git surgery is necessary at any point (such as during a merge party), or, if it was, repeat the testing after it’s finished.

The larger the PR, the stricter we should be about these points.

github-actions bot commented Nov 3, 2023

Terraform plan for meta

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration
and found no differences, so no changes are needed.

Warning: Argument is deprecated

  with module.s3-backups.cloudfoundry_service_instance.bucket,
  on /tmp/terraform-data-dir/modules/s3-backups/s3/main.tf line 14, in resource "cloudfoundry_service_instance" "bucket":
  14:   recursive_delete = var.recursive_delete

Since CF API v3, recursive delete is always done on the cloudcontroller side.
This will be removed in future releases

✅ Plan applied in Deploy to Development and Management Environment #350

github-actions bot commented Nov 3, 2023

Terraform plan for dev

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration
and found no differences, so no changes are needed.

Warning: Argument is deprecated

  with module.dev.module.database.cloudfoundry_service_instance.rds,
  on /tmp/terraform-data-dir/modules/dev.database/database/main.tf line 14, in resource "cloudfoundry_service_instance" "rds":
  14:   recursive_delete = var.recursive_delete

Since CF API v3, recursive delete is always done on the cloudcontroller side.
This will be removed in future releases

(and 2 more similar warnings elsewhere)

✅ Plan applied in Deploy to Development and Management Environment #350

gsa-suk requested a review from jadudm November 3, 2023 22:37

github-actions bot commented Nov 3, 2023

File Coverage Missing
All files 86%
api/serializers.py 88% 177-178 183 188
api/test_views.py 95% 103
api/uei.py 88% 87 118-119 163 167-168
api/views.py 98% 195-196 334-335
audit/file_downloads.py 73% 35-53 81-83
audit/forms.py 47% 22-29 142-149
audit/intake_to_dissemination.py 92% 67-68 201-207 257
audit/models.py 84% 58 60 65 67 214 226-229 247 420 438-439 447 469 558-559 563 571 580 586
audit/test_commands.py 87%
audit/test_mixins.py 90% 112-113 117-119 184-185 189-191
audit/test_validators.py 95% 436 440 608-609 848 855 862 869
audit/test_views.py 95% 410-442 451-482 491-519
audit/test_workbooks_should_fail.py 88% 56 83-84 88
audit/test_workbooks_should_pass.py 90% 56 81
audit/utils.py 70% 13 21 33-35 38
audit/validators.py 92% 137 189 283-292 299-308 486-490 495-499 515-524
audit/views.py 31% 90-111 134-135 209-210 255-256 267-268 270-274 321-334 337-351 356-369 386-392 397-417 420-448 453-482 485-529 534-554 557-585 590-619 622-666 671-683 686-696 701-713 740-741 746-795 798-838 841-858
audit/cross_validation/additional_ueis.py 93% 33
audit/cross_validation/check_award_ref_declaration.py 90%
audit/cross_validation/check_award_reference_uniqueness.py 93%
audit/cross_validation/check_certifying_contacts.py 87%
audit/cross_validation/check_findings_count_consistency.py 91%
audit/cross_validation/check_ref_number_in_cap.py 90%
audit/cross_validation/check_ref_number_in_findings_text.py 90%
audit/cross_validation/errors.py 78% 30 69
audit/cross_validation/naming.py 93% 182
audit/cross_validation/submission_progress_check.py 95% 79
audit/cross_validation/tribal_data_sharing_consent.py 81% 33 36 40
audit/cross_validation/validate_general_information.py 93% 28-29
audit/fixtures/single_audit_checklist.py 55% 146-183 229-238
audit/intakelib/exceptions.py 71% 7-9 12
audit/intakelib/intermediate_representation.py 91% 27-28 73 91 129 162 200-203 212-213
audit/intakelib/mapping_audit_findings.py 97% 53
audit/intakelib/mapping_audit_findings_text.py 97% 52
audit/intakelib/mapping_federal_awards.py 93% 95
audit/intakelib/mapping_util.py 81% 21 25 29 99 104-105 114-120 130 145 150
audit/intakelib/checks/check_all_unique_award_numbers.py 79% 24
audit/intakelib/checks/check_cardinality_of_passthrough_names_and_ids.py 91%
audit/intakelib/checks/check_cluster_total.py 85% 49 65
audit/intakelib/checks/check_finding_prior_references_pattern.py 73% 33 43-44
audit/intakelib/checks/check_findings_grid_validation.py 84% 58
audit/intakelib/checks/check_has_all_the_named_ranges.py 84% 52
audit/intakelib/checks/check_is_a_workbook.py 69% 20
audit/intakelib/checks/check_loan_balance_entries.py 78% 22 39-40
audit/intakelib/checks/check_loan_balance_present.py 76% 27 36
audit/intakelib/checks/check_look_for_empty_rows.py 91% 18
audit/intakelib/checks/check_no_major_program_no_type.py 76% 18 27
audit/intakelib/checks/check_no_repeat_findings.py 76% 21 30
audit/intakelib/checks/check_other_cluster_names.py 81% 24 34
audit/intakelib/checks/check_passthrough_name_when_no_direct.py 88% 9 47
audit/intakelib/checks/check_sequential_award_numbers.py 76% 14 22
audit/intakelib/checks/check_show_ir.py 70% 8-14
audit/intakelib/checks/check_start_and_end_rows_of_all_columns_are_same.py 89% 14
audit/intakelib/checks/check_state_cluster_names.py 65% 23-24 34
audit/intakelib/checks/check_version_number.py 73% 21 31-32
audit/intakelib/checks/runners.py 95% 129
audit/intakelib/common/util.py 89% 21 38
audit/intakelib/transforms/xform_reformat_prior_references.py 55% 12-17
audit/intakelib/transforms/xform_rename_additional_notes_sheet.py 81% 14
audit/management/commands/load_fixtures.py 46% 39-45
audit/viewlib/submission_progress_view.py 89% 111 171-172
audit/viewlib/tribal_data_consent.py 34% 23-41 44-79
audit/viewlib/unlock_after_certification.py 57% 28-47 69-83
audit/viewlib/upload_report_view.py 26% 32-35 44 91-117 120-170 178-209
config/urls.py 71% 83
dissemination/models.py 99% 460
dissemination/search.py 51% 19 69-97 104-114
dissemination/views.py 62% 29-31 34-92 159 161 163
dissemination/migrations/0002_general_fac_accepted_date.py 47% 10-12
djangooidc/backends.py 78% 32 57-63
djangooidc/exceptions.py 66% 19 21 23 28
djangooidc/oidc.py 16% 32-35 45-51 64-70 92-149 153-199 203-226 230-275 280-281 286
djangooidc/views.py 80% 22 43 114
djangooidc/tests/common.py 96%
report_submission/forms.py 92% 35
report_submission/views.py 76% 83 215-216 218 240-241 260-261 287-396 399-409
report_submission/templatetags/get_attr.py 76% 8 11-14 18
support/admin.py 88% 76 79 84 91-97 100-102
support/cog_over.py 91% 30-33 93 145
support/models.py 89% 103-104
support/test_cog_over.py 98% 134-135 224
support/management/commands/seed_cog_baseline.py 98% 20-21
tools/update_program_data.py 89% 96
users/auth.py 95% 40-41
users/models.py 97% 51-52
users/fixtures/user_fixtures.py 91%

Minimum allowed coverage is 85%

Generated by 🐒 cobertura-action against 7396ca8

gsa-suk marked this pull request as ready for review November 8, 2023 18:15
Co-authored-by: Purvin Patel
gsa-suk added this pull request to the merge queue Nov 9, 2023
Merged via the queue into main with commit 59f35f5 Nov 9, 2023
13 checks passed
gsa-suk deleted the sk_ez_pp/hist_data_load branch November 9, 2023 00:49
tadhg-ohiggins pushed a commit that referenced this pull request Nov 9, 2023
* Initial commit

* Initial commit

* Initial commit

* Initial commit

* Initial commit

* Initial commit

* Initial commit

* Initial commit

* Removed comments

* Added census_to_gsafac

* Initial commit

* Initial commit

* Initial commit

* Initial commit

* Initial commit

* Added procedure to load test Census data to postgres

* Excluding workbook loader

* Excluding workbook loader

* Excluding load_raw

* Updates

* Added c2g-db

* Added c2g-db

* Replaced c2g with census_to_gsafac, renamed raw_to_pg.py as csv_to_postgres.py

* Replaced c2g with census_to_gsafac, renamed raw_to_pg.py as csv_to_postgres.py

* Replaced c2g with census_to_gsafac, renamed raw_to_pg.py as csv_to_postgres.py

* Replaced c2g with census_to_gsafac, renamed raw_to_pg.py as csv_to_postgres.py

* Replaced c2g with census_to_gsafac, renamed raw_to_pg.py as csv_to_postgres.py

* Replaced c2g with census_to_gsafac, renamed raw_to_pg.py as csv_to_postgres.py

* Replaced c2g with census_to_gsafac, renamed raw_to_pg.py as csv_to_postgres.py

* Apply suggestions from code review

Co-authored-by: Hassan D. M. Sambo

* Added census-to-gsafac database

* Replaced c2g with census-to-gsafac

* Fix linting

* Fix linting

* Fix linting

* Fix linting

* Fix linting

* Reformatted with black

* Reformatted with black

* Reformatted with black

* Updated S3 bucket name and filename

* Updated S3 bucket name and filename

* Updates

* Consolidated census_to_gsafac and census_historical_migration apps

* Django migration

* Telling mypy to ignore django migration files

* Linting

* Incorporated chunking capabilities from Alternative suggestion for loading data from S3 #2660

* Incorporated chunking capabilities from Alternative suggestion for loading data from S3 #2660

* Moving fac_s3.py to support/management/commands/

* Moving fac_s3.py to support/management/commands/

* Added load_data function

* Tested load_data

* Removed import botocore

* Removed import botocore

* refactored csv_to_postgres.py

Co-authored-by: Purvin Patel

* added chunk-size arguments

* added help comments for load_data

* Code cleaning

* Renamed chunk-size to chunksize

* Added chunksize argument

---------

Co-authored-by: SudhaUKumar
Co-authored-by: Purvin Patel
Co-authored-by: Hassan D. M. Sambo
Co-authored-by: Edward Zapata
6 participants