📍 [Epic] Migration data loading: final push #4240

Closed
6 of 8 tasks
danswick opened this issue Aug 28, 2024 · 12 comments

@danswick
Contributor

danswick commented Aug 28, 2024

What problems would we like to solve?

  1. Some historical data is still missing from our public dissemination data.
  2. All migration records (information about how we interpreted data while migrating) have yet to be loaded into a production database.
  3. SF-SAC data cannot be loaded using the same techniques we used for loading dissemination data. It's not clear how we can do so while maintaining a clean foreign key relationship between the user table and the other SF-SAC tables while the database is live and submissions are being added.

How do we know we’re done?

  1. Data from each of the data categories described below has been loaded into production and can be verified using verification methods to be determined and documented.
  2. Scripts and other data loading processes are documented in a single place and can be replicated later if needed.

Who will work on this epic?

@gsa-suk @rocheller123 @sambodeme @jadudm

Where are we now?

This table describes the status of each category of data-that-needs-loading.

| Field | Count | Load method | Status | Date completed |
| --- | --- | --- | --- | --- |
| singleauditchecklists (and friends) | ~3,000 | Utility | ✅ In prod | Late June ‘24 |
| dissemination_ | ~3,000 | Shell script | ✅ In prod | Late June ‘24 |
| singleauditchecklists (and friends) | ~300 | Utility | Backed up to Drive and need to be loaded. These are leftovers from the ~3,000 batch. ✅ In prod | 12/10/24 |
| dissemination_ | ~300 | Shell script | Backed up to Drive and need to be loaded. These are leftovers from the ~3,000 batch. ✅ In prod | 09/09/24 |
| dissemination_ | ~277,000 | | ✅ In prod | Late Jan ‘24 |
| singleauditchecklists (and friends) | ~277,000 | Shell script | Backed up to S3 and need to be loaded. ✅ In prod | 12/10/24 |
| migrationstatus | ~280,000 | Shell script | Backed up to Drive and need to be loaded. ✅ In prod | 09/09/24 |
| Historic_ tables | ? | ? | Backed up to S3 and need to be loaded. These are the Census tables. ✅ In prod | 09/12/24 |
| PDF-only | 212 | ? | Needs decision on how to handle. These are leftover reports that are just PDFs with no SF-SAC data. | |

Links! Tickets, documents, repos, etc. Things we’ve used to track work in recent months:

What needs to happen next

This list divides the work up into three categories: housekeeping, SF-SAC strategy and loading, and loading everything else. The project team should break the work up into more specific/detailed tickets if needed.

Next steps

  1. 9 of 9
    eng
    gsa-suk rocheller123
    sambodeme

Notes previously: https://docs.google.com/document/d/1wC8PC3_VeAz09-msIL_uIRO9a3tzz1nOgqnbMdWoczE/edit

@github-project-automation github-project-automation bot moved this to Triage in FAC Aug 28, 2024
@danswick danswick moved this from Triage to In Progress in FAC Aug 28, 2024
@gsa-suk
Contributor

gsa-suk commented Sep 9, 2024

09/09/24 - Load 318 dissemination records and migration status to Prod (Sudha, Hassan, Rochelle, Matt)
#4266

@gsa-suk
Contributor

gsa-suk commented Sep 12, 2024

09/12/24 - Load census historical tables to Prod - (Sudha, Matt, Rochelle)

#4279

@danswick
Contributor Author

  • Testing should be completed in the next couple of days.
  • Need to determine communications plan: how long to give notice after testing is complete.

@gsa-suk
Contributor

gsa-suk commented Oct 21, 2024

@gsa-suk
Contributor

gsa-suk commented Oct 21, 2024

Testing SAC load procedure with Prod data loaded into Staging:

  1. Updated audit_singleauditchecklist with user id from prod auth_user.
  2. Loaded 275K + 316 historic sacs to prod table in Staging.
  3. Working on exporting audit_singleauditchecklist with 330+K audits from Staging to the GFE.
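
Step 1 above amounts to rewriting the user foreign key on each historic row so that it points at a row that exists in the target `auth_user` table. The following is a minimal sketch of that idea in plain Python, not the procedure the team actually ran. The column name `submitted_by_id`, the email-keyed lookups, and the fallback user are all assumptions made for illustration.

```python
from typing import Iterable


def remap_user_ids(
    rows: Iterable[dict],
    source_users: dict[int, str],  # assumed export: source user id -> email
    prod_users: dict[str, int],    # assumed export: email -> prod auth_user id
    fallback_user_id: int,         # hypothetical placeholder user in prod
) -> list[dict]:
    """Return copies of rows whose user FK points at a prod auth_user id."""
    remapped = []
    for row in rows:
        # Look up the email behind the source id, then the prod id for it.
        email = source_users.get(row["submitted_by_id"])
        new_id = prod_users.get(email, fallback_user_id)
        # Copy the row so the input data is left untouched.
        remapped.append({**row, "submitted_by_id": new_id})
    return remapped


rows = [
    {"report_id": "A", "submitted_by_id": 7},
    {"report_id": "B", "submitted_by_id": 8},  # no matching prod user
]
result = remap_user_ids(
    rows,
    source_users={7: "a@example.gov"},
    prod_users={"a@example.gov": 101},
    fallback_user_id=1,
)
```

Rows whose user cannot be matched fall back to a single known id, so every row still satisfies the foreign key. Whether a fallback user is appropriate is a policy question this sketch does not answer.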

@gsa-suk
Contributor

gsa-suk commented Oct 22, 2024

10/22/24 -

  1. Copied 275K+316 reportfile and access data to Staging S3.
  2. Loaded 275K+316 reportfile and access to Staging.
  3. Ran tests to verify loaded data in Staging.
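
The thread does not describe the tests in step 3. One simple check of this kind, sketched here under assumptions, compares the identifiers in the source export against those found in the target table after loading. The use of `report_id` as the key and the function name are hypothetical.

```python
def compare_report_ids(expected_ids, loaded_ids):
    """Summarize differences between exported ids and ids found after loading."""
    expected, loaded = set(expected_ids), set(loaded_ids)
    return {
        "missing": sorted(expected - loaded),     # exported but not loaded
        "unexpected": sorted(loaded - expected),  # loaded but never exported
        "matched": len(expected & loaded),
    }


# Hypothetical ids, for illustration only.
summary = compare_report_ids(
    expected_ids=["r-001", "r-002", "r-003"],
    loaded_ids=["r-001", "r-003", "r-999"],
)
```

An empty `missing` list and an empty `unexpected` list would indicate that the loaded set matches the export exactly. It says nothing about whether the row contents are correct.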

@gsa-suk
Contributor

gsa-suk commented Oct 25, 2024

10/25/24 -

Tested the 'Maintenance mode toggle' in Staging, and it worked as expected: with maintenance mode turned on, a SAC could not be created; with maintenance mode turned off, a SAC could be created.
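
The behavior tested here can be illustrated with a small, self-contained sketch: a flag that blocks new submissions while a bulk load is running. This is not the FAC implementation; the class and method names are hypothetical.

```python
class MaintenanceModeError(Exception):
    """Raised when a submission is attempted during maintenance."""


class SubmissionService:
    """Toy stand-in for the application's submission path."""

    def __init__(self):
        self.maintenance_mode = False
        self.submissions = []

    def create_sac(self, payload):
        # Refuse new records while maintenance mode is on.
        if self.maintenance_mode:
            raise MaintenanceModeError("Submissions are paused for maintenance.")
        self.submissions.append(payload)
        return len(self.submissions)


service = SubmissionService()
service.maintenance_mode = True
try:
    service.create_sac({"auditee": "example"})
    blocked = False
except MaintenanceModeError:
    blocked = True

service.maintenance_mode = False
count = service.create_sac({"auditee": "example"})
```

Blocking writes at a single entry point keeps the table stable while historic rows are loaded. That is one way to address the concern about loading data while submissions are being added.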

@gsa-suk
Contributor

gsa-suk commented Oct 30, 2024

10/29/24 -

Copied ~4.5 GB of SAC data files to Prod S3. #4421

@danswick danswick mentioned this issue Nov 26, 2024
@danswick
Contributor Author

SAC data load SSH record: #4527

@Leighdiddy

Closing as completed on 12/10/24

@github-project-automation github-project-automation bot moved this from In Progress to Done in FAC Dec 11, 2024
@Leighdiddy Leighdiddy moved this from Done to In Progress in FAC Dec 11, 2024
@gsa-suk gsa-suk reopened this Dec 11, 2024
@danswick
Contributor Author

@Leighdiddy this isn't quite complete yet. There are still a couple of smaller tables that need to be loaded.

@danswick
Contributor Author

Following #4540, this is done!

@github-project-automation github-project-automation bot moved this from In Progress to Done in FAC Dec 13, 2024
Projects
Status: Done
5 participants