Data removal
Before you get started, you'll need the following:
- a personal admin account for the production ID3C database
- a personal admin account for the testing ID3C database or access to the `postgres` user
Follow these steps to delete all data related to a single individual.
1. Log in to the production ID3C database with your admin account.
2. Find the individual connected to the provided sample or collection barcode.
   - If the barcode is not linked to any encounter data, move on to the steps for data removal of a sample.
3. Set the psql variable `individual` to the `individual.identifier`.
4. Check delete-individual.sql to ensure that it is up to date with the ID3C `receiving` and `warehouse` schemas.
   - Note that the script does not delete records from `receiving.presence_absence`. The only identifiers those results contain are the sample barcodes for this individual, and these are embedded alongside results for unrelated samples. To prevent the specific sample results from being re-processed later, the script removes the sample and collection identifiers from our `warehouse.identifier`. The presence/absence ETL will skip such results.
5. Run delete-individual.sql via the `\include` psql meta-command.
6. Verify the appropriate records have been deleted.
   - If something doesn't look right, run `rollback;` to undo all deletions.
7. Run `commit;` to commit the transaction and make all changes permanent.
8. Repeat steps 2-7 on the testing ID3C database, or refresh the testing database.
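The delete, verify, then roll back or commit flow in the steps above can be sketched as follows. This is only an illustration of the pattern, not the contents of delete-individual.sql: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for the ID3C Postgres database, and the tables, columns, identifiers, and helper functions are all hypothetical.

```python
import sqlite3

# Stand-in for the production database (assumption: SQLite instead of Postgres).
# isolation_level=None disables implicit transactions, so BEGIN/COMMIT/ROLLBACK
# are issued explicitly, much as you would type them in psql.
conn = sqlite3.connect(":memory:", isolation_level=None)

# Hypothetical, heavily simplified tables (not the real ID3C schema).
conn.executescript("""
    CREATE TABLE individual (identifier TEXT PRIMARY KEY);
    CREATE TABLE encounter (id INTEGER PRIMARY KEY, individual TEXT);
    INSERT INTO individual VALUES ('person-a'), ('person-b');
    INSERT INTO encounter (individual)
        VALUES ('person-a'), ('person-a'), ('person-b');
""")


def count_rows(conn, individual):
    """Count rows still tied to the individual across both tables."""
    (encounters,) = conn.execute(
        "SELECT count(*) FROM encounter WHERE individual = ?", (individual,)
    ).fetchone()
    (individuals,) = conn.execute(
        "SELECT count(*) FROM individual WHERE identifier = ?", (individual,)
    ).fetchone()
    return encounters + individuals


def delete_individual(conn, individual, looks_right):
    """Delete inside a transaction; commit only if verification passes."""
    conn.execute("BEGIN")
    conn.execute("DELETE FROM encounter WHERE individual = ?", (individual,))
    conn.execute("DELETE FROM individual WHERE identifier = ?", (individual,))
    if looks_right(conn, individual):
        conn.execute("COMMIT")    # changes become permanent
        return True
    conn.execute("ROLLBACK")      # every deletion is undone
    return False


# Verification fails on purpose: the deletions are rolled back.
rolled_back = delete_individual(conn, "person-a", lambda c, i: False)
remaining_after_rollback = count_rows(conn, "person-a")

# Verification passes: no rows remain for the individual, so commit.
committed = delete_individual(
    conn, "person-a", lambda c, i: count_rows(c, i) == 0
)
remaining_after_commit = count_rows(conn, "person-a")
unrelated_rows = count_rows(conn, "person-b")
```

The point of the sketch is that nothing is permanent until the commit: a failed check restores every row, and rows belonging to other individuals are never touched.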
Follow these steps to delete all data related to a single sample that is not associated with any encounter data.
1. Log in to the production ID3C database with your admin account.
2. Find the `sample.identifier` for the given sample or collection barcode.
3. Set the psql variable `sample` to the `sample.identifier`.
4. Check delete-sample.sql to ensure that it is up to date with the ID3C `receiving` and `warehouse` schemas.
   - Note that the script does not delete records from `receiving.presence_absence` or `receiving.fhir`. The only identifiers those results contain are the sample barcodes for this sample, and these are embedded alongside results for unrelated samples. To prevent the specific sample results from being re-processed later, the script removes the sample and collection identifiers from our `warehouse.identifier`. The presence/absence ETL and FHIR ETL will skip such results.
5. Run delete-sample.sql via the `\include` psql meta-command.
6. Verify the appropriate records have been deleted.
   - If something doesn't look right, run `rollback;` to undo all deletions.
7. Run `commit;` to commit the transaction and make all changes permanent.
8. Repeat steps 2-7 on the testing ID3C database, or refresh the testing database.
- Notify all devs to remove local copies of the database.
- AWS backups of the database naturally expire within a month.
- Notify/check in with upstream data sources (SCH, UW, BBI, NWGC).
  - The upstream specimen manifest does not need to be changed because the sample and collection identifiers will be removed from our `warehouse.identifier`. The manifest ETL will skip such results.
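
Upstream records can stay in place because the ETLs skip results whose barcodes are no longer known to `warehouse.identifier`. A minimal sketch of that idea follows, assuming a hypothetical record shape and an in-memory set of barcodes standing in for `warehouse.identifier`; the real ETL code is not shown on this page and may work differently.

```python
def process_results(results, known_barcodes):
    """Split received results into processed and skipped by barcode lookup."""
    processed, skipped = [], []
    for result in results:
        if result["barcode"] in known_barcodes:
            processed.append(result)
        else:
            # Barcode is no longer a known identifier: skip the result.
            skipped.append(result)
    return processed, skipped


# Hypothetical barcodes and results, for illustration only.
known_barcodes = {"AAAA1111", "BBBB2222"}
results = [
    {"barcode": "AAAA1111", "target": "flu-a", "present": True},
    {"barcode": "BBBB2222", "target": "flu-a", "present": False},
]

# Removing the identifier is what the deletion scripts are described as doing.
known_barcodes.discard("AAAA1111")

processed, skipped = process_results(results, known_barcodes)
```

After the identifier is removed, the result for the removed barcode lands in `skipped` while the unrelated sample is still processed, which is why the received batch itself does not have to be edited.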