make cost surface data migration idempotent [MRXN23-478] #1540

hotzevzl · 2023-10-06T17:13:22Z

For context: https://vizzuality.atlassian.net/browse/MRXN23-478

vercel · 2023-10-06T17:13:27Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Comments	Updated (UTC)
marxan	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Dec 6, 2023 2:44pm

hotzevzl · 2023-10-06T17:19:58Z

just an initial draft for now, right after running the data migration in my local dev env and on marxan23:

I think that using the scenario_id of an existing scenario as id of the corresponding cost surface created by copying over the scenario's current scenarios_pu_cost_data records should not do any harm, and actually can help us guarantee that if we were to re-run the data migration script again (with some further tweaks over the current draft of my changes: we'd need to turn some inserts into upserts), we would not scatter new cost surfaces all over the place (as we would if using gen_random_uuid()) but we'd "just" copy over the scenarios_pu_cost_data costs again
likewise for creating (or upserting, in a later commit) default cost surfaces using the project's project_id as the cost surface's id

Both these things should maybe handled by using a function to pick a UUIDv4 as a sort of hash (sort of what can be done in Java with this library method: https://stackoverflow.com/a/29059595), but I don't think it's worth the effort (of finding a Python equivalent and using it) over "just" syncing the ids

hotzevzl · 2023-10-06T17:23:15Z

The other thing I added in this initial draft is to avoid deleting orphaned scenarios - yes, scenarios with cost data in such a bad shape (=likely not even existing, for whatever reason) are going to be useless in practice, but thinking about doing this in staging and production, I think that by keeping around any scenarios with messed-up data, we wouldn't make things worse than they are, and on the other hand we get a chance to inspect and see what's up.

I've seen one such scenario when doing the data migration on m23 and that was a legacy project yadda yadda, plus there were 4 default cost surfaces for which we didn't have min/max data, which was likely because of some mess-up of the project grid (no PUs? some bug in some specific way of creating the grid so that no project_pus were created? I'll need to check more carefully next week).

… cost surface Avoid automatically deleting the scenario: we log the anomalous situation and arbitrarily set the cost surface's range to [0,0]. In practice, the scenario that uses such cost surface may have contained invalid or incomplete data to start with, so setting an arbitrary range may not really matter here (as in, it should not make things worse).

vercel bot deployed to Preview October 6, 2023 17:14 View deployment

hotzevzl requested review from KevSanchez and alexeh October 6, 2023 17:20

vercel bot deployed to Preview October 6, 2023 17:27 View deployment

hotzevzl added 3 commits December 6, 2023 14:41

use deterministic ids to help make the process idempotent

b28484e

add requirements.txt for cost surfaces data migration script

8111fd0

hotzevzl force-pushed the chore/data/MRXN23-478_make-cost-surface-data-migration-idempotent branch from fb78313 to 8111fd0 Compare December 6, 2023 14:42

vercel bot deployed to Preview December 6, 2023 14:44 View deployment

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

make cost surface data migration idempotent [MRXN23-478] #1540

make cost surface data migration idempotent [MRXN23-478] #1540

hotzevzl commented Oct 6, 2023

vercel bot commented Oct 6, 2023 •

edited

Loading

hotzevzl commented Oct 6, 2023

hotzevzl commented Oct 6, 2023

make cost surface data migration idempotent [MRXN23-478] #1540

Are you sure you want to change the base?

make cost surface data migration idempotent [MRXN23-478] #1540

Conversation

hotzevzl commented Oct 6, 2023

vercel bot commented Oct 6, 2023 • edited Loading

hotzevzl commented Oct 6, 2023

hotzevzl commented Oct 6, 2023

vercel bot commented Oct 6, 2023 •

edited

Loading