Do not store output share in report aggregations. #1299
Conversation
This was never necessary: when an output share is recovered, its contribution to the relevant batch aggregation is always recorded, and the collection process uses only the batch aggregation values. The output share was stored but never (meaningfully) read. Historically, this is an artifact of the initial implementation of the aggregation process, which predates the collection process; at the time of initial implementation, there was no batch aggregation to record aggregated values into.
I also dropped some code which would attempt to accumulate any aggregations that Finished during initialization, because it is impossible for an aggregation to finish during initialization (i.e. this was effectively dead code). I'll need to fix this in a smarter way in #1234, since the new aggregation scheme does allow aggregation to complete during initialization.
@@ -0,0 +1 @@
ALTER TABLE report_aggregations DROP COLUMN out_share; |
What's the migration plan here? I think there's no data in staging-dap-04's report_aggregations table, but will Janus start throwing errors if we deploy this migration ahead of deploying a new Janus? Or maybe it won't, since staging-dap-04 is idle?
Well, the current plan is to accept the errors, since as you say staging-dap-04 is effectively unused for the time being. Our current migrations effectively exercise the migration infrastructure, but may not fulfill all the requirements of a production migration -- for example, 20230417204528 would have required a data migration/backfill if there had been any data to migrate/backfill. (One might argue we should eventually yank releases that can't be migrated to higher release versions, except that I'm not sure anyone currently cares.)
That said, maybe that's not the right choice? What we really want is for this migration to run after the code in this PR is fully deployed. There are a couple of ways we could engineer this:
- Implement & document a way to split migrations into "pre-release" and "post-release" migrations, and phrase this as a "post-release" migration (see the sketch after this list).
- Just split the data migration part of the change to a separate PR, and manually wrangle it to be released with the next release after the code changes in this PR.
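To make the first option concrete, here is a rough sketch using entirely hypothetical directory and file names (no such convention exists in the repository today); the post-release script would contain exactly the change in this PR:
-- Hypothetical layout (assumed naming, for illustration only):
--   migrations/pre-release/20230501000000_example_pre_release_change.sql   -- applied before the new code is deployed
--   migrations/post-release/20230501000001_drop_out_share.sql              -- applied after the new code is fully deployed
-- Contents of the hypothetical post-release script, i.e. the same change this PR makes:
ALTER TABLE report_aggregations DROP COLUMN out_share;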
WDYT? I'd like the ability to engineer data migrations to be pre- or post-release, but that might be too much work to take on right now. Manually wrangling each post-release data change into a later release is easy to adopt as a policy, but finicky & somewhat error-prone in practice.
For now, my vote would be "accept whatever errors occur during the release process" since there is no practical downside to doing this, but I'm interested in your thoughts -- especially as to whether pre-/post-migrations are a good idea/feasible to implement.
I'm comfortable with temporarily breaking staging-dap-04 while we deploy this migration, because nobody is currently relying on that deployment, and thus we shouldn't take on extra work to maintain its availability.
But as a thought exercise, suppose we were doing this in a live production environment; I think what we'd do is this:
- Make a change to Janus that adds the migration script and makes code changes so that Janus can work with or without the migration applied (we can have confidence of this because our tests run against multiple schema versions)
- Deploy the new Janus
- Apply the migration
- Make another change to Janus that removes support for the old schema version
I don't think we need any additional features to differentiate such migrations. Rather, more complicated changes should be broken up into multiple deployable changes, with the sequencing of database migrations and code deploys determined on a case-by-case basis and then enforced by SUPPORTED_SCHEMA_VERSIONS checks at runtime and in tests. This change is uniquely simple, since we're deleting a write-only column. If we were deleting a non-null column that we were both reading and writing, the sequence could look something like the following (see the sketch after the list):
- Existing state: schema version 0 is deployed, and the deployed code only supports schema version 0
- Code deploy: stop reading from column, supports schema versions 0 and 1
- Schema migration, 0 => 1: make column nullable
- Code deploy: stop writing to column, supports schema versions 0, 1, and 2
- Schema migration, 1 => 2: delete column
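As a minimal sketch of the two schema migrations in that sequence, assuming a purely hypothetical non-null column example_col on report_aggregations (the column name is made up for illustration), the scripts could look like:
-- Migration 0 => 1: relax the NOT NULL constraint so that newer code can stop writing the column.
ALTER TABLE report_aggregations ALTER COLUMN example_col DROP NOT NULL;
-- Migration 1 => 2: drop the column once no deployed code reads or writes it.
ALTER TABLE report_aggregations DROP COLUMN example_col;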
Change LGTM, assuming tests pass.
Closes #1284.