Do not store output share in report aggregations. #1299

Merged: 3 commits from bran/drop-out-share into main, Apr 25, 2023

Conversation

branlwyd (Contributor):

This was never necessary: when an output share is recovered, its contribution to the relevant batch aggregation is always recorded, and the collection process solely uses the batch aggregation values. The output share was stored but never (meaningfully) read.

Historically, this is an artifact of the initial implementation of the aggregation process, which predates the collection process. At the time, there was no batch aggregation to record aggregated values into.

Closes #1284.
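
To make the data flow concrete, here is a minimal sketch of the two write paths that existed during aggregation. The SQL is hypothetical: aside from the out_share column dropped by this PR, the table and column names (batch_aggregations, aggregate_share, report_count, and the key columns in the WHERE clauses) are assumptions, not Janus's exact schema.

  -- Per-report write: record the recovered output share. Nothing ever read this
  -- value back, which is why the column can simply be dropped.
  UPDATE report_aggregations
  SET state = 'FINISHED', out_share = $1
  WHERE id = $2;

  -- Per-batch write: fold the output share's contribution into the running batch
  -- aggregation. This is the only value the collection process reads.
  UPDATE batch_aggregations
  SET aggregate_share = $3,            -- prior share combined with the new output share
      report_count = report_count + 1
  WHERE task_id = $4 AND batch_identifier = $5;

Since collection only ever consults the batch-level row, the per-report out_share write was pure overhead.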

@branlwyd branlwyd requested a review from a team as a code owner April 24, 2023 23:12
branlwyd (Contributor, Author):

I dropped some code that would attempt to accumulate any aggregations that reached the Finished state during initialization, because it is impossible for an aggregation to finish during initialization (i.e. this was effectively dead code).

I'll need to fix this in a smarter way in #1234, since the new aggregation scheme does allow aggregation to complete during initialization.

aggregator_core/src/datastore.rs (review thread; outdated, resolved)
@@ -0,0 +1 @@
ALTER TABLE report_aggregations DROP COLUMN out_share;
Contributor:

What's the migration plan here? I think there's no data in staging-dap-04's report_aggregations table, but will Janus start throwing errors if we deploy this migration ahead of deploying a new Janus? Or maybe it won't since staging-dap-04 is idle?

branlwyd (Contributor, Author):

Well, the current plan is to accept the errors, since, as you say, staging-dap-04 is effectively unused for the time being. Our current migrations effectively exercise the migration infra, but may not fulfill all the requirements of a production migration -- for example, 20230417204528 would have required a data migration/backfill if there were any data to be migrated/backfilled. (One might argue we should eventually yank releases that can't be migrated to higher release versions, except that I'm not sure anyone currently cares.)
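
A minimal, generic sketch of what such a backfill-style migration might look like; the table and column names here are hypothetical and are not taken from 20230417204528:

  -- Hypothetical backfill pattern; names are illustrative only.
  ALTER TABLE some_table ADD COLUMN new_field BYTEA;
  -- Backfill existing rows from data already present.
  UPDATE some_table SET new_field = old_field WHERE new_field IS NULL;
  -- Enforce the constraint only once every row has been populated.
  ALTER TABLE some_table ALTER COLUMN new_field SET NOT NULL;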

That said, maybe that's not the right choice? What we really want is for this migration to run after the code in this PR is fully deployed. There are a couple of ways we could engineer this:

  • Implement & document a way to split migrations into "pre-release" and "post-release" migrations, and phrase this as a "post-release" migration.
  • Just split the data migration part of the change to a separate PR, and manually wrangle it to be released with the next release after the code changes in this PR.

WDYT? I'd like the ability to engineer data migrations to be pre- or post-release, but this might be too much work immediately. Manually wrangling each data change that needs to happen post-release so that it is included in a later release is easy to choose as a policy, but finicky & somewhat error-prone.

For now, my vote would be "accept whatever errors occur during the release process" since there is no practical downside to doing this, but I'm interested in your thoughts -- especially as to whether pre-/post-migrations are a good idea/feasible to implement.

Contributor:

I'm comfortable with temporarily breaking staging-dap-04 while we deploy this migration, because nobody is currently relying on that deployment and thus we shouldn't take on extra work to maintain its availability.

But as a thought exercise, suppose we were doing this in a live production environment. I think what we'd do is this:

  • Make a change to Janus that adds the migration script and makes code changes so that Janus can work with or without the migration applied (we can have confidence of this because our tests run against multiple schema versions)
  • Deploy the new Janus
  • Apply the migration
  • Make another change to Janus that removes support for the old schema version

Collaborator:

I don't think we need any additional features to differentiate such migrations; rather, more complicated changes should be broken up into multiple deployable changes, with the sequencing of database migrations and code deploys determined on a case-by-case basis and then enforced by SUPPORTED_SCHEMA_VERSIONS checks at runtime and in tests. This change is uniquely simple, since we're deleting a write-only column. If we were deleting a non-null column that we were both reading and writing, the sequence could look something like the following (a minimal SQL sketch of the two migrations follows the list):

  • Existing state: schema version 0 is deployed, and the deployed code only supports schema version 0
  • Code deploy: stop reading from column, supports schema versions 0 and 1
  • Schema migration, 0 => 1: make column nullable
  • Code deploy: stop writing to column, supports schema versions 0, 1, and 2
  • Schema migration, 1 => 2: delete column
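
A minimal SQL sketch of the two migrations in that sequence, assuming a hypothetical table some_table and column old_field (the real statements would depend on the column being removed):

  -- Migration 0 => 1: make the column nullable, so that a later code deploy can
  -- stop writing it without violating the NOT NULL constraint.
  ALTER TABLE some_table ALTER COLUMN old_field DROP NOT NULL;

  -- Migration 1 => 2: once no deployed code reads or writes the column, drop it.
  ALTER TABLE some_table DROP COLUMN old_field;

The change in this PR skips the intermediate step because out_share was only ever written, never read, so the single DROP COLUMN migration at the top of this thread suffices.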

@branlwyd branlwyd requested a review from tgeoghegan April 25, 2023 18:27
tgeoghegan (Contributor) left a comment:

Change LGTM, assuming tests pass.

@branlwyd branlwyd merged commit c839ef0 into main Apr 25, 2023
@branlwyd branlwyd deleted the bran/drop-out-share branch April 25, 2023 19:30

Successfully merging this pull request may close these issues: Space optimization: don't store output shares.
3 participants