Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[0.6] Aggregation: repeat same request on retries. #2556

Merged
merged 8 commits into from
Feb 5, 2024

Conversation

branlwyd
Copy link
Contributor

@branlwyd branlwyd commented Jan 27, 2024

This is implemented by copying the necessary report information into the report aggregations at time of aggregation job creation. This means that client-report GC will not change the request that is made.

I plan to follow this up with a PR to move report scrubbing from the end of the first aggregation step to aggregation job creation. This would have been included in this PR, but this PR is already quite large.

See individual commits for more detail in some cases--aggregation job creation & driving test cases are enhanced somewhat.

Backport of #2516.

@branlwyd branlwyd requested a review from a team as a code owner January 27, 2024 01:06
@branlwyd
Copy link
Contributor Author

This should not be merged until #2555 is released.

Part of #2442.

@branlwyd branlwyd force-pushed the bran/0.6-agg-job-same-request-db branch from 9df2ee2 to 2bb030b Compare January 29, 2024 18:10
@branlwyd branlwyd force-pushed the bran/0.6-agg-job-same-request branch 2 times, most recently from 99f9cc4 to 1e8c518 Compare January 29, 2024 18:22
branlwyd and others added 3 commits January 29, 2024 10:37
This is implemented by copying the necessary report information into the report aggregations at time of aggregation job creation. This means that client-report GC will not change the request that is made.
@branlwyd branlwyd force-pushed the bran/0.6-agg-job-same-request branch from 1e8c518 to 35afcf0 Compare January 29, 2024 18:38
@branlwyd
Copy link
Contributor Author

(This is rebased & ready for review.)

@@ -99,7 +99,7 @@ macro_rules! supported_schema_versions {
// version is seen, [`Datastore::new`] fails.
//
// Note that the latest supported version must be first in the list.
supported_schema_versions!(3, 2, 1);
supported_schema_versions!(3);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll want to discuss in release notes that deploying whichever Janus version this PR goes out in is not backward compatible and that operators will want to work out a DB migration plan.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻, but as you mentioned in https://github.com/divviup/janus-ops/issues/1257, the DB migration is actually related to #2555 which is going out in this week's release. This PR will go out with the next release, once that migration has stabilized.

Base automatically changed from bran/0.6-agg-job-same-request-db to release/0.6 January 30, 2024 17:24
Copy link
Collaborator

@divergentdave divergentdave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. My biggest concern here is the upgrade path. The unit tests appear to exercise it, but it might be good to de-risk it further by running a heterogeneous load test, using this week's release image as the aggregation job creator, and this PR's code for the aggregation job driver.

Copy link
Contributor Author

@branlwyd branlwyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. My biggest concern here is the upgrade path. The unit tests appear to exercise it, but it might be good to de-risk it further by running a heterogeneous load test, using this week's release image as the aggregation job creator, and this PR's code for the aggregation job driver.

Great idea -- I'll set up a local load test to do just that before merging this PR.

@branlwyd
Copy link
Contributor Author

branlwyd commented Feb 5, 2024

LGTM. My biggest concern here is the upgrade path. The unit tests appear to exercise it, but it might be good to de-risk it further by running a heterogeneous load test, using this week's release image as the aggregation job creator, and this PR's code for the aggregation job driver.

I ran a local load test with the suggested configuration for about an hour, and I can see collection jobs completing with the expected values. I think this PR is about as good as it's gonna get, I'm going to go ahead and merge.

@branlwyd branlwyd merged commit f3bab98 into release/0.6 Feb 5, 2024
8 checks passed
@branlwyd branlwyd deleted the bran/0.6-agg-job-same-request branch February 5, 2024 21:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants