For fixed-size queries, we prompt the user to specify a max batch size. We'll probably always recommend that the user set min and max equal to each other, thus making this field meaningless.
1. Always set min_batch_size == max_batch_size. This simple approach should work well and always produces equally sized batches, which the Collector might appreciate.
2. Set max_batch_size slightly above min_batch_size, perhaps 110% of min_batch_size. This would allow a small fraction of the reports in a batch to fail aggregation without requiring an extra "tiny" aggregation job after all of the other aggregation jobs. That would decrease the number of aggregation jobs our system needs to handle and would (slightly) reduce the latency until aggregates are available when reports do fail to aggregate.
In previous discussions, I was leaning towards the former, but I now think we should take the max_batch_size = 110% of min_batch_size approach. (I'm curious whether any collectors would care about receiving same-sized batches. I think the answer is yes for at least some collectors, since this was one of the reasons that fixed-size was invented in the first place, but I do not know how common this requirement will be.)
In any case, I think we don't need to expose max_batch_size as a user-controllable parameter.
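To make the two policies concrete, here is a minimal sketch of deriving max_batch_size from min_batch_size instead of asking the user for it. The helper names `derive_max_batch_size` and `tolerable_failures` and the `headroom_percent` parameter are hypothetical and exist only for this illustration; they are not part of any existing API.

```python
def derive_max_batch_size(min_batch_size: int, headroom_percent: int = 10) -> int:
    """Hypothetical helper: derive max_batch_size from min_batch_size.

    headroom_percent=0 gives the equal-size policy (max == min);
    headroom_percent=10 gives the "110% of min" policy.
    Integer arithmetic is used to avoid floating-point rounding surprises.
    """
    if min_batch_size < 1:
        raise ValueError("min_batch_size must be at least 1")
    if headroom_percent < 0:
        raise ValueError("headroom_percent must be non-negative")
    # Ceiling of min_batch_size * headroom_percent / 100, in integers.
    extra = (min_batch_size * headroom_percent + 99) // 100
    return min_batch_size + extra


def tolerable_failures(min_batch_size: int, max_batch_size: int) -> int:
    """Number of reports that can fail aggregation in a batch that was
    filled up to max_batch_size while the batch still meets min_batch_size."""
    if max_batch_size < min_batch_size:
        raise ValueError("max_batch_size must be >= min_batch_size")
    return max_batch_size - min_batch_size
```

With min_batch_size = 100 and 10% headroom, the derived max is 110, so up to 10 reports could fail before an extra aggregation job is needed to reach the minimum. With 0% headroom, any single failure leaves the batch short.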
In situations where Clients add DP noise as a preprocessing step before report sharding, I think people may want min_batch_size == max_batch_size. That will make calibrating the per-Client noise easier.
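As an illustration of why a fixed batch size helps here, consider a simplified setting that is an assumption for this example only: each Client adds independent zero-mean Gaussian noise, and the goal is a total noise variance of $\sigma^2$ in the aggregate. The symbols $n$, $m$, and $\sigma^2$ are introduced just for this sketch.

```latex
% Each of n Clients adds independent noise with variance \sigma^2 / n,
% so the aggregate over exactly n reports has the target variance:
\operatorname{Var}\!\left(\sum_{i=1}^{n} Z_i\right)
  = n \cdot \frac{\sigma^2}{n} = \sigma^2 .

% If the batch instead contains m reports, with
% min\_batch\_size \le m \le max\_batch\_size, the aggregate variance is
\operatorname{Var}\!\left(\sum_{i=1}^{m} Z_i\right)
  = \frac{m}{n}\,\sigma^2 .
```

When min_batch_size == max_batch_size, $m = n$ is known in advance and the per-Client variance can be set exactly. Otherwise the per-Client noise has to be calibrated for one batch size (for example the minimum), and the total noise varies with the actual batch size.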
This may be a good case for providing one-size-fits-all behavior via the web console, and allowing more sophisticated choices for tasks created via API.
From #286 (comment).
For fixed-size queries, we prompt the user to specify a max batch size. We'll probably always recommend that the user set min and max equal to each other, thus making this field meaningless.
There's also an unspecified issue in the DAP specification about removing this ability entirely: https://www.ietf.org/archive/id/draft-ietf-ppm-dap-06.html#section-4.1.2-6