Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ping-pong topology fixes #281

Merged
merged 1 commit into from
Aug 30, 2023
Merged

Conversation

tgeoghegan
Copy link
Contributor

@tgeoghegan tgeoghegan commented Aug 3, 2023

A handful of fixes to the specification of the ping-pong aggregator
communication topology encountered during implementation.

  • Explicitly describe Python objects for ping-pong State
  • ping_pong_leader_init and _helper_init return Message, not bytes
  • ping_pong_transition always returns a Message
  • typo in ping_pong_transition
  • ping_pong_continued fails if state is not Continued
  • remove unused is_leader parameter from ping_pong_transition

@tgeoghegan tgeoghegan marked this pull request as draft August 3, 2023 23:26
@tgeoghegan
Copy link
Contributor Author

Downgrading to draft because I may find other stuff as I implement this and will tack on commits to this PR.

@tgeoghegan tgeoghegan force-pushed the timg/ping-pong-fixes branch from 6881879 to ac50c50 Compare August 18, 2023 17:29
@tgeoghegan tgeoghegan changed the title ping_pong_transition always returns a message Ping-pong topology fixes Aug 18, 2023
@tgeoghegan tgeoghegan marked this pull request as ready for review August 18, 2023 17:31
@tgeoghegan
Copy link
Contributor Author

This is ready for review now and corresponds to what I have implemented in divviup/libprio-rs#683. There are further changes I'd like to suggest to ping-pong, but more immediately, what I think we should do is review and merge this change, and then make a corresponding change to DAP that adapts to these new definitions. Then we can release DAP-06 and VDAF-07 and have published draft versions we can refer to in implementations. Further refinements to ping-pong can wait for subsequent drafts of the pair of documents.

@tgeoghegan
Copy link
Contributor Author

I think there's a problem with how aggregation parameters are handled in ping-pong. We have:

def ping_pong_leader_init(
            Vdaf,
            vdaf_verify_key: bytes[Vdaf.VERIFY_KEY_SIZE],
            agg_param: Vdaf.AggParam,
            nonce: bytes[Vdaf.NONCE_SIZE],
            public_share: bytes,
            input_share: bytes,
        ) -> tuple[State, Message]

But at the DAP layer, we always represent aggregation parameters as opaque byte buffers. For example:

struct {
  opaque agg_param<0..2^32-1>;
  PartialBatchSelector part_batch_selector;
  PrepareInit prepare_inits<1..2^32-1>;
} AggregationJobInitReq;

I added a commit here that resolves this by aligning ping-pong's handling of agg param with its handling of things like the public share or input shares: the arguments are now bytes and VDAFs are required to define encode/decode routines for their agg param. However we could also rewrite these interfaces to use the expressive types, and then make it DAP's responsibility to call encode/decode. I think VDAF has to require encode/decode routines no matter what, though.

Copy link
Collaborator

@cjpatton cjpatton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Just some editorial things.

draft-irtf-cfrg-vdaf.md Show resolved Hide resolved
draft-irtf-cfrg-vdaf.md Outdated Show resolved Hide resolved
draft-irtf-cfrg-vdaf.md Outdated Show resolved Hide resolved
draft-irtf-cfrg-vdaf.md Outdated Show resolved Hide resolved
@tgeoghegan
Copy link
Contributor Author

Corresponding DAP changes: ietf-wg-ppm/draft-ietf-ppm-dap#494

tgeoghegan pushed a commit to divviup/janus that referenced this pull request Aug 27, 2023
This change implements the DAP-05 ping-pong topology in which
aggregators take turns preprocessing prepare shares into prepare
messages. While this topology first appeared in DAP-05, this
implementation follows the changes in [1], which should appear in
DAP-06.

This change depends on the implementation of the VDAF ping-pong topology
added to crate `prio` in [2], which in turn conforms to the
specification added after VDAF-06 and further tweaked in [3] (we expect
this to be published soon as VDAF-07).

This commit makes some changes to what intermediate values are stored by
aggregators. In the case where an aggregator is continuing, it will have
computed a prepare state, a prepare message for the current round and a
prepare share for the next round. The existing implementation would
store all three objects in the database, significantly increasing the
per-report storage requirements. In particular, this makes things worse
for the Helper, which previously never needed to store a prepare share
because the Leader always took responsibility for combining prepare
shares.

To mitigate this, we instead have aggregators store a
`prio::ping_pong::topology::Transition`, which will contain a prepare
state and a prepare message (both of which are generally much smaller
than prepare shares), from which the next prepare state and importantly
prepare share can be recomputed.

The main benefit of this change is to reduce how many round trips
between aggregators are needed to prepare reports. Quite a few tests
used Prio3 but depended on having the leader or helper in the `Waiting`
state after running aggregation initialization. Accordingly, those tests
are changed to run Poplar1, which now takes 2 rounds.

[1]: ietf-wg-ppm/draft-ietf-ppm-dap#494
[2]: divviup/libprio-rs#683
[3]: cfrg/draft-irtf-cfrg-vdaf#281

Part of #1669
@cjpatton cjpatton self-requested a review August 29, 2023 17:21
Copy link
Collaborator

@cjpatton cjpatton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Let's way to merge until we've checked the corresponding changes in the VDAF spec. I also want to look at libprio's implementation.

Comment on lines +1257 to +1261
Note that there is no representation of the `Start` state as it is never
instantiated in the ping-pong topology.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems confusing, as we just said "the initial state is Start". It is true that the Start state is ephemeral, and doesn't have to be stored anywhere, but I think it would be clearer to just have class Start(State): pass declared and move on. We're just defining which fields are held by which states here, and don't provide advice on their serialization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You and Chris both think it makes more sense, and you're the editors, so I'll add it.

* `Vdaf.decode_agg_param(encoded: bytes) -> Vdaf.AggParam` decodes an
aggregation parameter.

* `Vdaf.encode_agg_param(agg_param: Vdaf.AggParam) -> bytes` encodes an
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should adjust the definition of encode_agg_param for Poplar1 in 8.2.6.6 to match this, so it takes in the aggregation parameter as a single tuple argument (rather than two separate arguments).

@cjpatton
Copy link
Collaborator

@tgeoghegan can you squash and summarize the changes in the commit message?

A handful of fixes to the specification of the ping-pong aggregator
communication topology encountered during implementation.

 - Explicitly describe Python objects for `State`
 - Describe VDAF methods for encoding and decoding aggregation param
 - `ping_pong` methods accept aggregation param as encoded bytes
 - `ping_pong` methods accept ping pong Messages as encoded bytes
 - `ping_pong` methods must encode `Message` into bytes before returning
   it
 - `ping_pong_continued` fails if `state` is not `Continued`
 - add distinct `pong_pong_leader_continued` and
   `ping_pong_helper_continued` functions instead of `is_leader` param
 - fix typo in `ping_pong_transition`
 - Poplar1's `encode_agg_param` aligns with `Vdaf.encode_agg_param`
@tgeoghegan tgeoghegan force-pushed the timg/ping-pong-fixes branch from ade3d18 to b8cfd93 Compare August 30, 2023 17:41
@cjpatton cjpatton merged commit 87eb78c into cfrg:main Aug 30, 2023
tgeoghegan pushed a commit to divviup/janus that referenced this pull request Aug 31, 2023
This change implements the DAP-05 ping-pong topology in which
aggregators take turns preprocessing prepare shares into prepare
messages. While this topology first appeared in DAP-05, this
implementation follows the changes in [1], which should appear in
DAP-06.

This change depends on the implementation of the VDAF ping-pong topology
added to crate `prio` in [2], which in turn conforms to the
specification added after VDAF-06 and further tweaked in [3] (we expect
this to be published soon as VDAF-07).

This commit makes some changes to what intermediate values are stored by
aggregators. In the case where an aggregator is continuing, it will have
computed a prepare state, a prepare message for the current round and a
prepare share for the next round. The existing implementation would
store all three objects in the database, significantly increasing the
per-report storage requirements. In particular, this makes things worse
for the Helper, which previously never needed to store a prepare share
because the Leader always took responsibility for combining prepare
shares.

To mitigate this, we instead have aggregators store a
`prio::ping_pong::topology::Transition`, which will contain a prepare
state and a prepare message (both of which are generally much smaller
than prepare shares), from which the next prepare state and importantly
prepare share can be recomputed.

The main benefit of this change is to reduce how many round trips
between aggregators are needed to prepare reports. Quite a few tests
used Prio3 but depended on having the leader or helper in the `Waiting`
state after running aggregation initialization. Accordingly, those tests
are changed to run Poplar1, which now takes 2 rounds.

[1]: ietf-wg-ppm/draft-ietf-ppm-dap#494
[2]: divviup/libprio-rs#683
[3]: cfrg/draft-irtf-cfrg-vdaf#281

Part of #1669
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants