Explicitly forbid mutating agg jobs, collections #471
Conversation
draft-ietf-ppm-dap.md
Outdated
@@ -1355,6 +1355,12 @@ On success, the Helper responds to the Leader with HTTP status code 201 Created
and a body consisting of the `AggregationJobResp`, with media type
"application/dap-aggregation-job-resp".

Aggregation job initialization MUST be idempotent. That is, further requests to
A few things to consider including:
- Mention that "further requests with a different body" would be grouped together by aggregation job ID (I suppose).
- We don't want to allow re-initialization -- so if a continuation step has occurred, we should no longer allow initialization.
(& identical suggestions for the collection job case, with aggregation job ID -> collection job ID)
> Mention that "further requests with a different body" would be grouped together by aggregation job ID (I suppose).
I think I know what you mean. Is this paragraph more clear?
Aggregation job initialization MUST be idempotent. That is, further requests to
initialize some aggregation job whose body is identical to the first request
MUST succeed. However, changing an aggregation job's parameters is illegal, so
further requests to initialize an existing aggregation job with different parameters
MUST fail with an HTTP client error status code.
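To illustrate the intended behavior (a minimal sketch, not from the draft or any implementation; the storage model and names like `handle_init` are hypothetical), a Helper could key jobs on the aggregation job ID and compare replays against the stored request body:

```rust
use std::collections::HashMap;

/// Hypothetical stored state for an aggregation job, keyed by job ID.
struct AggregationJob {
    /// Serialized body of the original initialization request.
    init_request_body: Vec<u8>,
}

enum InitOutcome {
    /// New job, or a byte-identical replay: respond 201 Created.
    Created,
    /// Same job ID but a different body: respond with an HTTP client error.
    ParametersChanged,
}

fn handle_init(
    jobs: &mut HashMap<String, AggregationJob>,
    job_id: &str,
    body: &[u8],
) -> InitOutcome {
    match jobs.get(job_id) {
        // Idempotent replay: the body matches the first request exactly.
        Some(job) if job.init_request_body.as_slice() == body => InitOutcome::Created,
        // Mutation attempt: same job ID, different parameters.
        Some(_) => InitOutcome::ParametersChanged,
        // First initialization under this job ID: store and accept.
        None => {
            jobs.insert(
                job_id.to_string(),
                AggregationJob { init_request_body: body.to_vec() },
            );
            InitOutcome::Created
        }
    }
}
```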
> We don't want to allow re-initialization -- so if a continuation step has occurred, we should no longer allow initialization.
So we're talking about a sequence of requests like this:
1. `PUT /tasks/{task-id}/aggregation_jobs/{aggregation-job-id}`
2. `POST /tasks/{task-id}/aggregation_jobs/{aggregation-job-id}`
3. `PUT /tasks/{task-id}/aggregation_jobs/{aggregation-job-id}`
When you say we should no longer allow initialization, do you mean that the request at (3) should fail with something like HTTP 400 Bad Request, or just that it shouldn't rewind or reset the state of the aggregation job? I think the request at (3) should still get a 201 Created response, because in many cases the Leader can then follow up with a POST to continue the aggregation job and gracefully resume processing it.
> Mention that "further requests with a different body" would be grouped together by aggregation job ID (I suppose).
I'm suggesting that the text "further requests to initialize an existing aggregation job with different parameters" is ambiguous, because it does not define how to determine whether a new request matches an existing aggregation job. I think the right way to determine this is by aggregation job ID: if the request includes an aggregation job ID matching an existing aggregation job, it is a repeated request (possibly with different parameters); if the request includes a unique aggregation job ID, it is a request for a new aggregation job. My suggestion is that we state that explicitly.
> When you say we should no longer allow initialization, do you mean that the request at (3) should fail with something like HTTP 400 Bad Request, or just that it shouldn't rewind or reset the state of the aggregation job?
Yes -- IMO request (3) should fail [and it shouldn't rewind/reset the state of the aggregation job]. I would suggest we don't allow re-initialization of existing aggregation jobs because we don't need to: if a further "continuation" request has been made, the Leader has definitely received the result of the prior "initialization" step. And if we allow aggregation jobs to be re-initialized, we will have to deal with "weird" issues like "what if the Leader drives an agg job all the way to completion, then re-initializes it, then drives it to completion once again -- should the output shares recovered by the second completion contribute to the eventual aggregate 'again'?" (probably not, but we'd need text to that effect, and implementations would have to add some complexity to deal with such issues)
Or, framing things in terms of idempotency, request (2) changes the state of the aggregation job, so afterwards re-sending request (1) has no expectation of idempotency, as it is a request to rewind the aggregation job -- we would at that point expect request (2) to be idempotent rather than request (1).
> I'm suggesting that the text "further requests to initialize an existing aggregation job with different parameters" is ambiguous, because it does not define how to determine whether a new request matches an existing aggregation job.
OK, I'll rewrite to be more explicit.
> Yes -- IMO request (3) should fail [and it shouldn't rewind/reset the state of the aggregation job].
I'm coming around to your point of view: reviewing the changes that will land in #393, it appears we don't allow the `AggregationJobResp` a Leader gets in response to initialization to ever contain `PrepareResp`s in the finished state, so as it stands, we can't allow `PUT /tasks/{task-id}/aggregation_jobs/{aggregation-job-id}` to succeed regardless of the aggregation job's state. I was thinking that maybe an implementation could have multiple aggregation job workers racing to initialize and continue agg jobs, but the implementation would have to have a pretty serious state synchronization bug to have one worker step the job past round 1 while the other still thinks the job needs to be initialized, so I don't think it's worth having the specification bend over backwards to accommodate this.
In short: requests to initialize an aggregation job that has been continued at least once are invalid.
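Framed as code, the agreed rule might look like this minimal sketch (again hypothetical; the coarse `JobState` below is an illustration, not the draft's wire-level state machine):

```rust
/// Hypothetical coarse per-job states.
#[derive(PartialEq)]
enum JobState {
    Initialized,
    Continued,
    Finished,
}

/// A repeated initialization request may only succeed as a byte-identical
/// replay, and only before any continuation step has been processed.
fn may_reinitialize(state: &JobState, stored_body: &[u8], body: &[u8]) -> bool {
    *state == JobState::Initialized && stored_body == body
}
```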
Force-pushed from 1df6370 to a354bfe
draft-ietf-ppm-dap.md
Outdated
@@ -1871,6 +1884,14 @@ The Leader then begins working with the Helper to aggregate the reports
satisfying the query (or continues this process, depending on the VDAF) as
described in {{aggregate-flow}}.

Collection job creation MUST be idempotent. That is, multiple requests to
Does it matter if the task has expired and the Leader has deleted its collection jobs?
Bump
I don't understand how task expiration complicates this. We already have this text:
> Upon receipt of a `CollectionReq`, the Leader begins by checking that it recognizes the task ID in the request path. If not, it MUST abort with error `unrecognizedTask`.
So if a task expires, the Leader should refuse to allow new collection jobs to be created for it, regardless of whether the request has been repeated.
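As a hypothetical sketch of that ordering (mirroring the earlier examples), the task check runs before any idempotency logic can apply, so a request for an expired-and-deleted task fails the same way one for an unknown task does:

```rust
use std::collections::HashSet;

enum CollectionError {
    /// Corresponds to the draft's `unrecognizedTask` error.
    UnrecognizedTask,
}

/// An expired task would have been removed from `known_tasks`, so repeated
/// requests for it fail before the idempotency rule ever comes into play.
fn check_task(known_tasks: &HashSet<String>, task_id: &str) -> Result<(), CollectionError> {
    if known_tasks.contains(task_id) {
        Ok(())
    } else {
        Err(CollectionError::UnrecognizedTask)
    }
}
```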
draft-ietf-ppm-dap.md
Outdated
Collection job creation MUST be idempotent. That is, multiple requests to
`PUT /tasks/{tasks}/collection_jobs/{collection-job-id}` for the same
`collection-job-id` and the same request body MUST succeed. However, changing
a collection job's parameters is illegal, so further requests to
Does this conflict with `max_batch_query_count`? If a `collection-job-id` has been queried more than `max_batch_query_count` times, the Leader is supposed to fail the `CollectionReq` with `batchQueriedTooManyTimes`.
Bump
This does not conflict with max batch query count. Each of the queries against a given batch is a distinct collection with a distinct collection ID.
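A hypothetical sketch of why the two checks stay independent: idempotency is keyed on the collection job ID, while `max_batch_query_count` counts the distinct collection jobs that have queried a batch, so a replayed request never inflates the count:

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical bookkeeping: which collection jobs have queried each batch.
struct BatchQueries {
    max_batch_query_count: usize,
    /// Batch identifier -> distinct collection job IDs that queried it.
    queried_by: HashMap<String, HashSet<String>>,
}

impl BatchQueries {
    /// Replaying a known collection job ID never increases the count, so an
    /// idempotent retry cannot trip `batchQueriedTooManyTimes`; only a new
    /// collection job ID counts against `max_batch_query_count`.
    fn may_query(&self, batch_id: &str, collection_job_id: &str) -> bool {
        match self.queried_by.get(batch_id) {
            Some(ids) if ids.contains(collection_job_id) => true, // replay
            Some(ids) => ids.len() < self.max_batch_query_count,  // new query
            None => self.max_batch_query_count > 0,
        }
    }
}
```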
After thinking about this further, and having re-read what RFC 9110 already says about idempotence, safety, and the different HTTP verbs, I think this change isn't necessary. We already explicitly reference RFC 9110, and it's not DAP's job to reiterate HTTP concepts. Further, standards that build on HTTP, like ACME, get by without being so explicit about immutability and idempotence, so I don't think we need to go further than that protocol does.
I've refocused the change to be about the DAP-level semantics and to avoid restating HTTP semantics, which is confusing to readers. RFC 9110, among others, does a better job of explaining HTTP and its methods than DAP can.
Note also that once an aggregation job has been continued, it may not be initialized again. Resolves #406
Force-pushed from 37182c8 to bf7c585