diff --git a/draft-wang-ppm-dap-taskprov.md b/draft-wang-ppm-dap-taskprov.md index d61f362..498253c 100644 --- a/draft-wang-ppm-dap-taskprov.md +++ b/draft-wang-ppm-dap-taskprov.md @@ -46,7 +46,11 @@ informative: --- abstract An extension for the Distributed Aggregation Protocol (DAP) is specified that -allows the task configuration to be provisioned in-band. +cryptographically binds the parameters of a task to the task's execution. In +particular, when a client includes this extension with its report, the servers +will only aggregate the report if all parties agree on the task parameters. +This document also specifies an optional mechanism for in-band task +provisioning that makes use of the report extension. --- middle @@ -58,48 +62,42 @@ of a set of reports submitted by Clients. This process is centered around a "task" that determines, among other things, the cryptographic scheme to use for the secure computation (a Verifiable Distributed Aggregation Function {{!VDAF=I-D.draft-irtf-cfrg-vdaf-08}}), how reports are partitioned into -batches, and privacy parameters such as the minimum size of each batch. Before a -task can be executed, it is necessary to first provision the Clients, -Aggregators, and Collector with the task's configuration. - -The core DAP specification does not define a mechanism for provisioning tasks. -This document describes a mechanism designed to fill this gap. Its key feature -is that task configuration is performed completely in-band, via HTTP request -headers. - -This method presumes the existence of a logical "task author" (written as -"Author" hereafter) who is capable of pushing configurations to Clients. All -parameters required by downstream entities (the Aggregators and Collector) are -encoded in an extension field of the Client's report. There is no need for -out-of-band task orchestration between Leader and Helpers, therefore making -adoption of DAP easier. 
- -The extension is designed with the same security and privacy considerations of -the core DAP protocol. The Author is not regarded as a trusted third party: It -is incumbent on all protocol participants to verify the task configuration -disseminated by the Author and opt-out if the parameters are deemed insufficient -for privacy. In particular, adopters of this extension should presume the -Author is under the adversary's control. In fact, we expect in a real-world -deployment that the Author may be implemented by one of the Aggregators or -Collector. - -Finally, the DAP protocol requires configuring the entities with a variety of -assets that are not task-specific, but are important for establishing -Client-Aggregator, Collector-Aggregator, and Aggregator-Aggregator -relationships. These include: - -* The Collector's HPKE {{!RFC9180}} configuration used by the Aggregators to - encrypt aggregate shares. - -* Any assets required for authenticating HTTP requests. - -This document does not specify a mechanism for provisioning these assets; as in -the core DAP protocol; these are presumed to be configured out-of-band. - -Note that we consider the VDAF verification key {{!VDAF}}, used by the -Aggregators to aggregate reports, to be a task-specific asset. This document -specifies how to derive this key for a given task from a pre-shared secret, -which in turn is presumed to be configured out-of-band. +batches, and privacy parameters such as the minimum size of each batch. See +{{Section 4.2 of !DAP}} for a complete listing. + +In order to execute a task securely, it is required that all parties agree on +all parameters associated with the task. However, the core DAP specification +does not specify a mechanism for accomplishing this. In particular, it is +possible that the parties successfully aggregate and collect a batch, but some +party does not know the parameters that were enforced. 
+ +A desirable property for DAP to guarantee is that successful execution implies +agreement on the task parameters. On the other hand, disagreement between a +Client and the Aggregators should prevent reports uploaded by that Client from +being processed. + +{{definition}} specifies a report extension ({{Section 4.4.3 of !DAP}}) that +endows DAP with this property. First, it specifies an encoding of all task +parameters that are relevant to all parties. This excludes cryptographic +assets, such as the secret VDAF verification key ({{Section 5 of !VDAF}}) or +the public HPKE configurations {{!RFC9180}} of the aggregators or collector. +Second, the task ID is computed by hashing the encoded parameters. If a report +includes the extension, then each aggregator checks if the task ID was computed +properly: if not, it rejects the report. This cryptographic binding of the task +to its parameters ensures that the report is only processed if the client and +aggregator agree on the task parameters. + +One reason this task-binding property is desirable is that it makes the process +by which parties are provisioned with task parameters more robust. This is +because misconfiguration of a party would manifest in a server's telemetry as +report rejection. This is preferable to failing silently, as misconfiguration +could result in privacy loss. + +{{taskprov}} specifies one possible mechanism for provisioning DAP tasks that +is built on top of the extension in {{definition}}. Its chief design goal is to +make task configuration completely in-band, via HTTP request headers. Note that +this mechanism is an optional feature of this specification; it is not required +to implement the protocol extension in {{definition}}. # Conventions and Definitions @@ -120,28 +118,67 @@ Task configuration: : The non-secret parameters of a task. Task author: -: The entity that defines a task's configuration. 
+: The entity that defines a task's configuration in the provisioning mechanism of {{taskprov}}. # The Taskprov Extension {#definition} -The process of provisioning a task begins when the Author disseminates the task -configuration to the Collector and each of the Clients. When a Client issues an -upload request to the Leader (as described in {{Section 4.3 of !DAP}}), it -includes in an HTTP header the task configuration it used to generate the -report. We refer to this process as "task advertisement". Before consuming the -report, the Leader parses the configuration and decides whether to opt-in; if -not, the task's execution halts. +To use the Taskprov extension, the Client includes the following extension in the +report extensions for each Aggregator as described in {{Section 4.4.3 of !DAP}}: -Otherwise, if the Leader does opt-in, it advertises the task to the Helpers -during the aggregation protocol ({{Section 4.4 of !DAP}}). In particular, it -includes the task configuration in an HTTP header of each aggregation job -request for that task. Before proceeding, the Helper must first parse the -configuration and decide whether to opt-in; if not, the task's execution halts. +[RFC EDITOR: Change this to the IANA-assigned codepoint.] -To advertise a task to its peer, a Taskprov participant includes a header -"dap-taskprov" with a request incident to the task execution. The value is the -`TaskConfig` structure defined below, expanded into its URL-safe, unpadded Base -64 representation as specified in {{Sections 5 and 3.2 of !RFC4648}}. +~~~ +enum { + taskprov(0xff00), + (65535) +} ExtensionType; +~~~ + +The payload of the extension MUST be empty. If the payload is non-empty, then +the Aggregator MUST reject the report. + +When the Client uses the Taskprov extension, it computes the task ID ({{Section +4.2 of !DAP}}) as follows: + +~~~ +task_id = SHA-256(task_config) +~~~ + +where `task_config` is a `TaskConfig` structure defined in {{task-encoding}}. 
+Function SHA-256() is as defined in {{SHS}}. + +The task ID is bound to each report share (via HPKE authenticated and +associated data, see {{Section 4.4.2 of !DAP}}). Binding the parameters to the +ID this way ensures, in turn, that the report is only aggregated if the Client +and Aggregator agree on the parameters. This is accomplished by the Aggregator +behavior below. + +During aggregation ({{Section 4.5 of !DAP}}), each Aggregator processes a +report with the Taskprov extension as follows. + +First, it looks up the ID and parameters associated with the task. Note the +task has already been configured. Otherwise the Aggregator would have already +aborted the request due to not recognizing the task. + +Next, the Aggregator encodes the parameters as a `TaskConfig` defined in +{{task-encoding}} and computes the task ID as above. If the derived task ID +does not match the task ID of the request, then it MUST reject the report with +error + +[RFC EDITOR: Change this to the IANA-assigned codepoint.] + +enum { + unrecognized_task(254), + (255) +} PrepareError; + +During the upload flow ({{Section 4.4 of !DAP}}), the Leader MAY abort the +request with "unrecognizedTask" if the derived task ID does not match the task +ID of the request. + +## Task Encoding + +The task configuration is encoded as follows: ~~~ struct { @@ -168,11 +205,13 @@ struct { ~~~ The purpose of `TaskConfig` is to define all parameters that are necessary for -configuring an Aggregator. It includes all the fields to be associated with a +configuring each party. It includes all the fields to be associated with a task. In addition to the Aggregator endpoints, maximum batch query count, and task expiration, the structure includes an opaque `task_info` field that is specific to a deployment. For example, this can be a string describing the -purpose of this task. 
It does not include cryptographic assets shared by only a +subset of the parties, including the secret VDAF verification key {{!VDAF}} or +public HPKE configurations {{!RFC9180}}. The opaque `query_config` field defines the DAP query configuration used to guide batch selection. Its content is structured as follows: @@ -195,11 +234,10 @@ can be decoded even if an unrecognized variant is encountered (i.e., an unimplemented query type). The maximum batch size for `fixed_size` query is optional. If `query_type` is -`fixed_size` and `max_batch_size` is 0, Aggregator should provision the task -without maximum batch size limit. Which means during batch validation -({{Section 4.6.5.2.2 of !DAP}}), Aggregator does not check -`len(X) <= max_batch_size`, where `X` is the set of reports successfully -aggregated into the batch. +`fixed_size` and `max_batch_size` is 0, then the task does not have maximum +batch size limit. In particular, during batch validation ({{Section 4.6.5.2.2 +of !DAP}}), the Aggregator does not check `len(X) <= max_batch_size`, where `X` +is the set of reports successfully aggregated into the batch. The `vdaf_config` defines the configuration of the VDAF in use for this task. Its content is as follows (codepoints are as defined in {{!VDAF}}): @@ -246,6 +284,7 @@ differential privacy (DP). The opaque `dp_config` contains the following structu enum { reserved(0), /* Reserved for testing purposes */ none(1), + aggregator_discrete_gaussian(5), (255) } DpMechanism; @@ -253,6 +292,9 @@ struct { DpMechanism dp_mechanism; select (DpConfig.dp_mechanism) { case none: Empty; + case aggregator_discrete_gaussian: + RealNumber sigma; + RealNumber sensititivity; }; } DpConfig; ~~~ @@ -268,16 +310,73 @@ unimplemented DP mechanism). The definition of `Time`, `Duration`, `Url`, and `QueryType` follow those in {{!DAP}}. 
-## Deriving the Task ID {#construct-task-id} -When using the Taskprov extension, the task ID is computed as follows: +# In-band Task Provisioning with the Taskprov Extension {#taskprov} -~~~ -task_id = SHA-256(task_config) -~~~ +XXX + +Before a +task can be executed, it is necessary to first provision the Clients, +Aggregators, and Collector with the task's configuration. + +The core DAP specification does not define a mechanism for provisioning tasks. +This document describes a mechanism designed to fill this gap. Its key feature +is that task configuration is performed completely in-band, via HTTP request +headers. + +This method presumes the existence of a logical "task author" (written as +"Author" hereafter) who is capable of pushing configurations to Clients. All +parameters required by downstream entities (the Aggregators and Collector) are +encoded in an extension field of the Client's report. There is no need for +out-of-band task orchestration between Leader and Helpers, therefore making +adoption of DAP easier. + +The extension is designed with the same security and privacy considerations of +the core DAP protocol. The Author is not regarded as a trusted third party: It +is incumbent on all protocol participants to verify the task configuration +disseminated by the Author and opt-out if the parameters are deemed insufficient +for privacy. In particular, adopters of this extension should presume the +Author is under the adversary's control. In fact, we expect in a real-world +deployment that the Author may be implemented by one of the Aggregators or +Collector. + +Finally, the DAP protocol requires configuring the entities with a variety of +assets that are not task-specific, but are important for establishing +Client-Aggregator, Collector-Aggregator, and Aggregator-Aggregator +relationships. These include: + +* The Collector's HPKE {{!RFC9180}} configuration used by the Aggregators to + encrypt aggregate shares. 
+ +* Any assets required for authenticating HTTP requests. + +This document does not specify a mechanism for provisioning these assets; as in +the core DAP protocol; these are presumed to be configured out-of-band. + +Note that we consider the VDAF verification key {{!VDAF}}, used by the +Aggregators to aggregate reports, to be a task-specific asset. This document +specifies how to derive this key for a given task from a pre-shared secret, +which in turn is presumed to be configured out-of-band. + +The process of provisioning a task begins when the Author disseminates the task +configuration to the Collector and each of the Clients. When a Client issues an +upload request to the Leader (as described in {{Section 4.3 of !DAP}}), it +includes in an HTTP header the task configuration it used to generate the +report. We refer to this process as "task advertisement". Before consuming the +report, the Leader parses the configuration and decides whether to opt-in; if +not, the task's execution halts. + +Otherwise, if the Leader does opt-in, it advertises the task to the Helpers +during the aggregation protocol ({{Section 4.4 of !DAP}}). In particular, it +includes the task configuration in an HTTP header of each aggregation job +request for that task. Before proceeding, the Helper must first parse the +configuration and decide whether to opt-in; if not, the task's execution halts. + +To advertise a task to its peer, a Taskprov participant includes a header +"dap-taskprov" with a request incident to the task execution. The value is the +`TaskConfig` structure defined below, expanded into its URL-safe, unpadded Base +64 representation as specified in {{Sections 5 and 3.2 of !RFC4648}}. -where `task_config` is the `TaskConfig` structure disseminated by the Author. -Function SHA-256() is as defined in {{SHS}}. 
## Deriving the VDAF Verification Key {#vdaf-verify-key} @@ -306,7 +405,7 @@ verify_key = HKDF-Expand( ~~~ where `taskprov_salt` is defined to be the SHA-256 hash of the octet string -"dap-taskprov" and `task_id` is as defined in {{construct-task-id}}. Functions +"dap-taskprov" and `task_id` is as defined in {{definition}}. Functions HKDF-Extract() and HKDF-Expand() are as defined in {{!RFC5869}}. Both functions are instantiated with SHA-256. @@ -315,7 +414,7 @@ are instantiated with SHA-256. Prior to participating in a task, each protocol participant must determine if the `TaskConfig` disseminated by the Author can be configured. The participant is said to "opt in" to the task if the derived task ID (see -{{construct-task-id}}) corresponds to an already configured task or the task ID +{{definition}}) corresponds to an already configured task or the task ID is unrecognized and therefore corresponds to a new task. A protocol participant MAY "opt out" of a task if: @@ -343,7 +442,7 @@ In DAP, Clients need to know the HPKE configuration of each Aggregator before sending reports. (See HPKE Configuration Request in {{!DAP}}.) However, in a DAP deployment that supports the Taskprov extension, if a Client requests the Aggregator's HPKE configuration with the task ID computed as described in -{{construct-task-id}}, the task ID may not be configured in the Aggregator yet, +{{definition}}, the task ID may not be configured in the Aggregator yet, because the Aggregator is still waiting for the task to be advertised by a Client. @@ -369,7 +468,7 @@ Once the client opts into a task, it may begin uploading reports for the task. Each upload request for that task MUST advertise the task configuration. The extension codepoint `taskprov` MUST be offered in the `extensions` field of both Leader and Helper's `PlaintextInputShare`. In addition, each report's task -ID MUST be computed as described in {{construct-task-id}}. +ID MUST be computed as described in {{definition}}. 
The `taskprov` extension type is defined as follows: @@ -399,7 +498,7 @@ the "dap-taskprov" HTTP header payload. If parsing fails, it MUST abort with Next, it checks that the task ID indicated by the upload request matches the task ID derived from the extension payload as specified in -{{construct-task-id}}. If the task ID does not match, then the Leader MUST abort +{{definition}}. If the task ID does not match, then the Leader MUST abort with "unrecognizedTask". The Leader then decides whether to opt in to the task as described in @@ -449,7 +548,7 @@ First, the Helper attempts to parse payload of the "dap-taskprov" HTTP header. If this step fails, the Helper MUST abort with "invalidMessage". Next, the Helper checks that the task ID indicated in the upload request matches -the task ID derived from the `TaskConfig` as defined in {{construct-task-id}}. +the task ID derived from the `TaskConfig` as defined in {{definition}}. If not, the Helper MUST abort with "unrecognizedTask". Next, the Helper decides whether to opt in to the task as described in @@ -480,6 +579,8 @@ If the Leader responds to a collect request with an "unrecognizedTask" error, the Collector MAY retry its collect request after waiting for a duration. header. + + # Security Considerations This document has the same security and privacy considerations as the core DAP