WIP
cjpatton committed Jun 12, 2024
1 parent 1ddcb35 commit 7838b84
Showing 1 changed file with 181 additions and 80 deletions.
261 changes: 181 additions & 80 deletions draft-wang-ppm-dap-taskprov.md
@@ -46,7 +46,11 @@ informative:
--- abstract

An extension for the Distributed Aggregation Protocol (DAP) is specified that
allows the task configuration to be provisioned in-band.
cryptographically binds the parameters of a task to the task's execution. In
particular, when a client includes this extension with its report, the servers
will only aggregate the report if all parties agree on the task parameters.
This document also specifies an optional mechanism for in-band task
provisioning that makes use of the report extension.


--- middle
@@ -58,48 +62,42 @@ of a set of reports submitted by Clients. This process is centered around a
"task" that determines, among other things, the cryptographic scheme to use for
the secure computation (a Verifiable Distributed Aggregation Function
{{!VDAF=I-D.draft-irtf-cfrg-vdaf-08}}), how reports are partitioned into
batches, and privacy parameters such as the minimum size of each batch. Before a
task can be executed, it is necessary to first provision the Clients,
Aggregators, and Collector with the task's configuration.

The core DAP specification does not define a mechanism for provisioning tasks.
This document describes a mechanism designed to fill this gap. Its key feature
is that task configuration is performed completely in-band, via HTTP request
headers.

This method presumes the existence of a logical "task author" (written as
"Author" hereafter) who is capable of pushing configurations to Clients. All
parameters required by downstream entities (the Aggregators and Collector) are
encoded in an extension field of the Client's report. There is no need for
out-of-band task orchestration between Leader and Helpers, therefore making
adoption of DAP easier.

The extension is designed with the same security and privacy considerations as
the core DAP protocol. The Author is not regarded as a trusted third party: It
is incumbent on all protocol participants to verify the task configuration
disseminated by the Author and opt out if the parameters are deemed insufficient
for privacy. In particular, adopters of this extension should presume the
Author is under the adversary's control. In fact, we expect in a real-world
deployment that the Author may be implemented by one of the Aggregators or
Collector.

Finally, the DAP protocol requires configuring the entities with a variety of
assets that are not task-specific, but are important for establishing
Client-Aggregator, Collector-Aggregator, and Aggregator-Aggregator
relationships. These include:

* The Collector's HPKE {{!RFC9180}} configuration used by the Aggregators to
encrypt aggregate shares.

* Any assets required for authenticating HTTP requests.

This document does not specify a mechanism for provisioning these assets; as in
the core DAP protocol, these are presumed to be configured out-of-band.

Note that we consider the VDAF verification key {{!VDAF}}, used by the
Aggregators to aggregate reports, to be a task-specific asset. This document
specifies how to derive this key for a given task from a pre-shared secret,
which in turn is presumed to be configured out-of-band.
batches, and privacy parameters such as the minimum size of each batch. See
{{Section 4.2 of !DAP}} for a complete listing.

In order to execute a task securely, it is required that all parties agree on
all parameters associated with the task. However, the core DAP specification
does not specify a mechanism for accomplishing this. In particular, it is
possible that the parties successfully aggregate and collect a batch, but some
party does not know the parameters that were enforced.

A desirable property for DAP to guarantee is that successful execution implies
agreement on the task parameters. On the other hand, disagreement between a
Client and the Aggregators should prevent reports uploaded by that Client from
being processed.

{{definition}} specifies a report extension ({{Section 4.4.3 of !DAP}}) that
endows DAP with this property. First, it specifies an encoding of all task
parameters that are relevant to all parties. This excludes cryptographic
assets, such as the secret VDAF verification key ({{Section 5 of !VDAF}}) or
the public HPKE configurations {{!RFC9180}} of the Aggregators or Collector.
Second, the task ID is computed by hashing the encoded parameters. If a report
includes the extension, then each Aggregator checks if the task ID was computed
properly: if not, it rejects the report. This cryptographic binding of the task
to its parameters ensures that the report is only processed if the Client and
Aggregator agree on the task parameters.

One reason this task-binding property is desirable is that it makes the process
by which parties are provisioned with task parameters more robust. This is
because misconfiguration of a party would manifest in a server's telemetry as
report rejection. This is preferable to failing silently, as misconfiguration
could result in privacy loss.

{{taskprov}} specifies one possible mechanism for provisioning DAP tasks that
is built on top of the extension in {{definition}}. Its chief design goal is to
make task configuration completely in-band, via HTTP request headers. Note that
this mechanism is an optional feature of this specification; it is not required
to implement the protocol extension in {{definition}}.

# Conventions and Definitions

@@ -120,28 +118,67 @@ Task configuration:
: The non-secret parameters of a task.

Task author:
: The entity that defines a task's configuration.
: The entity that defines a task's configuration in the provisioning mechanism of {{taskprov}}.

# The Taskprov Extension {#definition}

The process of provisioning a task begins when the Author disseminates the task
configuration to the Collector and each of the Clients. When a Client issues an
upload request to the Leader (as described in {{Section 4.3 of !DAP}}), it
includes in an HTTP header the task configuration it used to generate the
report. We refer to this process as "task advertisement". Before consuming the
report, the Leader parses the configuration and decides whether to opt in; if
not, the task's execution halts.
To use the Taskprov extension, the Client includes the following extension in the
report extensions for each Aggregator as described in {{Section 4.4.3 of !DAP}}:

Otherwise, if the Leader does opt in, it advertises the task to the Helpers
during the aggregation protocol ({{Section 4.4 of !DAP}}). In particular, it
includes the task configuration in an HTTP header of each aggregation job
request for that task. Before proceeding, the Helper must first parse the
configuration and decide whether to opt in; if not, the task's execution halts.
[RFC EDITOR: Change this to the IANA-assigned codepoint.]

To advertise a task to its peer, a Taskprov participant includes a header
"dap-taskprov" with a request incident to the task execution. The value is the
`TaskConfig` structure defined below, expanded into its URL-safe, unpadded Base
64 representation as specified in {{Sections 5 and 3.2 of !RFC4648}}.
~~~
enum {
taskprov(0xff00),
(65535)
} ExtensionType;
~~~

The payload of the extension MUST be empty. If the payload is non-empty, then
the Aggregator MUST reject the report.
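
As a rough illustration of the empty-payload rule, the following Python sketch
encodes the extension and validates a received one. It assumes the extension is
serialized as a 16-bit type followed by a 16-bit length-prefixed payload, as in
the core DAP extension encoding. The function names are hypothetical, and
`0xff00` is the placeholder codepoint shown above, not an assigned value.

```python
import struct

# Placeholder codepoint from this draft; to be replaced by the IANA-assigned value.
TASKPROV_EXTENSION_TYPE = 0xFF00


def encode_taskprov_extension() -> bytes:
    """Encode the taskprov extension with an empty payload.

    Assumed layout: uint16 extension type, then a uint16 payload length.
    """
    return struct.pack("!HH", TASKPROV_EXTENSION_TYPE, 0)


def is_valid_taskprov_extension(encoded: bytes) -> bool:
    """Return True only for a taskprov extension whose payload is empty."""
    if len(encoded) < 4:
        return False  # too short to hold the type and length fields
    ext_type, length = struct.unpack("!HH", encoded[:4])
    payload = encoded[4:]
    if len(payload) != length:
        return False  # malformed length prefix
    # A non-empty payload means the report must be rejected.
    return ext_type == TASKPROV_EXTENSION_TYPE and length == 0
```

An Aggregator following this sketch would reject the report whenever
`is_valid_taskprov_extension()` returns `False` for the taskprov entry.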

When the Client uses the Taskprov extension, it computes the task ID ({{Section
4.2 of !DAP}}) as follows:

~~~
task_id = SHA-256(task_config)
~~~

where `task_config` is a `TaskConfig` structure defined in {{task-encoding}}.
Function SHA-256() is as defined in {{SHS}}.
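
A minimal Python sketch of this computation, assuming the caller has already
serialized the `TaskConfig` into bytes (the serialization itself is not shown,
and the function name is hypothetical):

```python
import hashlib


def compute_task_id(encoded_task_config: bytes) -> bytes:
    """Derive the 32-byte task ID as SHA-256 over the encoded TaskConfig."""
    return hashlib.sha256(encoded_task_config).digest()
```

Because the ID is a hash of the full encoding, changing any byte of the
configuration yields a different task ID.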

The task ID is bound to each report share (via the HPKE associated data,
see {{Section 4.4.2 of !DAP}}). Binding the parameters to the
ID this way ensures, in turn, that the report is only aggregated if the Client
and Aggregator agree on the parameters. This is accomplished by the Aggregator
behavior below.

During aggregation ({{Section 4.5 of !DAP}}), each Aggregator processes a
report with the Taskprov extension as follows.

First, it looks up the ID and parameters associated with the task. Note that
the task has already been configured; otherwise, the Aggregator would have
already aborted the request for not recognizing the task.

Next, the Aggregator encodes the parameters as a `TaskConfig` defined in
{{task-encoding}} and computes the task ID as above. If the derived task ID
does not match the task ID of the request, then it MUST reject the report with
error

[RFC EDITOR: Change this to the IANA-assigned codepoint.]

enum {
unrecognized_task(254),
(255)
} PrepareError;

During the upload flow ({{Section 4.4 of !DAP}}), the Leader MAY abort the
request with "unrecognizedTask" if the derived task ID does not match the task
ID of the request.
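
The Aggregator-side check can be sketched in Python as follows. This is only an
illustration under assumptions: the Aggregator is taken to already hold its own
serialized `TaskConfig` for the task, and the function name is hypothetical.

```python
import hashlib
import hmac


def task_id_matches(request_task_id: bytes, local_encoded_task_config: bytes) -> bool:
    """Check that the request's task ID binds to the locally configured parameters.

    The Aggregator re-derives the task ID from its own encoding of the task
    parameters and compares it with the ID carried by the request.
    """
    derived = hashlib.sha256(local_encoded_task_config).digest()
    # compare_digest is used for convenience; the task ID is not secret.
    return hmac.compare_digest(derived, request_task_id)
```

If the function returns `False`, the Aggregator would reject the report with
the error described above, since the Client and Aggregator evidently disagree
on at least one task parameter.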

## Task Encoding

The task configuration is encoded as follows:

~~~
struct {
@@ -168,11 +205,13 @@ struct {
~~~

The purpose of `TaskConfig` is to define all parameters that are necessary for
configuring an Aggregator. It includes all the fields to be associated with a
configuring each party. It includes all the fields to be associated with a
task. In addition to the Aggregator endpoints, maximum batch query count, and
task expiration, the structure includes an opaque `task_info` field that is
specific to a deployment. For example, this can be a string describing the
purpose of this task.
purpose of this task. It does not include cryptographic assets shared by only a
subset of the parties, including the secret VDAF verification key {{!VDAF}} or
public HPKE configurations {{!RFC9180}}.

The opaque `query_config` field defines the DAP query configuration used to
guide batch selection. Its content is structured as follows:
@@ -195,11 +234,10 @@ can be decoded even if an unrecognized variant is encountered (i.e., an
unimplemented query type).

The maximum batch size for `fixed_size` query is optional. If `query_type` is
`fixed_size` and `max_batch_size` is 0, Aggregator should provision the task
without maximum batch size limit. Which means during batch validation
({{Section 4.6.5.2.2 of !DAP}}), Aggregator does not check
`len(X) <= max_batch_size`, where `X` is the set of reports successfully
aggregated into the batch.
`fixed_size` and `max_batch_size` is 0, then the task does not have a maximum
batch size limit. In particular, during batch validation ({{Section 4.6.5.2.2
of !DAP}}), the Aggregator does not check `len(X) <= max_batch_size`, where `X`
is the set of reports successfully aggregated into the batch.
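
The special meaning of zero can be sketched in Python as follows. The function
name is hypothetical, and only the upper-bound part of batch validation is
shown:

```python
def within_max_batch_size(num_reports: int, max_batch_size: int) -> bool:
    """Apply the upper-bound batch check for a fixed_size task.

    A max_batch_size of 0 means "no maximum", so the check is skipped.
    """
    if max_batch_size == 0:
        return True
    return num_reports <= max_batch_size
```

Here `num_reports` plays the role of `len(X)` in the text above.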

The `vdaf_config` defines the configuration of the VDAF in use for this task.
Its content is as follows (codepoints are as defined in {{!VDAF}}):
@@ -246,13 +284,17 @@ differential privacy (DP). The opaque `dp_config` contains the following structu
enum {
reserved(0), /* Reserved for testing purposes */
none(1),
aggregator_discrete_gaussian(5),
(255)
} DpMechanism;

struct {
DpMechanism dp_mechanism;
select (DpConfig.dp_mechanism) {
case none: Empty;
case aggregator_discrete_gaussian:
RealNumber sigma;
RealNumber sensitivity;
};
} DpConfig;
~~~
@@ -268,16 +310,73 @@ unimplemented DP mechanism).
The definition of `Time`, `Duration`, `Url`, and `QueryType` follow those in
{{!DAP}}.

## Deriving the Task ID {#construct-task-id}

When using the Taskprov extension, the task ID is computed as follows:
# In-band Task Provisioning with the Taskprov Extension {#taskprov}

~~~
task_id = SHA-256(task_config)
~~~
XXX

Before a
task can be executed, it is necessary to first provision the Clients,
Aggregators, and Collector with the task's configuration.

The core DAP specification does not define a mechanism for provisioning tasks.
This document describes a mechanism designed to fill this gap. Its key feature
is that task configuration is performed completely in-band, via HTTP request
headers.

This method presumes the existence of a logical "task author" (written as
"Author" hereafter) who is capable of pushing configurations to Clients. All
parameters required by downstream entities (the Aggregators and Collector) are
encoded in an extension field of the Client's report. There is no need for
out-of-band task orchestration between Leader and Helpers, therefore making
adoption of DAP easier.

The extension is designed with the same security and privacy considerations as
the core DAP protocol. The Author is not regarded as a trusted third party: It
is incumbent on all protocol participants to verify the task configuration
disseminated by the Author and opt out if the parameters are deemed insufficient
for privacy. In particular, adopters of this extension should presume the
Author is under the adversary's control. In fact, we expect in a real-world
deployment that the Author may be implemented by one of the Aggregators or
Collector.

Finally, the DAP protocol requires configuring the entities with a variety of
assets that are not task-specific, but are important for establishing
Client-Aggregator, Collector-Aggregator, and Aggregator-Aggregator
relationships. These include:

* The Collector's HPKE {{!RFC9180}} configuration used by the Aggregators to
encrypt aggregate shares.

* Any assets required for authenticating HTTP requests.

This document does not specify a mechanism for provisioning these assets; as in
the core DAP protocol, these are presumed to be configured out-of-band.

Note that we consider the VDAF verification key {{!VDAF}}, used by the
Aggregators to aggregate reports, to be a task-specific asset. This document
specifies how to derive this key for a given task from a pre-shared secret,
which in turn is presumed to be configured out-of-band.

The process of provisioning a task begins when the Author disseminates the task
configuration to the Collector and each of the Clients. When a Client issues an
upload request to the Leader (as described in {{Section 4.3 of !DAP}}), it
includes in an HTTP header the task configuration it used to generate the
report. We refer to this process as "task advertisement". Before consuming the
report, the Leader parses the configuration and decides whether to opt in; if
not, the task's execution halts.

Otherwise, if the Leader does opt in, it advertises the task to the Helpers
during the aggregation protocol ({{Section 4.4 of !DAP}}). In particular, it
includes the task configuration in an HTTP header of each aggregation job
request for that task. Before proceeding, the Helper must first parse the
configuration and decide whether to opt in; if not, the task's execution halts.

To advertise a task to its peer, a Taskprov participant includes a header
"dap-taskprov" with a request incident to the task execution. The value is the
`TaskConfig` structure defined below, expanded into its URL-safe, unpadded Base
64 representation as specified in {{Sections 5 and 3.2 of !RFC4648}}.
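
The header value encoding described above can be sketched in Python as follows.
The function names are hypothetical, and the input is assumed to be an already
serialized `TaskConfig`:

```python
import base64


def encode_taskprov_header(encoded_task_config: bytes) -> str:
    """Produce the "dap-taskprov" header value: URL-safe Base 64 without padding."""
    return base64.urlsafe_b64encode(encoded_task_config).rstrip(b"=").decode("ascii")


def decode_taskprov_header(value: str) -> bytes:
    """Recover the encoded TaskConfig from a "dap-taskprov" header value.

    Python's decoder expects padding, so it is restored before decoding.
    Note that this decoder is lenient about stray characters; a real
    implementation may want stricter validation.
    """
    padding = "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(value + padding)
```

For example, `encode_taskprov_header(b"\xfb\xff\x00")` yields `"-_8A"`, which
uses the URL-safe characters `-` and `_` in place of `+` and `/`.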

where `task_config` is the `TaskConfig` structure disseminated by the Author.
Function SHA-256() is as defined in {{SHS}}.

## Deriving the VDAF Verification Key {#vdaf-verify-key}

@@ -306,7 +405,7 @@ verify_key = HKDF-Expand(
~~~

where `taskprov_salt` is defined to be the SHA-256 hash of the octet string
"dap-taskprov" and `task_id` is as defined in {{construct-task-id}}. Functions
"dap-taskprov" and `task_id` is as defined in {{definition}}. Functions
HKDF-Extract() and HKDF-Expand() are as defined in {{!RFC5869}}. Both functions
are instantiated with SHA-256.
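
For readers unfamiliar with HKDF, the building blocks named here can be
sketched in Python using only the standard library. This shows HKDF-Extract and
HKDF-Expand per RFC 5869 instantiated with SHA-256, plus the `taskprov_salt`
constant. It deliberately does not show how the draft wires the inputs
together, since that is given by the formula in this section; the function
names are hypothetical.

```python
import hashlib
import hmac

HASH_LEN = 32  # output size of SHA-256 in bytes

# taskprov_salt: the SHA-256 hash of the octet string "dap-taskprov".
TASKPROV_SALT = hashlib.sha256(b"dap-taskprov").digest()


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869, Section 2.2): PRK = HMAC-Hash(salt, IKM)."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869, Section 2.3): derive `length` bytes from PRK."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF-Expand")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        # T(i) = HMAC-Hash(PRK, T(i-1) | info | i)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]
```

A production implementation would more likely rely on a vetted cryptographic
library than on a hand-written HKDF.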

@@ -315,7 +414,7 @@ are instantiated with SHA-256.
Prior to participating in a task, each protocol participant must determine if
the `TaskConfig` disseminated by the Author can be configured. The participant
is said to "opt in" to the task if the derived task ID (see
{{construct-task-id}}) corresponds to an already configured task or the task ID
{{definition}}) corresponds to an already configured task or the task ID
is unrecognized and therefore corresponds to a new task.

A protocol participant MAY "opt out" of a task if:
@@ -343,7 +442,7 @@ In DAP, Clients need to know the HPKE configuration of each Aggregator before
sending reports. (See HPKE Configuration Request in {{!DAP}}.) However, in a DAP
deployment that supports the Taskprov extension, if a Client requests the
Aggregator's HPKE configuration with the task ID computed as described in
{{construct-task-id}}, the task ID may not be configured in the Aggregator yet,
{{definition}}, the task ID may not be configured in the Aggregator yet,
because the Aggregator is still waiting for the task to be advertised by a
Client.

@@ -369,7 +468,7 @@ Once the client opts into a task, it may begin uploading reports for the task.
Each upload request for that task MUST advertise the task configuration. The
extension codepoint `taskprov` MUST be offered in the `extensions` field of
both Leader and Helper's `PlaintextInputShare`. In addition, each report's task
ID MUST be computed as described in {{construct-task-id}}.
ID MUST be computed as described in {{definition}}.

The `taskprov` extension type is defined as follows:

@@ -399,7 +498,7 @@ the "dap-taskprov" HTTP header payload. If parsing fails, it MUST abort with

Next, it checks that the task ID indicated by the upload request matches the
task ID derived from the extension payload as specified in
{{construct-task-id}}. If the task ID does not match, then the Leader MUST abort
{{definition}}. If the task ID does not match, then the Leader MUST abort
with "unrecognizedTask".

The Leader then decides whether to opt in to the task as described in
@@ -449,7 +548,7 @@ First, the Helper attempts to parse payload of the "dap-taskprov" HTTP header.
If this step fails, the Helper MUST abort with "invalidMessage".

Next, the Helper checks that the task ID indicated in the upload request matches
the task ID derived from the `TaskConfig` as defined in {{construct-task-id}}.
the task ID derived from the `TaskConfig` as defined in {{definition}}.
If not, the Helper MUST abort with "unrecognizedTask".

Next, the Helper decides whether to opt in to the task as described in
@@ -480,6 +579,8 @@ If the Leader responds to a collect request with an "unrecognizedTask" error,
the Collector MAY retry its collect request after waiting for a duration.
header.



# Security Considerations

This document has the same security and privacy considerations as the core DAP
