
Designing and understanding the privacy properties of our algorithm #24

MxmUrw opened this issue Jan 23, 2023 · 1 comment
MxmUrw commented Jan 23, 2023

There are two approaches for generating noise:

1. Sample from the continuous ("float") Gaussian distribution

2. Sample from the discrete gaussian distribution
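
For intuition, here is a naive sketch (Python/NumPy) of sampling a discrete Gaussian restricted to a bounded set of integers, which is the setting we are in (see the first question below). The function name `discrete_gaussian_bounded` and the parameters are made up for illustration; this is an enumeration sampler with float weights, not the exact samplers from the papers below, and not suitable for production use:

```python
import numpy as np

def discrete_gaussian_bounded(sigma, bound, rng):
    """Sample a discrete Gaussian restricted to the integers {-bound, ..., bound}.

    Naive enumeration sampler for illustration only: it uses floating-point
    probabilities and is neither exact nor cryptographically secure.
    """
    support = np.arange(-bound, bound + 1)
    weights = np.exp(-(support.astype(float) ** 2) / (2 * sigma ** 2))
    return rng.choice(support, p=weights / weights.sum())

rng = np.random.default_rng(0)
sample = discrete_gaussian_bounded(sigma=10.0, bound=100, rng=rng)
```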

Papers:

  1. Deep Learning with Differential Privacy
  2. The Distributed Discrete Gaussian Mechanism for Federated Learning with Secure Aggregation

Questions:

  • Our distribution is not defined on all integers, but only on a subset, since we need to prevent overflow in the finite field that encodes our fixed-point numbers (Understand how being in a finite field instead of on the integers affects the distribution #22).
  • Paper 2 applies randomized rounding when converting the gradient vector into fixed-point representation. It seems this is not done for the sake of the privacy guarantee: simply rounding towards 0 should be enough for privacy, because that way the norm stays at most 1, so the argument that the overall function is 1-sensitive still applies. But is this true? (A quick numeric check of the rounding claim follows below.)
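
A quick numeric check of the "rounding towards 0 keeps the norm at most 1" part of the question (Python/NumPy sketch; `round_towards_zero` and the bit width are made up for illustration):

```python
import numpy as np

def round_towards_zero(x, n_bits):
    """Round each component towards 0 to an n-bit fixed-point value in (-1, 1)."""
    scale = 2 ** (n_bits - 1)
    return np.trunc(x * scale) / scale

rng = np.random.default_rng(0)
g = rng.normal(size=10)
g = g / np.linalg.norm(g)            # clipped gradient, L2-norm exactly 1
r = round_towards_zero(g, n_bits=16)

# truncation can only shrink each component, so the norm cannot grow
assert np.all(np.abs(r) <= np.abs(g))
assert np.linalg.norm(r) <= 1.0
```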

ooovi commented Jan 31, 2023

this explains why approach 2 (sampling from the discrete gaussian distribution) results in a DP mechanism.

for all c clients:
   1. compute gradient and clip to L2-norm 1
   2. round towards 0 componentwise to get an n-bit fixed-point vector in (-1, 1)
   3. project to integers in (0, 2^n) using p(x) = 2^(n-1) * x + 2^(n-1) componentwise
4. sum all gradients
5. add discrete gaussian (0, sigma²) noise componentwise
6. reduce modulo the field size to obtain a field element
7. project back to float using pp(y) = 2^(1-n) * y - c
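
a plaintext end-to-end sketch of these steps in python/numpy (no secret sharing; the parameter names `n_bits`, `q`, `rng` and the function name are made up, and a rounded continuous gaussian stands in for a proper discrete gaussian sampler, so this is illustrative only):

```python
import numpy as np

def aggregate_with_noise(gradients, n_bits, sigma, q, rng):
    """Plaintext sketch of steps 1.-7. for c clients."""
    scale = 2 ** (n_bits - 1)
    c = len(gradients)

    encoded = []
    for g in gradients:
        g = g / max(1.0, np.linalg.norm(g))            # 1. clip to L2-norm 1
        g = np.trunc(g * scale) / scale                # 2. round towards 0, n-bit fixed point in (-1, 1)
        encoded.append((scale * g + scale).astype(np.int64))  # 3. p(x) = 2^(n-1)*x + 2^(n-1)

    total = np.sum(encoded, axis=0)                    # 4. sum all gradients
    noise = np.round(rng.normal(0, sigma, total.shape)).astype(np.int64)  # 5. stand-in for discrete gaussian
    y = (total + noise) % q                            # 6. field element
    y = np.where(y > q // 2, y - q, y)                 # interpret large residues as negatives
                                                       # (assumes q is much larger than any attainable sum)
    return 2.0 ** (1 - n_bits) * y - c                 # 7. pp(y) = 2^(1-n)*y - c
```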

privacy

steps 1.-4. describe a function to the integers. since each clipped input gradient satisfies $\lVert x\rVert_2 \leq 1$ because of step 1., exchanging any one clipped input gradient $x$ for any other $x'$ (and hence replacing any one entry in any client's data set by a different entry) changes the result of the computation up to step 4. by at most

$$\left\lVert \sum_{y \in X} \bar p(y) - \sum_{y \in X'} \bar p(y)\right\rVert_2 = \left\lVert \bar p(x) - \bar p(x')\right\rVert_2 \leq 2^{n-1} \cdot \left(\lVert x\rVert_2 + \lVert x'\rVert_2\right) \leq 2^n$$

where $\bar p$ is the componentwise application of the projection $p$ to the integers described above, and $X$ and $X'$ are the input gradient sets with $x$ and $x'$ exchanged and all other inputs the same. hence the sensitivity of steps 1.-4. is $2^n$. for any $\epsilon>0$ and $\sigma = \frac{2^n}{\epsilon}$, step 5. makes the whole procedure $\frac{1}{2} \epsilon^2$-zero-concentrated differentially private (https://arxiv.org/pdf/2004.00010.pdf, theorem 14 with all $\sigma_j = \sigma$). hence to obtain $\rho$-zCDP, we should choose $\sigma = \frac{2^n}{\sqrt{2\rho}}$.
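
a tiny helper for that calibration (python; `sigma_for_zcdp` is just an illustrative name):

```python
import math

def sigma_for_zcdp(rho, n_bits):
    """discrete gaussian scale for rho-zCDP with sensitivity 2^n: sigma = 2^n / sqrt(2*rho)."""
    return (2 ** n_bits) / math.sqrt(2 * rho)

# e.g. n = 16, rho = 0.5  ->  sigma = 2^16 = 65536.0
print(sigma_for_zcdp(0.5, 16))
```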

steps 6.-7. don't affect privacy due to the post-processing invariance of zCDP (https://arxiv.org/pdf/1605.02065.pdf, lemma 1.8).

we need to make sure we're doing the rounding as described in step 2 (#28).

also, this is the privacy guarantee for one training step only. zCDP composes nicely (https://arxiv.org/pdf/1605.02065.pdf, lemma 2.3), so performing $k$ iterations of the above algorithm on the same client data sets yields a $k \cdot \rho$-zCDP algorithm.
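
splitting a total budget across k steps then looks like this (sketch, assuming an equal budget per step):

```python
import math

def sigma_for_k_steps(total_rho, k, n_bits):
    """each of the k steps gets rho = total_rho / k, so sigma = 2^n * sqrt(k / (2 * total_rho))."""
    return (2 ** n_bits) / math.sqrt(2 * (total_rho / k))
```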

federation

steps 1.-3. are done locally by the clients. the resulting integers are secret shared among the aggregators, which verify that the clipping was done correctly using our vdaf. the aggregators perform steps 4.-7. on the ciphertext in a distributed manner, except for the noising in step 5., which each aggregator does with plaintext noise known only to itself. each aggregator has to add the full amount of noise, because the aggregators are assumed to be honest but curious: if one of them did not add all the noise necessary for the DP guarantee, the other could reconstruct a less-noised version of the result. the field used for this part of the computation is chosen large enough to avoid modulo wraparound in the sum computation, which allows us to compute the sensitivity as if the function were defined on the integers.
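
roughly, the share-and-aggregate flow looks like this (python sketch, not our actual vdaf; the function names are made up, and the rounded continuous gaussian again stands in for a proper discrete gaussian sampler):

```python
import numpy as np

def share(x, q, num_aggregators, rng):
    """additively secret-share an integer vector x over Z_q."""
    shares = [rng.integers(0, q, size=x.shape) for _ in range(num_aggregators - 1)]
    shares.append((x - np.sum(shares, axis=0)) % q)
    return shares

def aggregator_step(client_shares, sigma, q, rng):
    """one aggregator sums its shares and adds its own full-strength noise."""
    partial = np.sum(client_shares, axis=0) % q
    noise = np.round(rng.normal(0, sigma, partial.shape)).astype(np.int64)
    return (partial + noise) % q
```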
