From 7fba2e7a58f6f7ec8a59b1e60ca4843d24ee940f Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Wed, 31 May 2023 15:44:07 +0100 Subject: [PATCH 01/13] docs: IPIP for Delegated Routing Privacy Upgrade --- src/ipips/ipip-XXX.md | 79 +++++++++++++ src/routing/http-routing-reader-privacy-v1.md | 107 ++++++++++++++++++ 2 files changed, 186 insertions(+) create mode 100644 src/ipips/ipip-XXX.md create mode 100644 src/routing/http-routing-reader-privacy-v1.md diff --git a/src/ipips/ipip-XXX.md b/src/ipips/ipip-XXX.md new file mode 100644 index 000000000..094fced89 --- /dev/null +++ b/src/ipips/ipip-XXX.md @@ -0,0 +1,79 @@ +--- +title: "IPIP-XXX: HTTP Delegated Routing Reader Privacy Upgrade" +date: 2023-05-31 +ipip: ratified +editors: + - name: Andrew Gillis + github: gammazero + - name: Ivan Schasny + github: ischasny + - name: Masih Derkani + github: masih + - name: Will Scott + github: willscott +order: XXX +tags: ['ipips', 'routing', 'privacy', 'double hashing'] +--- + +## Summary + +This IPIP specifies new HTTP API for Privacy Preserving Delegated Content Routing provider lookups. + +## Motivation + +IPFS is currently lacking of many privacy protections. One of its main weak points lies in the lack +of privacy protections for the Content Routing subsystem. Currently neither Readers (clients accessing files) +nor Writers (hosts storing and distributing content) have much privacy with regard to content they publish or +consume. It is very easy for a Content Router or a Passive Observer to learn which file is requested by +which client during the routing process, as the potential adversary easily learns about the requested `CID`. +A curious actor could request the same `CID` and download the associated file to monitor the user’s behavior. +This is obviously undesirable and has been for some time now a strong request from the community. + +The latest upgrades to the DHT and IPNI have introduced Double Hashing - a technique that aims to better preserve Reader Privacy. 
+With Double Hashing in place Provider Records are encrypted and opaque to Content Routers. If presented with the original `CID` a +Content Router can decrypt the relevant Provider Records and serve them via the existing Delegated Routing API. +However in order to benefit from the privacy enhancement users need to change the way they interact with Content Routers, in particular: +- A second hash over the original `Multihash` must be used when looking up the content; +- Returned Provider Records are encrypted and must be decrypted by the client before using them; +- The client might choose to fetch additional encrypted Metadata from the Content Router. + +This new way of interaction can not be fullfilled by the existing API. This IPIP is an incremental improvement to the HTTP Delegated Routing API that adds +new endpoints for serving encrypted content. The original API can still be used for not Privacy Preserving lookups. + +Writer Privacy is out of scope of this IPIP and is going to be addressed separately. + +## Detailed design + +See the Delegated Routing Reader Privacy Upgrade spec (:cite[http-routing-reader-privacy-v1]) included with this IPIP. + +## Design rationale + +This API proposal makes the following changes: +- Adds new methods for looking up encrypted Provider Records and encrypted Metadata; +- Defines Hashing and Encryption functions and response payloads structure. + +There are no ideomatic changes to the API - all data formats, design rationale and principles outlined in the original [HTTP Delegated Routing IPIP](./ipip-0337.md) apply here. + +### User benefit + +With the new APIs users can protect themselves from: +- a malicious actor spying on the user by observing the user to Content Router traffic and then downloading the same data; +- the new API is a first step towards fully private HTTP Delegated Routing protocol that will eliminate IPNI as centralised observers. + +There are no other functional improvements. 
+ +### Compatibility + +#### Backwards Compatibility + +The new API will be implemented in [go-delegated-routing](https://github.com/ipfs/boxo/tree/main/routing/http) and will not introduce any breaking changes. +The API will be released in a new minor version. + +### Resources + +- [IPIP-272 (double hashed DHT)](https://github.com/ipfs/specs/pull/373/) +- [ipni#5 (reader privacy in indexers)](https://github.com/ipni/specs/pull/5) + +### Copyright + +Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). diff --git a/src/routing/http-routing-reader-privacy-v1.md b/src/routing/http-routing-reader-privacy-v1.md new file mode 100644 index 000000000..c2b79a064 --- /dev/null +++ b/src/routing/http-routing-reader-privacy-v1.md @@ -0,0 +1,107 @@ +--- +title: Routing V1 HTTP Delegated Routing Reader Privacy Upgrade +description: > + This specification describes Delegated Routing Reader Privacy Upgrade. It's an + incremental improvement to HTTP Delegated Routing API and inherits all of its + formats and design rationale. +date: 2023-05-31 +maturity: reliable +editors: + - name: Andrew Gillis + github: gammazero + - name: Ivan Schasny + github: ischasny + - name: Masih Derkani + github: masih + - name: Will Scott + github: willscott +order: 0 +tags: ['routing', 'double hashing', 'privacy'] +--- + +This specification describes a new HTTP API for Privacy Preserving Delegated Content Routing provider lookups. It's an extension to HTTP Delegated Routing API and inherits all of its formats and design rationale. + +## API Specification + +### Magic Values + + All salts below are 64-bytes long, and represent a string padded with `\x00`. 
+ + - `SALT_DOUBLEHASH = bytes("CR_DOUBLEHASH\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` + - `SALT_ENCRYPTIONKEY = bytes("CR_ENCRYPTIONKEY\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` + - `SALT_NONCE = bytes("CR_NONCE\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` + +### Glossary + +- **`enc`** is [AESGCM](https://en.wikipedia.org/wiki/Galois/Counter_Mode) encryption. The following notation will be used for the rest of the specification `enc(passphrase, nonce, payload)`. +- **`hash`** is [SHA256](https://en.wikipedia.org/wiki/SHA-2) hashing. +- **`||`** is concatenation of two values. +- **`deriveKey`** is deriving a 32-byte encryption key from a passphrase that is done as `hash(SALT_ENCRYPTIONKEY || passphrase)`. +- **`Nonce`** is a 12-byte nonce used as Initialization Vector (IV) for the AESGCM encryption. IPNI expects an explicit instruction to delete a record (comparing to the DHT where records expire). +Hence the IPNI server needs to be able to compare encrypted values without having to decrypt them as that would require a key that it is unaware of. +That means that the nonce has to be deterministically chosen so that `enc(passphrase, nonce, payload)` produces the same output for the same +`passpharase` + `payload` pair. Nonce must be calculated as `hash(SALT_NONCE || passphrase || len(payload) || payload)[:12]`, where `len(payload)` is +an 8-byte length of the `payload` encoded in Little Endian format. Choice of nonce is not enforced by the IPNI specification. 
The described approach will +be used while IPNI encrypts Advertisements on behaf of Publishers. However once Writer Privacy is implemented, the choice of nonce will be left up to the Publisher. +- **`CID`** is the [Content IDentifier](https://github.com/multiformats/cid). +- **`MH`** is the [Multihash](https://github.com/multiformats/multihash) contained in a `CID`. It corresponds to the +digest of a hash function over some content. `MH` is represented as a 32-byte array. +- **`HASH2`** is a second hash over the multihash. Second Hashes must be of `Multihash` format with `DBL_SHA_256` codec. +The digest must be calculated as `hash(SALT_DOUBLEHASH || MH)`. +- **`ProviderRecord`** is a Provider Record as described in the [HTTP Delegated Routing Specification](http-routing-v1.md). +- **`ProviderRecordKey`** is a concatentation of `peerID || contextID`. There is no need for explicitly encoding lengths as they are +already encoded as a part of the multihash format. +- **`EncProviderRecordKey`** is `Nonce || enc(deriveKey(multihash), Nonce, ProviderRecordKey)`. +- **`HashProviderRecordKey`** is a hash over `ProviderRecordKey` that must be calculated as `hash(SALT_DOUBLEHASH || ProviderRecordKey)`. +- **`Metadata`** is free form bytes that can represent such information such as IPNI metadata. +- **`EncMetadata`** is `Nonce || enc(deriveKey(ProviderRecordKey), Nonce, Metadata)`. + +### API + +Assembling a full `ProviderRecord` from the encrypted data will require multiple roundtrips to the server. The first one to fetch a list of `EncProviderRecordKey`s and then one per +`EncProviderRecordKey` to fetch `EncMetadata`. In order to reduce the number of roundtrips to one the client implementation should use the local libp2p peerstore for multiaddress discovery +and [libp2p multistream select](https://github.com/multiformats/multistream-select) for protocol negotiation. 
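A minimal sketch of the hashing helpers defined in the Glossary of this revision, written in Python with only the standard library. It assumes, as the Magic Values text states, that each salt is an ASCII label right-padded with `\x00` to 64 bytes. The function names (`make_salt`, `second_hash_digest`, `derive_key`, `derive_nonce`) are hypothetical and chosen for illustration; they are not taken from any implementation. The sketch computes raw SHA-256 digests only: it does not wrap `HASH2` in the `Multihash` format with the `DBL_SHA_256` codec, and it does not perform the AESGCM step.

```python
import hashlib
import struct

SALT_LEN = 64  # the spec text says all salts are 64 bytes long


def make_salt(label: str) -> bytes:
    """Right-pad an ASCII label with NUL bytes to 64 bytes (hypothetical helper)."""
    return label.encode("ascii").ljust(SALT_LEN, b"\x00")


SALT_DOUBLEHASH = make_salt("CR_DOUBLEHASH")
SALT_ENCRYPTIONKEY = make_salt("CR_ENCRYPTIONKEY")
SALT_NONCE = make_salt("CR_NONCE")


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def second_hash_digest(mh: bytes) -> bytes:
    """Digest part of HASH2: hash(SALT_DOUBLEHASH || MH)."""
    return sha256(SALT_DOUBLEHASH + mh)


def derive_key(passphrase: bytes) -> bytes:
    """deriveKey: hash(SALT_ENCRYPTIONKEY || passphrase), a 32-byte key."""
    return sha256(SALT_ENCRYPTIONKEY + passphrase)


def derive_nonce(passphrase: bytes, payload: bytes) -> bytes:
    """Nonce: hash(SALT_NONCE || passphrase || len(payload) || payload)[:12].

    len(payload) is encoded as an 8-byte little-endian integer.
    """
    length = struct.pack("<Q", len(payload))
    return sha256(SALT_NONCE + passphrase + length + payload)[:12]
```

Because the lookup digest and the encryption key are salted differently, knowing `second_hash_digest(mh)` does not reveal `derive_key(mh)`, and the deterministic nonce makes the ciphertext for a given passphrase and payload pair reproducible.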
+ +#### `GET /routing/v1/encrypted/providers/{HASH2}` + +##### Response codes + +- `200` (OK): the response body contains 0 or more records +- `404` (Not Found): must be returned if no matching records are found +- `422` (Unprocessable Entity): request does not conform to schema or semantic constraints + +##### Response Body + +```json +{ + "EncProviderRecordKeys": [ + "EBxdYDhd.....", + "IOknr9DK.....", + ] +} +``` + +Where: + +- `EncProviderRecordKeys` a list of base58 encoded `EncProviderRecordKey`; + +#### `GET /routing/v1/encrypted/metadata/{HashProviderRecordKey}` + +##### Response codes + +- `200` (OK): the response body contains 1 record +- `404` (Not Found): must be returned if no matching records are found +- `422` (Unprocessable Entity): request does not conform to schema or semantic constraints + +##### Response Body + +```json +{ + "EncMetadata": "EBxdYDhd....." +} +``` + +Where: + +- `EncMetadatas` is base58 encoded `EncMetadata`; + From f01558f29370368cb939ab6fafe09f8d24b2fd56 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Wed, 7 Jun 2023 14:25:31 +0100 Subject: [PATCH 02/13] Update with PR number --- src/ipips/ipip-XXX.md | 79 ------------------------------------------- 1 file changed, 79 deletions(-) delete mode 100644 src/ipips/ipip-XXX.md diff --git a/src/ipips/ipip-XXX.md b/src/ipips/ipip-XXX.md deleted file mode 100644 index 094fced89..000000000 --- a/src/ipips/ipip-XXX.md +++ /dev/null @@ -1,79 +0,0 @@ ---- -title: "IPIP-XXX: HTTP Delegated Routing Reader Privacy Upgrade" -date: 2023-05-31 -ipip: ratified -editors: - - name: Andrew Gillis - github: gammazero - - name: Ivan Schasny - github: ischasny - - name: Masih Derkani - github: masih - - name: Will Scott - github: willscott -order: XXX -tags: ['ipips', 'routing', 'privacy', 'double hashing'] ---- - -## Summary - -This IPIP specifies new HTTP API for Privacy Preserving Delegated Content Routing provider lookups. - -## Motivation - -IPFS is currently lacking of many privacy protections. 
One of its main weak points lies in the lack -of privacy protections for the Content Routing subsystem. Currently neither Readers (clients accessing files) -nor Writers (hosts storing and distributing content) have much privacy with regard to content they publish or -consume. It is very easy for a Content Router or a Passive Observer to learn which file is requested by -which client during the routing process, as the potential adversary easily learns about the requested `CID`. -A curious actor could request the same `CID` and download the associated file to monitor the user’s behavior. -This is obviously undesirable and has been for some time now a strong request from the community. - -The latest upgrades to the DHT and IPNI have introduced Double Hashing - a technique that aims to better preserve Reader Privacy. -With Double Hashing in place Provider Records are encrypted and opaque to Content Routers. If presented with the original `CID` a -Content Router can decrypt the relevant Provider Records and serve them via the existing Delegated Routing API. -However in order to benefit from the privacy enhancement users need to change the way they interact with Content Routers, in particular: -- A second hash over the original `Multihash` must be used when looking up the content; -- Returned Provider Records are encrypted and must be decrypted by the client before using them; -- The client might choose to fetch additional encrypted Metadata from the Content Router. - -This new way of interaction can not be fullfilled by the existing API. This IPIP is an incremental improvement to the HTTP Delegated Routing API that adds -new endpoints for serving encrypted content. The original API can still be used for not Privacy Preserving lookups. - -Writer Privacy is out of scope of this IPIP and is going to be addressed separately. - -## Detailed design - -See the Delegated Routing Reader Privacy Upgrade spec (:cite[http-routing-reader-privacy-v1]) included with this IPIP. 
- -## Design rationale - -This API proposal makes the following changes: -- Adds new methods for looking up encrypted Provider Records and encrypted Metadata; -- Defines Hashing and Encryption functions and response payloads structure. - -There are no ideomatic changes to the API - all data formats, design rationale and principles outlined in the original [HTTP Delegated Routing IPIP](./ipip-0337.md) apply here. - -### User benefit - -With the new APIs users can protect themselves from: -- a malicious actor spying on the user by observing the user to Content Router traffic and then downloading the same data; -- the new API is a first step towards fully private HTTP Delegated Routing protocol that will eliminate IPNI as centralised observers. - -There are no other functional improvements. - -### Compatibility - -#### Backwards Compatibility - -The new API will be implemented in [go-delegated-routing](https://github.com/ipfs/boxo/tree/main/routing/http) and will not introduce any breaking changes. -The API will be released in a new minor version. - -### Resources - -- [IPIP-272 (double hashed DHT)](https://github.com/ipfs/specs/pull/373/) -- [ipni#5 (reader privacy in indexers)](https://github.com/ipni/specs/pull/5) - -### Copyright - -Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). 
From f857f339a6326a541bf2321449690a221231f458 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Wed, 7 Jun 2023 14:25:50 +0100 Subject: [PATCH 03/13] Update with the PR number --- src/ipips/ipip-0421.md | 79 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 79 insertions(+) create mode 100644 src/ipips/ipip-0421.md diff --git a/src/ipips/ipip-0421.md b/src/ipips/ipip-0421.md new file mode 100644 index 000000000..d84ed8b4e --- /dev/null +++ b/src/ipips/ipip-0421.md @@ -0,0 +1,79 @@ +--- +title: "IPIP-0421: HTTP Delegated Routing Reader Privacy Upgrade" +date: 2023-05-31 +ipip: ratified +editors: + - name: Andrew Gillis + github: gammazero + - name: Ivan Schasny + github: ischasny + - name: Masih Derkani + github: masih + - name: Will Scott + github: willscott +order: XXX +tags: ['ipips', 'routing', 'privacy', 'double hashing'] +--- + +## Summary + +This IPIP specifies new HTTP API for Privacy Preserving Delegated Content Routing provider lookups. + +## Motivation + +IPFS is currently lacking of many privacy protections. One of its main weak points lies in the lack +of privacy protections for the Content Routing subsystem. Currently neither Readers (clients accessing files) +nor Writers (hosts storing and distributing content) have much privacy with regard to content they publish or +consume. It is very easy for a Content Router or a Passive Observer to learn which file is requested by +which client during the routing process, as the potential adversary easily learns about the requested `CID`. +A curious actor could request the same `CID` and download the associated file to monitor the user’s behavior. +This is obviously undesirable and has been for some time now a strong request from the community. + +The latest upgrades to the DHT and IPNI have introduced Double Hashing - a technique that aims to better preserve Reader Privacy. +With Double Hashing in place Provider Records are encrypted and opaque to Content Routers. 
If presented with the original `CID` a +Content Router can decrypt the relevant Provider Records and serve them via the existing Delegated Routing API. +However in order to benefit from the privacy enhancement users need to change the way they interact with Content Routers, in particular: +- A second hash over the original `Multihash` must be used when looking up the content; +- Returned Provider Records are encrypted and must be decrypted by the client before using them; +- The client might choose to fetch additional encrypted Metadata from the Content Router. + +This new way of interaction can not be fullfilled by the existing API. This IPIP is an incremental improvement to the HTTP Delegated Routing API that adds +new endpoints for serving encrypted content. The original API can still be used for not Privacy Preserving lookups. + +Writer Privacy is out of scope of this IPIP and is going to be addressed separately. + +## Detailed design + +See the Delegated Routing Reader Privacy Upgrade spec (:cite[http-routing-reader-privacy-v1]) included with this IPIP. + +## Design rationale + +This API proposal makes the following changes: +- Adds new methods for looking up encrypted Provider Records and encrypted Metadata; +- Defines Hashing and Encryption functions and response payloads structure. + +There are no ideomatic changes to the API - all data formats, design rationale and principles outlined in the original [HTTP Delegated Routing IPIP](./ipip-0337.md) apply here. + +### User benefit + +With the new APIs users can protect themselves from: +- a malicious actor spying on the user by observing the user to Content Router traffic and then downloading the same data; +- the new API is a first step towards fully private HTTP Delegated Routing protocol that will eliminate IPNI as centralised observers. + +There are no other functional improvements. 
+ +### Compatibility + +#### Backwards Compatibility + +The new API will be implemented in [go-delegated-routing](https://github.com/ipfs/boxo/tree/main/routing/http) and will not introduce any breaking changes. +The API will be released in a new minor version. + +### Resources + +- [IPIP-272 (double hashed DHT)](https://github.com/ipfs/specs/pull/373/) +- [ipni#5 (reader privacy in indexers)](https://github.com/ipni/specs/pull/5) + +### Copyright + +Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). From caede81dffa3608dc01d8d7799eff6a3ea019619 Mon Sep 17 00:00:00 2001 From: Ivan Schasny <31857042+ischasny@users.noreply.github.com> Date: Thu, 8 Jun 2023 10:22:03 +0100 Subject: [PATCH 04/13] Update src/ipips/ipip-0421.md Co-authored-by: Marcin Rataj --- src/ipips/ipip-0421.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/ipips/ipip-0421.md b/src/ipips/ipip-0421.md index d84ed8b4e..530c0ebc8 100644 --- a/src/ipips/ipip-0421.md +++ b/src/ipips/ipip-0421.md @@ -66,7 +66,7 @@ There are no other functional improvements. #### Backwards Compatibility -The new API will be implemented in [go-delegated-routing](https://github.com/ipfs/boxo/tree/main/routing/http) and will not introduce any breaking changes. +The `/routing/v1/encrypted/` API will be implemented in [`boxo/routing/http`](https://github.com/ipfs/boxo/tree/main/routing/http) and will not introduce any breaking changes to existing clear text endpoints. The API will be released in a new minor version. 
### Resources From 07967b336a8f88b37cbd0314a85ab31636b70c50 Mon Sep 17 00:00:00 2001 From: Ivan Schasny <31857042+ischasny@users.noreply.github.com> Date: Thu, 8 Jun 2023 10:23:04 +0100 Subject: [PATCH 05/13] Update src/routing/http-routing-reader-privacy-v1.md Co-authored-by: Marcin Rataj --- src/routing/http-routing-reader-privacy-v1.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/routing/http-routing-reader-privacy-v1.md b/src/routing/http-routing-reader-privacy-v1.md index c2b79a064..9fd2aef77 100644 --- a/src/routing/http-routing-reader-privacy-v1.md +++ b/src/routing/http-routing-reader-privacy-v1.md @@ -48,7 +48,7 @@ be used while IPNI encrypts Advertisements on behaf of Publishers. However once digest of a hash function over some content. `MH` is represented as a 32-byte array. - **`HASH2`** is a second hash over the multihash. Second Hashes must be of `Multihash` format with `DBL_SHA_256` codec. The digest must be calculated as `hash(SALT_DOUBLEHASH || MH)`. -- **`ProviderRecord`** is a Provider Record as described in the [HTTP Delegated Routing Specification](http-routing-v1.md). +- **`ProviderRecord`** is a JSON with Provider Record as described in the [HTTP Delegated Routing Specification](http-routing-v1.md). - **`ProviderRecordKey`** is a concatentation of `peerID || contextID`. There is no need for explicitly encoding lengths as they are already encoded as a part of the multihash format. - **`EncProviderRecordKey`** is `Nonce || enc(deriveKey(multihash), Nonce, ProviderRecordKey)`. 
From f76c87c9ac8a640655d5cf1131bb5b0293aca762 Mon Sep 17 00:00:00 2001 From: Ivan Schasny Date: Thu, 8 Jun 2023 11:50:47 +0100 Subject: [PATCH 06/13] Review comments --- src/ipips/ipip-0421.md | 15 +++++++ src/routing/http-routing-reader-privacy-v1.md | 42 +++++++++---------- 2 files changed, 36 insertions(+), 21 deletions(-) diff --git a/src/ipips/ipip-0421.md b/src/ipips/ipip-0421.md index 530c0ebc8..5dca44c36 100644 --- a/src/ipips/ipip-0421.md +++ b/src/ipips/ipip-0421.md @@ -66,9 +66,24 @@ There are no other functional improvements. #### Backwards Compatibility +Users will need to explicitly turn on Reader Privacy on their nodes. A new flag can be introduced to the Kubo's HTTP Delegated Content Router configuration to facilitate that functionality. +Users on older nodes can continue using the old API and turn on Reader Privacy at a later point. + +Content Routers should provide the same QoS for both Privacy Preserving and regular APIs. This is because both can be served over the same encrypted data. If presented with a regular CID, a Content Router +can perform decryption operations on behalf of the user (i.e. mimic the client logic) and return results in clear text. If presented with a second hash the Content Router can return encrypted results and let the +user to do decryption themselves. + +It's possible that not all Content Routers will adopt Reader Privacy. The default HTTP Delegated Router like `cid.contact` should have Reader Privacy enabled by default in the newer versions of Kubo / Helia. +Users should verify themselves whether a custom router of their choice supports Reader Privacy or not when configuring it. + The `/routing/v1/encrypted/` API will be implemented in [`boxo/routing/http`](https://github.com/ipfs/boxo/tree/main/routing/http) and will not introduce any breaking changes to existing clear text endpoints. The API will be released in a new minor version. 
+#### Forwards Compatibility + +Reader Privacy relies on usage of specific hashing and encryption functions. Function rotation will require a network-wide migration. Content Routers might not be able to migrate "under the hood" as they +don't possess the original values. Function rotation should be a very infrequent event and will require network-wide efforts. When function rotation is needed - a version of the API will be incremented. + ### Resources - [IPIP-272 (double hashed DHT)](https://github.com/ipfs/specs/pull/373/) diff --git a/src/routing/http-routing-reader-privacy-v1.md b/src/routing/http-routing-reader-privacy-v1.md index 9fd2aef77..8a032c590 100644 --- a/src/routing/http-routing-reader-privacy-v1.md +++ b/src/routing/http-routing-reader-privacy-v1.md @@ -28,8 +28,11 @@ This specification describes a new HTTP API for Privacy Preserving Delegated Con All salts below are 64-bytes long, and represent a string padded with `\x00`. - `SALT_DOUBLEHASH = bytes("CR_DOUBLEHASH\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` - - `SALT_ENCRYPTIONKEY = bytes("CR_ENCRYPTIONKEY\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` - - `SALT_NONCE = bytes("CR_NONCE\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` + - `SALT_DOUBLEHASH = bytes("CR_ENCRYPTIONKEY\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` + + Magic values are needed to calculate 
different digests from the same value for different purposes. For example a hash of a Multihash that is used for lookups should be different from the one that is used for + key derivation, even though both are calculated from the same original value. In order to do that the Multihash is concatenated with different magic values before applying the hash function - `SALT_DOUBLEHASH` + for lookups and `SALT_ENCRYPTIONKEY` for key derivation as described in the `Glossary`. ### Glossary @@ -37,36 +40,28 @@ This specification describes a new HTTP API for Privacy Preserving Delegated Con - **`hash`** is [SHA256](https://en.wikipedia.org/wiki/SHA-2) hashing. - **`||`** is concatenation of two values. - **`deriveKey`** is deriving a 32-byte encryption key from a passphrase that is done as `hash(SALT_ENCRYPTIONKEY || passphrase)`. -- **`Nonce`** is a 12-byte nonce used as Initialization Vector (IV) for the AESGCM encryption. IPNI expects an explicit instruction to delete a record (comparing to the DHT where records expire). -Hence the IPNI server needs to be able to compare encrypted values without having to decrypt them as that would require a key that it is unaware of. -That means that the nonce has to be deterministically chosen so that `enc(passphrase, nonce, payload)` produces the same output for the same -`passpharase` + `payload` pair. Nonce must be calculated as `hash(SALT_NONCE || passphrase || len(payload) || payload)[:12]`, where `len(payload)` is -an 8-byte length of the `payload` encoded in Little Endian format. Choice of nonce is not enforced by the IPNI specification. The described approach will -be used while IPNI encrypts Advertisements on behaf of Publishers. However once Writer Privacy is implemented, the choice of nonce will be left up to the Publisher. - **`CID`** is the [Content IDentifier](https://github.com/multiformats/cid). - **`MH`** is the [Multihash](https://github.com/multiformats/multihash) contained in a `CID`. 
It corresponds to the -digest of a hash function over some content. `MH` is represented as a 32-byte array. +digest of a hash function over some content. - **`HASH2`** is a second hash over the multihash. Second Hashes must be of `Multihash` format with `DBL_SHA_256` codec. The digest must be calculated as `hash(SALT_DOUBLEHASH || MH)`. - **`ProviderRecord`** is a JSON with Provider Record as described in the [HTTP Delegated Routing Specification](http-routing-v1.md). - **`ProviderRecordKey`** is a concatentation of `peerID || contextID`. There is no need for explicitly encoding lengths as they are -already encoded as a part of the multihash format. -- **`EncProviderRecordKey`** is `Nonce || enc(deriveKey(multihash), Nonce, ProviderRecordKey)`. +already encoded as a part of the multihash format. Max `contextID` length is 64 bytes. +- **`EncProviderRecordKey`** is `Nonce || enc(deriveKey(multihash), Nonce, ProviderRecordKey)`. Max `EncProviderRecordKey` is 200 bytes. - **`HashProviderRecordKey`** is a hash over `ProviderRecordKey` that must be calculated as `hash(SALT_DOUBLEHASH || ProviderRecordKey)`. -- **`Metadata`** is free form bytes that can represent such information such as IPNI metadata. -- **`EncMetadata`** is `Nonce || enc(deriveKey(ProviderRecordKey), Nonce, Metadata)`. - -### API +- **`Metadata`** is free form bytes that can represent such information such as IPNI metadata. Max `Metadata` length is 1024 bytes. +- **`EncMetadata`** is `Nonce || enc(deriveKey(ProviderRecordKey), Nonce, Metadata)`. Max `EncMetadata` length is 2000 bytes. -Assembling a full `ProviderRecord` from the encrypted data will require multiple roundtrips to the server. The first one to fetch a list of `EncProviderRecordKey`s and then one per -`EncProviderRecordKey` to fetch `EncMetadata`. 
In order to reduce the number of roundtrips to one the client implementation should use the local libp2p peerstore for multiaddress discovery -and [libp2p multistream select](https://github.com/multiformats/multistream-select) for protocol negotiation. +Note: maximum allowed lengths might change without incrementing the API version. Such fields as `contextID` or `Metadata` are free-form bytes and +their maximum lengths can be changed in the underlying protocols. +### API #### `GET /routing/v1/encrypted/providers/{HASH2}` ##### Response codes -- `200` (OK): the response body contains 0 or more records +- `200` (OK): the response body contains 1 or more records - `404` (Not Found): must be returned if no matching records are found - `422` (Unprocessable Entity): request does not conform to schema or semantic constraints @@ -83,7 +78,7 @@ and [libp2p multistream select](https://github.com/multiformats/multistream-sele Where: -- `EncProviderRecordKeys` a list of base58 encoded `EncProviderRecordKey`; +- `EncProviderRecordKeys` is a list of base64 encoded `EncProviderRecordKey`; #### `GET /routing/v1/encrypted/metadata/{HashProviderRecordKey}` @@ -103,5 +98,10 @@ Where: Where: -- `EncMetadatas` is base58 encoded `EncMetadata`; +- `EncMetadata` is base64 encoded `EncMetadata`; + +### Notes +Assembling a full `ProviderRecord` from the encrypted data will require multiple roundtrips to the server. The first one to fetch a list of `EncProviderRecordKey`s and then one per +`EncProviderRecordKey` to fetch `EncMetadata`. In order to reduce the number of roundtrips to one the client implementation should use the local libp2p peerstore for multiaddress discovery +and [libp2p multistream select](https://github.com/multiformats/multistream-select) for protocol negotiation. 
\ No newline at end of file From 0d2948ecdbfae9c1af42683a29ceef9ceb7f3796 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Fri, 16 Jun 2023 19:56:41 +0200 Subject: [PATCH 07/13] chore(ipip-421): add missing sections --- src/ipips/ipip-0421.md | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/src/ipips/ipip-0421.md b/src/ipips/ipip-0421.md index 5dca44c36..002ecd7a9 100644 --- a/src/ipips/ipip-0421.md +++ b/src/ipips/ipip-0421.md @@ -84,6 +84,27 @@ The API will be released in a new minor version. Reader Privacy relies on usage of specific hashing and encryption functions. Function rotation will require a network-wide migration. Content Routers might not be able to migrate "under the hood" as they don't possess the original values. Function rotation should be a very infrequent event and will require network-wide efforts. When function rotation is needed - a version of the API will be incremented. +### User benefit + +TODO: How will end users benefit from this work? + +### Security + +TODO: Explain the security implications/considerations relevant to the proposed change. + +### Alternatives + +TODO: Describe alternate designs that were considered and related work. + +- TODO: Oblivious HTTP ([IETF](https://www.ietf.org/archive/id/draft-thomson-http-oblivious-01.html), [Cloudflare](https://blog.cloudflare.com/stronger-than-a-promise-proving-oblivious-http-privacy-properties/)) + +## Test fixtures + +TODO: List relevant CIDs (if any). Describe how implementations can use them to determine +specification compliance. This section can be skipped if IPIP does not deal +with the way IPFS handles content-addressed data, or the modified specification +file already includes this information. 
+ ### Resources - [IPIP-272 (double hashed DHT)](https://github.com/ipfs/specs/pull/373/) From 6c76a33ed4df279ce6e2ba2dec79dc174b527028 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Fri, 16 Jun 2023 20:19:57 +0200 Subject: [PATCH 08/13] ipip-421: editorials --- src/ipips/ipip-0421.md | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/src/ipips/ipip-0421.md b/src/ipips/ipip-0421.md index 002ecd7a9..0f1ee1188 100644 --- a/src/ipips/ipip-0421.md +++ b/src/ipips/ipip-0421.md @@ -76,7 +76,7 @@ user to do decryption themselves. It's possible that not all Content Routers will adopt Reader Privacy. The default HTTP Delegated Router like `cid.contact` should have Reader Privacy enabled by default in the newer versions of Kubo / Helia. Users should verify themselves whether a custom router of their choice supports Reader Privacy or not when configuring it. -The `/routing/v1/encrypted/` API will be implemented in [`boxo/routing/http`](https://github.com/ipfs/boxo/tree/main/routing/http) and will not introduce any breaking changes to existing clear text endpoints. +The `/routing/v1/encrypted/` API will be implemented in existing libraries like [`boxo/routing/http`](https://github.com/ipfs/boxo/tree/main/routing/http) and will not introduce any breaking changes to existing clear text endpoints. The API will be released in a new minor version. #### Forwards Compatibility @@ -84,13 +84,9 @@ The API will be released in a new minor version. Reader Privacy relies on usage of specific hashing and encryption functions. Function rotation will require a network-wide migration. Content Routers might not be able to migrate "under the hood" as they don't possess the original values. Function rotation should be a very infrequent event and will require network-wide efforts. When function rotation is needed - a version of the API will be incremented. -### User benefit - -TODO: How will end users benefit from this work? 
- ### Security -TODO: Explain the security implications/considerations relevant to the proposed change. +See "Threat Modelling" section of TODO ### Alternatives From 2d800ad9d94fd00563f53af2e21269072b48f518 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Fri, 16 Jun 2023 20:23:34 +0200 Subject: [PATCH 09/13] ipip-421: fix typo --- src/routing/http-routing-reader-privacy-v1.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/routing/http-routing-reader-privacy-v1.md b/src/routing/http-routing-reader-privacy-v1.md index 8a032c590..694d828a4 100644 --- a/src/routing/http-routing-reader-privacy-v1.md +++ b/src/routing/http-routing-reader-privacy-v1.md @@ -28,11 +28,11 @@ This specification describes a new HTTP API for Privacy Preserving Delegated Con All salts below are 64-bytes long, and represent a string padded with `\x00`. - `SALT_DOUBLEHASH = bytes("CR_DOUBLEHASH\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` - - `SALT_DOUBLEHASH = bytes("CR_ENCRYPTIONKEY\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` + - `SALT_ENCRYPTIONKEY = bytes("CR_ENCRYPTIONKEY\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` Magic values are needed to calculate different digests from the same value for different purposes. For example a hash of a Multihash that is used for lookups should be different from the one that is used for key derivation, even though both are calculated from the same original value. 
In order to do that the Multihash is concatenated with different magic values before applying the hash funciton - `SALT_DOUBLEHASH` - for lookups and `SALT_DOUBLEHASH` for key derivation as described in the `Glossary`. + for lookups and `SALT_ENCRYPTIONKEY` for key derivation as described in the `Glossary`. ### Glossary From 18b258e791d63b492bfeb805eb0ba3a9305cfa2d Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Fri, 16 Jun 2023 20:39:28 +0200 Subject: [PATCH 10/13] ipip-421: editorials --- src/ipips/ipip-0421.md | 10 +++++----- src/routing/http-routing-reader-privacy-v1.md | 6 +++++- 2 files changed, 10 insertions(+), 6 deletions(-) diff --git a/src/ipips/ipip-0421.md b/src/ipips/ipip-0421.md index 0f1ee1188..3243a9e13 100644 --- a/src/ipips/ipip-0421.md +++ b/src/ipips/ipip-0421.md @@ -1,7 +1,7 @@ --- title: "IPIP-0421: HTTP Delegated Routing Reader Privacy Upgrade" date: 2023-05-31 -ipip: ratified +ipip: proposal editors: - name: Andrew Gillis github: gammazero @@ -11,7 +11,7 @@ editors: github: masih - name: Will Scott github: willscott -order: XXX +order: 421 tags: ['ipips', 'routing', 'privacy', 'double hashing'] --- @@ -52,7 +52,7 @@ This API proposal makes the following changes: - Adds new methods for looking up encrypted Provider Records and encrypted Metadata; - Defines Hashing and Encryption functions and response payloads structure. -There are no ideomatic changes to the API - all data formats, design rationale and principles outlined in the original [HTTP Delegated Routing IPIP](./ipip-0337.md) apply here. +There are no ideomatic changes to the API - all data formats, design rationale and principles outlined in the original :cite[ipip-0337] apply here. ### User benefit @@ -86,7 +86,7 @@ don't possess the original values. 
Function rotation should be a very infrequent ### Security -See "Threat Modelling" section of TODO +See "Threat Modelling" section of :cite[http-routing-reader-privacy-v1] ### Alternatives @@ -96,7 +96,7 @@ TODO: Describe alternate designs that were considered and related work. ## Test fixtures -TODO: List relevant CIDs (if any). Describe how implementations can use them to determine +TODO: List relevant CIDs or JSON payloads. Describe how implementations can use them to determine specification compliance. This section can be skipped if IPIP does not deal with the way IPFS handles content-addressed data, or the modified specification file already includes this information. diff --git a/src/routing/http-routing-reader-privacy-v1.md b/src/routing/http-routing-reader-privacy-v1.md index 694d828a4..1a349f634 100644 --- a/src/routing/http-routing-reader-privacy-v1.md +++ b/src/routing/http-routing-reader-privacy-v1.md @@ -53,9 +53,13 @@ already encoded as a part of the multihash format. Max `contextID` length is 64 - **`Metadata`** is free form bytes that can represent such information such as IPNI metadata. Max `Metadata` length is 1024 bytes. - **`EncMetadata`** is `Nonce || enc(deriveKey(ProviderRecordKey), Nonce, Metadata)`. Max `EncMetadata` length is 2000 bytes. -Note: maximum allowed lengths might change without incrementing the API version. Such fields as `contextID` or `Metadata` are free-form bytes and +:::note + +Maximum allowed lengths might change without incrementing the API version. Such fields as `contextID` or `Metadata` are free-form bytes and their maximum lengths can be changed in the underlying protocols. +::: + ### API #### `GET /routing/v1/encrypted/providers/{HASH2}` From 25242f6d3be7d3a4619c6c43b0ecf82bbe44ade5 Mon Sep 17 00:00:00 2001 From: "Masih H. 
Derkani" Date: Wed, 12 Jul 2023 15:46:48 +0100 Subject: [PATCH 11/13] Test write access to branch --- src/ipips/ipip-0421.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/ipips/ipip-0421.md b/src/ipips/ipip-0421.md index 3243a9e13..4b4a77e00 100644 --- a/src/ipips/ipip-0421.md +++ b/src/ipips/ipip-0421.md @@ -17,7 +17,7 @@ tags: ['ipips', 'routing', 'privacy', 'double hashing'] ## Summary -This IPIP specifies new HTTP API for Privacy Preserving Delegated Content Routing provider lookups. +This IPIP specifies a new HTTP API for Privacy Preserving Delegated Content Routing provider lookups. ## Motivation From b8942796e3ff07b82bf67e4f96fa6fadd3aea572 Mon Sep 17 00:00:00 2001 From: "Masih H. Derkani" Date: Wed, 12 Jul 2023 16:26:15 +0100 Subject: [PATCH 12/13] Rewrite IPIP document to reflect on comments and refine text * Refine paragraphs for better readability. * Change section on router selection based on code, since from multihash code alone we cannot determine whether the request is encrypted or not. * Update alternatives section to explain how the IPIP can be enhanced with OHTTP and Tor. --- src/ipips/ipip-0421.md | 86 ++++++++++++++++++------------------------ 1 file changed, 36 insertions(+), 50 deletions(-) diff --git a/src/ipips/ipip-0421.md b/src/ipips/ipip-0421.md index 4b4a77e00..74d92918c 100644 --- a/src/ipips/ipip-0421.md +++ b/src/ipips/ipip-0421.md @@ -17,94 +17,80 @@ tags: ['ipips', 'routing', 'privacy', 'double hashing'] ## Summary -This IPIP specifies a new HTTP API for Privacy Preserving Delegated Content Routing provider lookups. +This IPIP introduces an HTTP API designed for Privacy Preserving Delegated Content Routing provider lookups. ## Motivation -IPFS is currently lacking of many privacy protections. One of its main weak points lies in the lack -of privacy protections for the Content Routing subsystem.
Currently neither Readers (clients accessing files) -nor Writers (hosts storing and distributing content) have much privacy with regard to content they publish or -consume. It is very easy for a Content Router or a Passive Observer to learn which file is requested by -which client during the routing process, as the potential adversary easily learns about the requested `CID`. -A curious actor could request the same `CID` and download the associated file to monitor the user’s behavior. -This is obviously undesirable and has been for some time now a strong request from the community. +Currently, IPFS's privacy safeguards are notably deficient, particularly regarding the Content Routing subsystem. Neither Readers (clients who access files) nor Writers (hosts that store and distribute content) can maintain significant privacy related to the content they produce or consume. Presently, a Content Router or a Passive Observer can discern the identity of a file requested by a client and the specific client making the request during the routing process. This situation allows potential adversaries to gain knowledge about the requested CID. An interested party could then request the same CID and download the corresponding file to track the user's activities. Addressing these privacy concerns has been a long-standing demand from the community. -The latest upgrades to the DHT and IPNI have introduced Double Hashing - a technique that aims to better preserve Reader Privacy. -With Double Hashing in place Provider Records are encrypted and opaque to Content Routers. If presented with the original `CID` a -Content Router can decrypt the relevant Provider Records and serve them via the existing Delegated Routing API. 
-However in order to benefit from the privacy enhancement users need to change the way they interact with Content Routers, in particular: -- A second hash over the original `Multihash` must be used when looking up the content; -- Returned Provider Records are encrypted and must be decrypted by the client before using them; -- The client might choose to fetch additional encrypted Metadata from the Content Router. +Recent enhancements to the [IPFS DHT](https://github.com/ipfs/specs/pull/373) and [InterPlanetary Network Indexer (IPNI)](https://github.com/ipni/specs/pull/5) have incorporated Double Hashing to improve Reader Privacy. With Double Hashing, Provider Records become encrypted and non-transparent to Content Routers. Given the original CID, a Content Router can decrypt the relevant Provider Records and supply them through the existing Delegated Routing API. To make use of these privacy enhancements, users must modify their interactions with Content Routers by: -This new way of interaction can not be fullfilled by the existing API. This IPIP is an incremental improvement to the HTTP Delegated Routing API that adds -new endpoints for serving encrypted content. The original API can still be used for not Privacy Preserving lookups. +* Utilizing a secondary hash over the original Multihash during content lookup; +* Decrypting the returned, encrypted Provider Records prior to use; and +* Optionally retrieving additional encrypted Metadata from the Content Router. -Writer Privacy is out of scope of this IPIP and is going to be addressed separately. +Existing APIs cannot support these changes in interaction, necessitating this IPIP as a step to improve the HTTP Delegated Routing API. This proposal adds new endpoints for delivering encrypted content while maintaining the original API for non-privacy-preserving lookups. Writer Privacy, however, is not within the scope of this IPIP and will be handled separately. 
## Detailed design -See the Delegated Routing Reader Privacy Upgrade spec (:cite[http-routing-reader-privacy-v1]) included with this IPIP. +Please refer to the Delegated Routing Reader Privacy Upgrade specification (:cite[http-routing-reader-privacy-v1]) included with this IPIP for detailed design information. ## Design rationale -This API proposal makes the following changes: -- Adds new methods for looking up encrypted Provider Records and encrypted Metadata; -- Defines Hashing and Encryption functions and response payloads structure. +The proposed API makes two key changes: -There are no ideomatic changes to the API - all data formats, design rationale and principles outlined in the original :cite[ipip-0337] apply here. +1. It introduces new methods for looking up encrypted Provider Records and encrypted Metadata. +2. It establishes Hashing and Encryption functions and structures the response payloads. -### User benefit +This proposal does not alter the API's idioms, upholding all data formats, design rationale, and principles established in the original :cite[ipip-0337]. -With the new APIs users can protect themselves from: -- a malicious actor spying on the user by observing the user to Content Router traffic and then downloading the same data; -- the new API is a first step towards fully private HTTP Delegated Routing protocol that will eliminate IPNI as centralised observers. +### User benefit -There are no other functional improvements. +With the proposed APIs, users can protect themselves against malicious actors who might spy on their activities by monitoring their traffic to Content Routers and subsequently downloading identical data. Additionally, this API serves as a first step towards a fully private HTTP Delegated Routing protocol, which would eliminate centralized observers like IPNI routers. ### Compatibility #### Backwards Compatibility -Users will need to explicitly turn on Reader Privacy on their nodes. 
A new flag can be introduced to the Kubo's HTTP Delegated Content Router configuration to facilitate that functionality. -Users on older nodes can continue using the old API and turn on reader Privacy at a alter point. +Users will need to deliberately activate Reader Privacy on their nodes. A new flag could be introduced into IPFS implementations such as Kubo's HTTP Delegated Content Router configuration to streamline this process. Users on older nodes can continue using the existing API and switch on Reader Privacy later. -Content Routers should provide the same QoS for both Privacy Preserving and regular APIs. This is because both can be served over the same encrypted data. If presented with a regular CID, a Content Router -can perform decryption operations on behalf of the user (i.e. mimic the client logic) and return results in clear text. If presented with a second hash the Content Router can return encrypted results and let the -user to do decryption themselves. +Content Routers should maintain the same Quality of Service (QoS) for both Privacy Preserving and regular APIs, as both can be served over the same encrypted data. A shim non-encrypted content router can be implemented to encrypt regular CIDs on the fly, proxy the requests through an encrypted content router and finally decrypt the results before returning them to the user. -It's possible that not all Content Routers will adopt Reader Privacy. The default HTTP Delegated Router like `cid.contact` should have Reader Privacy enabled by default in the newer versions of Kubo / Helia. -Users should verify themselves whether a custom router of their choice supports Reader Privacy or not when configuring it. +It is worth noting that not all Content Routers might adopt Reader Privacy. Default HTTP Delegated Routers like `cid.contact` should have Reader Privacy enabled by default in the latest versions of IPFS implementations such as Kubo and Helia. 
Users should confirm if their chosen custom router supports Reader Privacy when setting it up. -The `/routing/v1/encrypted/` API will be implemented in existing libraries like [`boxo/routing/http`](https://github.com/ipfs/boxo/tree/main/routing/http) and will not introduce any breaking changes to existing clear text endpoints. -The API will be released in a new minor version. +The `/routing/v1/encrypted/` API will be implemented in existing libraries, such as [`boxo/routing/http`](https://github.com/ipfs/boxo/tree/main/routing/http), and will not introduce any breaking changes to existing clear text endpoints. The API will be introduced in a new minor version. -#### Forwards Compatibility +#### Forward Compatibility -Reader Privacy relies on usage of specific hashing and encryption functions. Function rotation will require a network-wide migration. Content Routers might not be able to migrate "under the hood" as they -don't possess the original values. Function rotation should be a very infrequent event and will require network-wide efforts. When function rotation is needed - a version of the API will be incremented. +Reader Privacy relies on the use of specific hashing and encryption functions. Altering these functions would require a network-wide migration. Content Routers might not be able to migrate seamlessly, as they do not possess the original values. Such function rotation should occur infrequently and necessitate network-wide efforts. When function rotation is required, the API version will be incremented. ### Security -See "Threat Modelling" section of :cite[http-routing-reader-privacy-v1] +For details on security, please see the "Threat Modelling" section of :cite[http-routing-reader-privacy-v1]. ### Alternatives -TODO: Describe alternate designs that were considered and related work. +When considering alternatives to this IPIP, two potential scenarios and their corresponding technologies are worth exploring: + +1. Oblivious HTTP (OHTTP) +2. 
Onion Services + +In scenario (1), `/routing/v1` would be implemented behind Oblivious HTTP (OHTTP), a protocol proposed by IETF and Cloudflare. OHTTP separates the information about 'who' is making a request from 'what' they are requesting, thereby preventing routing systems such as IPNI instances from viewing both pieces of information concurrently. This would add an additional layer of privacy by obscuring metadata, such as user behavior patterns, IP addresses, and user-agents. + +Scenario (2) envisages the `/routing/v1` behind Onion Services. Onion Services provide another approach to concealing the origin of requests by routing them through the Tor network, further enhancing user privacy. -- TODO: Oblivious HTTP ([IETF](https://www.ietf.org/archive/id/draft-thomson-http-oblivious-01.html), [Cloudflare](https://blog.cloudflare.com/stronger-than-a-promise-proving-oblivious-http-privacy-properties/)) +These two scenarios and their corresponding technologies aren't mutually exclusive to this IPIP. Instead, they could be viewed as complementary solutions that could be deployed in conjunction with Double Hashed records, as proposed in this IPIP, to create a more comprehensive privacy solution. The Double Hashing technique encrypts the content of the communication, making it opaque to passive observers. Simultaneously, OHTTP and Onion Services could provide additional privacy layers by obfuscating metadata about who is making a request. -## Test fixtures +For more information on OHTTP and Onion Services, please refer to these resources: -TODO: List relevant CIDs or JSON payloads. Describe how implementations can use them to determine -specification compliance. This section can be skipped if IPIP does not deal -with the way IPFS handles content-addressed data, or the modified specification -file already includes this information. 
+- [Oblivious HTTP: IETF](https://www.ietf.org/archive/id/draft-thomson-http-oblivious-01.html) +- [Oblivious HTTP: Cloudflare](https://blog.cloudflare.com/stronger-than-a-promise-proving-oblivious-http-privacy-properties/) +- [Onion Services](https://community.torproject.org/onion-services/) ### Resources -- [IPIP-272 (double hashed DHT)](https://github.com/ipfs/specs/pull/373/) -- [ipni#5 (reader privacy in indexers)](https://github.com/ipni/specs/pull/5) +- [Double-hashed DHT](https://github.com/ipfs/specs/pull/373/) +- [Reader Privacy in Indexers](https://github.com/ipni/specs/pull/5) ### Copyright From 0195260fed24eb9c8aad7d24173960f72764d8b7 Mon Sep 17 00:00:00 2001 From: "Masih H. Derkani" Date: Wed, 12 Jul 2023 17:16:12 +0100 Subject: [PATCH 13/13] Refine routing specification and add byte frame diagrams Refine routing specification and add byte frame diagram to clearly illustrate the content of SALT values. --- src/routing/http-routing-reader-privacy-v1.md | 95 +++++++++++-------- 1 file changed, 56 insertions(+), 39 deletions(-) diff --git a/src/routing/http-routing-reader-privacy-v1.md b/src/routing/http-routing-reader-privacy-v1.md index 1a349f634..ebbc183cc 100644 --- a/src/routing/http-routing-reader-privacy-v1.md +++ b/src/routing/http-routing-reader-privacy-v1.md @@ -1,63 +1,80 @@ --- title: Routing V1 HTTP Delegated Routing Reader Privacy Upgrade description: > - This specification describes Delegated Routing Reader Privacy Upgrade. It's an - incremental improvement to HTTP Delegated Routing API and inherits all of its - formats and design rationale. + This specification outlines the Delegated Routing Reader Privacy Upgrade, representing an incremental enhancement to the HTTP Delegated Routing API. It seamlessly integrates with the existing API, adopting its formats and design principles, to ensure continuity and coherence while offering improved privacy protections. 
date: 2023-05-31 maturity: reliable editors: - name: Andrew Gillis github: gammazero - name: Ivan Schasny - github: ischasny + github: ischasny - name: Masih Derkani github: masih - name: Will Scott github: willscott -order: 0 -tags: ['routing', 'double hashing', 'privacy'] +order: 1 +tags: [ 'routing', 'double hashing', 'privacy' ] --- -This specification describes a new HTTP API for Privacy Preserving Delegated Content Routing provider lookups. It's an extension to HTTP Delegated Routing API and inherits all of its formats and design rationale. +This specification details the implementation of a new HTTP API for Privacy Preserving Delegated Content Routing provider lookups. It represents an expansion of the HTTP Delegated Routing API, embracing its formats and design principles. ## API Specification ### Magic Values - All salts below are 64-bytes long, and represent a string padded with `\x00`. +All salts below are 64-bytes long and represent a string padded with `\x00`. - - `SALT_DOUBLEHASH = bytes("CR_DOUBLEHASH\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` - - `SALT_ENCRYPTIONKEY = bytes("CR_ENCRYPTIONKEY\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` +- `SALT_DOUBLEHASH`: The string value `CR_DOUBLEHASH`, where each of the 13 characters is represented by its byte value. The remainder of the 64 bytes is filled with null bytes represented by `\x00`. This results in 51 null bytes after the `CR_DOUBLEHASH` string. The following illustrates its corresponding byte frame diagram: - Magic values are needed to calculate different digests from the same value for different purposes.
For example a hash of a Multihash that is used for lookups should be different from the one that is used for - key derivation, even though both are calculated from the same original value. In order to do that the Multihash is concatenated with different magic values before applying the hash funciton - `SALT_DOUBLEHASH` - for lookups and `SALT_ENCRYPTIONKEY` for key derivation as described in the `Glossary`. + ``` + +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ + | C | R | _ | D | O | U | B | L | E | H | A | S | H | \x00...\x00 | + +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ + <---------------------------- 64 Bytes ---------------------------> + ``` + For reference, the following snippet represents the hex dump of the above, where each character of `CR_DOUBLEHASH` is represented by its ASCII hexadecimal equivalent, and the null bytes are represented by "00": + + ``` + 43 52 5F 44 4F 55 42 4C 45 48 41 53 48 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 + ``` + +- `SALT_ENCRYPTIONKEY`: The string value `CR_ENCRYPTIONKEY`, where each of the 16 characters is represented by its byte value. The remainder of the 64 bytes is filled with null bytes represented by `\x00`. This results in 48 null bytes after the `CR_ENCRYPTIONKEY` string.
The following illustrates its corresponding byte frame diagram: + + ``` + +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ + | C | R | _ | E | N | C | R | Y | P | T | I | O | N | K | E | Y | + +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ + | \x00...\x00 | + +---+---+---+ + <---------------------------- 64 Bytes ---------------------------> + ``` + For reference, the following snippet represents the hex dump of the above, where each character of `CR_ENCRYPTIONKEY` is represented by its ASCII hexadecimal equivalent, and the null bytes are represented by "00": + + ``` + 43 52 5F 45 4E 43 52 59 50 54 49 4F 4E 4B 45 59 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 + ``` + +These magic values are utilized to compute distinct digests from identical values for varying purposes. For instance, a hash of a Multihash employed for lookups should differ from the one used for key derivation, despite originating from the same value. To achieve this, the Multihash is concatenated with different magic values before applying the hash function: `SALT_DOUBLEHASH` for lookups and `SALT_ENCRYPTIONKEY` for key derivation as elaborated in the `Glossary`. ### Glossary -- **`enc`** is [AESGCM](https://en.wikipedia.org/wiki/Galois/Counter_Mode) encryption. The following notation will be used for the rest of the specification `enc(passphrase, nonce, payload)`. -- **`hash`** is [SHA256](https://en.wikipedia.org/wiki/SHA-2) hashing. -- **`||`** is concatenation of two values. -- **`deriveKey`** is deriving a 32-byte encryption key from a passphrase that is done as `hash(SALT_ENCRYPTIONKEY || passphrase)`. -- **`CID`** is the [Content IDentifier](https://github.com/multiformats/cid). -- **`MH`** is the [Multihash](https://github.com/multiformats/multihash) contained in a `CID`. It corresponds to the -digest of a hash function over some content. 
-- **`HASH2`** is a second hash over the multihash. Second Hashes must be of `Multihash` format with `DBL_SHA_256` codec. -The digest must be calculated as `hash(SALT_DOUBLEHASH || MH)`. +- **`enc`** refers to [AESGCM](https://en.wikipedia.org/wiki/Galois/Counter_Mode) encryption. The notation `enc(passphrase, nonce, payload)` will be used henceforth in this specification. +- **`hash`** denotes [SHA256](https://en.wikipedia.org/wiki/SHA-2) hashing. +- **`||`** signifies concatenation of two values. +- **`deriveKey`** pertains to the derivation of a 32-byte encryption key from a passphrase, performed as `hash(SALT_ENCRYPTIONKEY || passphrase)`. +- **`CID`** stands for [Content IDentifier](https://github.com/multiformats/cid). +- **`MH`** refers to the [Multihash](https://github.com/multiformats/multihash) contained in a `CID`. It corresponds to the hash function's digest over certain content. +- **`HASH2`** is a second hash over the multihash. Second Hashes must follow the `Multihash` format with `SHA2_256` codec. The digest must be calculated as `hash(SALT_DOUBLEHASH || MH)`. - **`ProviderRecord`** is a JSON with Provider Record as described in the [HTTP Delegated Routing Specification](http-routing-v1.md). -- **`ProviderRecordKey`** is a concatentation of `peerID || contextID`. There is no need for explicitly encoding lengths as they are -already encoded as a part of the multihash format. Max `contextID` length is 64 bytes. -- **`EncProviderRecordKey`** is `Nonce || enc(deriveKey(multihash), Nonce, ProviderRecordKey)`. Max `EncProviderRecordKey` is 200 bytes. -- **`HashProviderRecordKey`** is a hash over `ProviderRecordKey` that must be calculated as `hash(SALT_DOUBLEHASH || ProviderRecordKey)`. -- **`Metadata`** is free form bytes that can represent such information such as IPNI metadata. Max `Metadata` length is 1024 bytes. +- **`ProviderRecordKey`** is a concatenation of `peerID || contextID`. 
Explicit encoding lengths are unnecessary as they are inherently encoded as part of the multihash format. Max `contextID` length is 64 bytes. +- **`EncProviderRecordKey`** is `Nonce || enc(deriveKey(multihash), Nonce, ProviderRecordKey)`. Max `EncProviderRecordKey` is 200 bytes. +- **`HashProviderRecordKey`** is a hash over `ProviderRecordKey`, calculated as `hash(SALT_DOUBLEHASH || ProviderRecordKey)`. +- **`Metadata`** are free-form bytes that can represent information such as IPNI metadata. Max `Metadata` length is 1024 bytes. - **`EncMetadata`** is `Nonce || enc(deriveKey(ProviderRecordKey), Nonce, Metadata)`. Max `EncMetadata` length is 2000 bytes. :::note - -Maximum allowed lengths might change without incrementing the API version. Such fields as `contextID` or `Metadata` are free-form bytes and -their maximum lengths can be changed in the underlying protocols. - +Maximum allowed lengths may change without incrementing the API version. Such fields as `contextID` or `Metadata` are free-form bytes and their maximum lengths can be altered in the underlying protocols. ::: ### API @@ -65,7 +82,7 @@ their maximum lengths can be changed in the underlying protocols. ##### Response codes -- `200` (OK): the response body contains 1 or more records +- `200` (OK): the response body contains one or more records - `404` (Not Found): must be returned if no matching records are found - `422` (Unprocessable Entity): request does not conform to schema or semantic constraints @@ -75,20 +92,22 @@ their maximum lengths can be changed in the underlying protocols. { "EncProviderRecordKeys": [ "EBxdYDhd.....", - "IOknr9DK.....", + "IOknr9DK....."
] } + + ``` Where: -- `EncProviderRecordKeys` a list of base64 encoded `EncProviderRecordKey`; +- `EncProviderRecordKeys` is a list of base64 encoded `EncProviderRecordKey`; #### `GET /routing/v1/encrypted/metadata/{HashProviderRecordKey}` ##### Response codes -- `200` (OK): the response body contains 1 record +- `200` (OK): the response body contains one record - `404` (Not Found): must be returned if no matching records are found - `422` (Unprocessable Entity): request does not conform to schema or semantic constraints @@ -102,10 +121,8 @@ Where: Where: -- `EncMetadatas` is base64 encoded `EncMetadata`; +- `EncMetadata` is a base64 encoded `EncMetadata`; ### Notes -Assembling a full `ProviderRecord` from the encrypted data will require multiple roundtrips to the server. The first one to fetch a list of `EncProviderRecordKey`s and then one per -`EncProviderRecordKey` to fetch `EncMetadata`. In order to reduce the number of roundtrips to one the client implementation should use the local libp2p peerstore for multiaddress discovery -and [libp2p multistream select](https://github.com/multiformats/multistream-select) for protocol negotiation. \ No newline at end of file +Assembling a full `ProviderRecord` from the encrypted data requires multiple server roundtrips. The first fetches a list of `EncProviderRecordKey`s, followed by one for each `EncProviderRecordKey` to retrieve `EncMetadata`. To minimize the number of roundtrips to one, the client implementation should use the local libp2p peerstore for multiaddress discovery and [libp2p multistream select](https://github.com/multiformats/multistream-select) for protocol negotiation.
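
The salt and digest definitions from the Magic Values and Glossary sections of the specification can be sketched in a few lines of Python. This is a minimal illustration using only `hashlib`, not a reference implementation. The helper names (`make_salt`, `hash2_digest`, `hash2_multihash`, `derive_key`, `hash_provider_record_key`) are hypothetical. The multihash prefix `0x12 0x20` assumes the standard multicodec code for sha2-256 with a 32-byte digest. Inputs are treated as raw bytes, and AES-GCM encryption is deliberately left out.

```python
import hashlib

SALT_LEN = 64  # all salts are 64 bytes, right-padded with \x00


def make_salt(label: str) -> bytes:
    """Hypothetical helper: ASCII label padded with null bytes to 64 bytes."""
    raw = label.encode("ascii")
    return raw + b"\x00" * (SALT_LEN - len(raw))


SALT_DOUBLEHASH = make_salt("CR_DOUBLEHASH")
SALT_ENCRYPTIONKEY = make_salt("CR_ENCRYPTIONKEY")


def hash2_digest(mh: bytes) -> bytes:
    """hash(SALT_DOUBLEHASH || MH), where hash is SHA-256."""
    return hashlib.sha256(SALT_DOUBLEHASH + mh).digest()


def hash2_multihash(mh: bytes) -> bytes:
    """Wrap the digest as a multihash.

    Assumption: 0x12 is the multicodec code for sha2-256 and 0x20 is the
    digest length (32 bytes).
    """
    return bytes([0x12, 0x20]) + hash2_digest(mh)


def derive_key(passphrase: bytes) -> bytes:
    """deriveKey: hash(SALT_ENCRYPTIONKEY || passphrase), a 32-byte key."""
    return hashlib.sha256(SALT_ENCRYPTIONKEY + passphrase).digest()


def hash_provider_record_key(peer_id: bytes, context_id: bytes) -> bytes:
    """hash(SALT_DOUBLEHASH || ProviderRecordKey), ProviderRecordKey = peerID || contextID."""
    return hashlib.sha256(SALT_DOUBLEHASH + peer_id + context_id).digest()
```

Because the two salts differ, the lookup digest and the encryption key derived from the same multihash are unrelated values, which is the purpose of the magic values: a router that only sees `HASH2` cannot derive the key needed to decrypt the records.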