-
Notifications
You must be signed in to change notification settings - Fork 78
/
Copy pathspec.bs
283 lines (205 loc) · 19.8 KB
/
spec.bs
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
<pre class='metadata'>
Title: User Agent Interaction with Related Website Sets
Shortname: first-party-sets
Level: None
Status: w3c/CG-DRAFT
Group: WICG
URL: https://wicg.github.io/first-party-sets/
Editor: Chris Fredrickson, Google https://google.com, [email protected]
Editor: Kaustubha Govind, Google https://google.com, [email protected]
Editor: Johann Hofmann, Google https://google.com, [email protected]
Abstract: How user agents should integrate with Related Website Sets, a mechanism to declare a collection of related domains as being in a Related Website Set.
Markup Shorthands: markdown yes
Default Biblio Display: inline
</pre>
<pre class=link-defaults>
spec:webidl; type:dfn; text:resolve
</pre>
<pre class="anchors">
spec: PSL; urlPrefix: https://publicsuffix.org/list/
type: dfn
text: registered domain; url: #
spec: clear-site-data; urlPrefix: https://www.w3.org/TR/clear-site-data/#
type: dfn
text: clear cache for origin; url: abstract-opdef-clear-cache-for-origin
type: dfn
text: clear dom-accessible storage for origin; url: abstract-opdef-clear-dom-accessible-storage-for-origin
type: dfn
text: clear cookies for origin; url: abstract-opdef-clear-cookies-for-origin
spec: storage-access; urlPrex: https://privacycg.github.io/storage-access/#
type: dfn
text: determine the storage access policy; url: determine-the-storage-access-policy
<!-- Export PR is https://github.com/w3c/permissions/pull/407, but note that we will likely end up not using the permissions task source directly, see https://github.com/privacycg/storage-access/issues/144 -->
urlPrefix: https://w3c.github.io/permissions/; spec: permissions
text: permissions task source; url: #permissions-task-source; type: dfn
</pre>
<pre class="biblio">
{
"SUBMISSION-GUIDELINES": {
"href": "https://github.com/GoogleChrome/related-website-sets/blob/main/RWS-Submission_Guidelines.md",
"title": "Related Website Sets Submission Guidelines",
"publisher": "Google Chrome"
},
"RWS-LIST": {
"href": "https://github.com/GoogleChrome/first-party-sets/blob/main/first_party_sets.JSON",
"title": "Related Website Sets list",
"publisher": "Google Chrome"
},
"STORAGE-ACCESS": {
"href": "https://privacycg.github.io/storage-access/",
"title": "Storage Access API",
"status": "CG Draft",
"deliveredBy": [
"https://www.w3.org/community/privacycg/"
]
},
"CSRF": {
"href": "https://owasp.org/www-community/attacks/csrf",
"title": "Cross Site Request Forgery (CSRF)"
},
"CACHE-PROBING": {
"href": "https://xsleaks.dev/docs/attacks/cache-probing/",
"title": "Cache Probing"
}
}
</pre>
<h2 id="intro">Introduction</h2>
<em>This section is non-normative.</em>
Related Website Sets (“RWS”) provides a framework for developers to declare relationships among sites, to enable limited cross-site cookie access for specific, user-facing purposes. This is facilitated through the use of the [[STORAGE-ACCESS]].
This document defines how user agents should integrate with the [[RWS-LIST]]. For a canonical reference of the structure of the RWS list and technical validations that are run at time of submission, please see the [[SUBMISSION-GUIDELINES]].
<h2 id="infra">Infrastructure</h2>
This specification depends on the Infra standard. [[!INFRA index]]
<h2 id="list-consumption">List consumption</h2>
User agents should consume the canonical [[RWS-LIST]] on a regular basis (e.g., every 2 weeks) and ship it to individual clients (e.g. a browser application) as an updateable component.
ISSUE(wicg/first-party-sets#122): Can we make a recommendation for an update interval here?
Individual clients must [=build the list of related website sets=] on restart, or on start-up, if newly downloaded. Clients must not re-build the list at any other point in time.
The RWS list is a [=UTF-8=] encoded file containing contents parseable as a JSON object, conforming to the [[JSON-SCHEMA index]] described in the [[SUBMISSION-GUIDELINES]].
Note: Conformance to the schema is validated at submission time. Hence, it is not required for the user agent to validate conformance again on the client. The algorithms in this specification describe how user agents should parse the RWS list, and when a particular set should be considered valid from the client’s perspective.
ISSUE(wicg/first-party-sets#125): Client-side validation may be needed in cases where the [[PSL]] version differs between the server and client.
To <dfn>build the list of related website sets</dfn> from a JSON [=byte sequence=] |bytes|, the user agent should run the following steps:
1. Let |json| be the result of [=parse JSON bytes to an infra value|parsing JSON bytes to an infra value=] with |bytes|.
2. If |json| is a parsing exception, or if |json| is not an [=ordered map=], or if |json|[“sets”] does not exist, return and optionally retry fetching the list, or perform other error recovery tasks.
3. For each |entry| of |json|:
1. Let |set| be a [=related website set=].
2. If |entry|[“primary”] does not exist, continue.
ISSUE(wicg/first-party-sets#126): The specification currently suggests skipping invalid sets (missing primary entries) instead of rejecting the entire list. However, there may be benefit in a full rejection given that the server is expected to hold valid information at all times.
3. Set |set|’s [=related website set/primary=] to |entry|[“primary”].
4. Let |ccTLDs| be the result of [=parse an equivalence map|parsing an equivalence map=] from |entry|[“ccTLDs”]. If |ccTLDs| is failure, continue.
5. Set |set|’s ccTLDs to |ccTLDs|.
6. Let |serviceSites| be the result of [=parse a subset|parsing a subset=] from |entry|[“serviceSites”]. If the result is failure, continue.
7. Set |set|’s [=related website set/serviceSites=] to |serviceSites|.
8. Let |associatedSites| be the result of [=parse a subset|parsing a subset=] from |entry|[“associatedSites”]. If the result is failure, continue.
9. Set |set|’s [=related website set/associatedSites=] to |associatedSites|.
10. Add |set| to the user agent’s [=list of related website sets=].
User agents may opt to pre-process the list into a different format before delivery to the client, e.g. for optimization reasons, as long as they ensure that the client will eventually hold a valid [=list of related website sets=] as defined in this specification.
<h2 id="data-structures">Data Structures</h2>
The user agent maintains a global <dfn export>list of related website sets</dfn>, which is a [=/list=] of [=related website sets=].
A <dfn export>related website set</dfn> is a [=/struct=] with the following items:
<dfn for="related website set">primary</dfn>: A [=site=] that represents the set’s primary domain.
<dfn for="related website set">ccTLDs</dfn>: An [=equivalence map=], representing the set’s equivalent country-code top level domains that were specified by the submitter.
<dfn for="related website set">associatedSites</dfn>: A [=/list=] of [=sites=] in the associated subset.
<dfn for="related website set">serviceSites</dfn>: A [=/list=] of [=sites=] in the service subset.
Note: For additional context on the meaning of these fields please refer to the [[SUBMISSION-GUIDELINES]].
An <dfn>equivalence map</dfn> is an [=ordered map=] from [=/sites=] to [=/lists=] of [=sites=].
To <dfn>parse and validate a site</dfn> from a [=string=] |input|, run the following steps:
1. Let |url| be the result of [=basic url parser|basic URL parsing=] |input|. If the result is failure, return failure.
2. If |url|’s [=url/scheme=] is not "`https`", return failure.
3. Let |site| be the result of [=obtaining a site=] from |url|’s [=url/origin=].
4. Return |site|.
To <dfn>parse a subset</dfn> from a [=/list=] |input|, run the following steps:
1. Let |list| be an empty [=/list=].
2. For each |item| of |input|:
1. Let |site| be the result of [=parse and validate a site|parsing and validating a site=] from |item|.
2. If |site| is failure, return failure.
3. Add |site| to |list|.
3. Return |list|.
To <dfn>parse an equivalence map</dfn> from an ordered map |input|, run the following steps:
4. Let |map| be an empty [=equivalence map=].
5. For each |key| → |value| of |input|:
4. Let |keySite| be the result of [=parse and validate a site|parsing and validating a site=] from |key|. If the result is failure, return failure.
5. Let |equivalents| be an empty list.
6. For each |equivalent| in |value|:
1. Let |equivalentSite| be the result of [=parse and validate a site|parsing and validating a site=] from |equivalent|. If the result is failure, return failure.
2. Add |equivalentSite| to |equivalents|.
7. Set |map|[|keySite|] to |equivalents|.
6. Return |map|.
<h2 id="validating-inclusion">Validating related website set inclusion</h2>
Under RWS, a [=site=] |site1| is considered <dfn>equivalent to</dfn> another [=site=] |site2| given an [=equivalence map=] |equivalents|, if |equivalents|[|site1|] contains |site2| or |equivalents|[|site2|] contains |site1|.
ISSUE(wicg/first-party-sets#123): Should this be renamed to avoid being confused with a mathematical equivalence relation?
To <dfn export>determine the member type</dfn> of a given [=site=] |site| in a given [=related website set=] |set|, run the following steps:
1. If |site| is [=equivalent to=] |set|’s [=related website set/primary=] given |set|’s [=related website set/ccTLDs=], return “primary”.
2. For each |associatedSite| of |set|'s [=related website set/associatedSites=]:
1. If |site| is [=equivalent to=] |associatedSite| given |set|’s [=related website set/ccTLDs=], return “associated”.
3. For each |serviceSite| of |set|'s [=related website set/serviceSites=]:
2. If |site| is [=equivalent to=] |serviceSite| given |set|’s [=related website set/ccTLDs=], return “service”.
4. Return “none”.
To <dfn export>find a related website set</dfn> for a given [=site=] |site|, run the following steps:
1. For each |set| of the user agent’s [=list of related website sets=]:
1. Let |type| be the [=determine the member type|member type=] of |site| in |set|.
2. If |type| is not “none”, return |set|.
2. Return null.
Note: The [[SUBMISSION-GUIDELINES]] require that each site can only appear in at most one Related Website set, which is validated at submission time. For this reason, user agents do not need to be concerned with the order of the list of related website sets when performing these steps.
Define the <dfn>limit for associated sites</dfn> within a single [=related website set=] to be an [=implementation-defined=] value, which is recommended to be 3.
Note: This limit is used when [=determine eligibility for an associated site|determining eligibility for an associated site=] to only consider the sites listed at the top of the associated subset. It is meant to discourage abuse and help users and user agents understand why a particular related website set needs to exist. User agents may choose a different number based on this goal.
A [=site=] |embeddedSite| is <dfn export>eligible for same-party membership when embedded within</dfn> a [=site=] |topLevelSite|, if the following steps return true:
1. Let |set| be the result of [=find a related website set|finding a related website set=] for |topLevelSite|.
2. If |set| is null, return false.
3. Let |topLevelType| be the [=determine the member type|member type=] of |topLevelSite| in |set|.
4. If |topLevelType| is “associated” and the result of [=determine eligibility for an associated site|determining eligibility for an associated site=] given |topLevelSite| and |set| is false, return false.
5. If |topLevelType| is “service”, return false.
6. Let |type| be the [=determine the member type|member type=] of |embeddedSite| in |set|.
7. If |type| is “none”, return false.
8. If |type| is “associated”, return the result of [=determine eligibility for an associated site|determining eligibility for an associated site=] given |embeddedSite| and |set|.
9. Return true.
To <dfn>determine eligibility for an associated site</dfn> given a [=site=] |site| and a [=related website set=] |set|, run the following steps:
1. If |set|’s [=related website set/associatedSites=] does not contain |site|, return false.
2. Let |index| be the index of |site| in |set|’s [=related website set/associatedSites=].
3. If |index| is greater than or equal to the [=limit for associated sites=], return false.
4. Return true.
A given [=environment settings object=] |settings| <dfn>is same-party with its top-level embedder</dfn>, if the following steps return true:
1. Let |topLevelSite| be the result of [=obtain a site|obtaining a site=] from |settings|' [=environment/top-level origin=].
1. Let |embeddedSite| be the result of [=obtain a site|obtaining a site=] from |settings|' [=environment settings object/origin=].
1. Return whether |embeddedSite| is [=eligible for same-party membership when embedded within=] |topLevelSite|.
A given [=environment settings object=] |settings| and [=/origin=] |origin| <dfn>are same-party in an embedding context</dfn>, if the following steps return true:
1. Let |topLevelSite| be the result of [=obtain a site|obtaining a site=] from |settings|' [=environment/top-level origin=].
1. Let |embeddedSite| be the result of [=obtain a site|obtaining a site=] from |origin|.
1. Return whether |embeddedSite| is [=eligible for same-party membership when embedded within=] |topLevelSite|.
<h2 id="storage-access-integration">Integration with the Storage Access API</h2>
Modify {{Document/requestStorageAccess()}} to insert the following steps before step 13.5 (i.e. before [=requesting permission to use=]):
1. Let |settings| be <var ignore>doc</var>'s [=relevant settings object=].
1. If |settings| [=is same-party with its top-level embedder=], the user agent may run <var ignore>process permission state</var> with [=permission/granted=] and abort the remaining steps.
Modify {{Document/requestStorageAccessFor(requestedOrigin)}} to insert the following steps before step 13.8 (i.e. before [=requesting permission to use=]):
1. Let |settings| be <var ignore>doc</var>'s [=relevant settings object=].
1. If |settings| and <var ignore>requestedOrigin</var> [=are same-party in an embedding context=], the user agent may [=queue a global task=] on the [=permissions task source=] given <var ignore>global</var> to [=resolve=] <var ignore>p</var> and abort the remaining steps.
<h2 id="handling-changes">Handling related website set changes</h2>
When a [=site=] |site| leaves a [=related website set=] as the result of building a new [=list of related website sets=], user agents must ensure that it does not retain any access to data or shared identifiers held by other sites in the related website set by running the following steps:
1. Assert that |site| is not an [=opaque origin=].
2. Let |domain| be site’s [=origin/host=].
3. For each |origin| known to the user agent whose [=origin/host=]'s [=registered domain=] is |domain|:
1. [=Clear cache for origin=].
2. [=Clear cookies for origin=].
3. [=Clear DOM-accessible storage for origin=].
4. Let |descriptor| be a newly-created {{PermissionDescriptor}} with {{PermissionDescriptor/name}} initialized to “storage-access”.
5. [=Remove a permission store entry|Remove all permission store entries=] for |descriptor|, where key[0] is |site| or key[1] is |origin|.
6. Run additional [=implementation-defined=] steps to ensure that any web-accessible storage is removed from |origin|.
ISSUE(wicg/first-party-sets#124): This section should provide more details on how user agents can figure out when a site leaves an RWS.
<h2 id="privacy-consideration">Privacy Considerations</h2>
<h3 id="provide-transparency">Provide user transparency and control</h3>
A user agent that uses RWS to infer the relationship between two sites should ensure that its users are informed about this user agent choice and give users the opportunity to view and control choices made by the user agent.
<h3 id="ensure-compatibility">Ensure compatibility with non-RWS environments</h3>
Some user agents may choose not to support RWS in specific environments (such as Private Browsing Modes), or at all. All user agents and specifications should be mindful of this in their own API integrations and aim to gracefully fall back to a working solution for users and developers.
For providing access to cross-site cookies, this specification aims to ensure compatibility with non-RWS environments through usage of the [[STORAGE-ACCESS]], which provides developers an interface to handle rejections to the request and gives user agents flexibility to employ mechanisms such as prompts or heuristics as an alternative to RWS.
<h3 id="prevent-leaks">Prevent privacy leaks from list changes</h3>
Developers may submit changes to their sets to add or remove sites. Since membership in a set could provide access to cross-site cookies via automatic grants of the [[STORAGE-ACCESS]], we need to pay attention to these transitions so that they don’t link user identities across all the RWSs they’ve historically been in. In particular, we must ensure that a domain cannot transfer a user identifier from one Related Website Set to another when it changes its set membership. While a set member may not always request and be granted access to cross-site cookies, for the sake of simplicity of handling set transitions, we propose to treat such access as always granted.
For this reason, this specification requires user agents to clear any site data and storage-access permissions of a given site when a site is removed from a set, before starting any fetches that rely on those permissions or site data.
Note: Most fetches do not depend on data that needs to be cleared, so user agents are advised to optimize for request latency.
<h2 id="security-considerations">Security Considerations</h2>
<h3 id="avoid-weakening-boundaries">Avoid weakening new and existing security boundaries</h3>
Changes to the web platform that tighten boundaries for increased privacy often have positive effects on security as well. For example, cache partitioning restricts [[CACHE-PROBING]] attacks and third-party cookie blocking makes it much harder to perform [[CSRF]] by default. Where user agents intend to use Related Website Sets to replace or extend existing boundaries based on site or origin on the web, it is important to consider not only the effects on privacy, but also on security.
Sites in a common RWS may have greatly varying security requirements, for example, a set could contain a site storing user credentials and another hosting untrusted user data. Even within the same set, sites still rely on cross-site and cross-origin restrictions to stay in control of data exposure. Within reason, it should not be possible for a compromised site in an RWS to affect the integrity of other sites in the set.
This consideration will always involve a necessary trade-off between gains like performance or interoperability and risks for users and sites. User agents should facilitate additional mechanisms such as a per-origin opt-in or opt-out to manage this trade-off.
<h2 id="acknowledgements" class="no-num">Acknowledgements</h2>
Other members of the W3C Privacy Community Group had previously suggested the use of Storage Access API, or an equivalent API; in place of SameParty cookies. Thanks to @jdcauley ([1](https://github.com/WICG/first-party-sets/issues/14#issuecomment-785144990)), @arthuredelstein ([2](https://github.com/WICG/first-party-sets/issues/42)), and @johnwilander ([3](https://lists.w3.org/Archives/Public/public-privacycg/2022Jun/0001.html)).
Browser vendors, web developers, and members of the web community provided valuable feedback during this proposal's incubation in the W3C Privacy Community Group.
This proposal includes significant contributions from previous co-editors, David Benjamin, and Harneet Sidhana.
We are also grateful for contributions from Chris Fredrickson and Shuran Huang.