Require CORS for HLS and DASH media formats #6468

annevk · 2021-03-09T13:02:35Z

We've had a bit of a discussion exploring manifest-based media formats in annevk/orb#23. In particular, both HLS and DASH are resources that effectively list a bunch of other resources to fetch that when composed result in media. I think the effective conclusions of that thread are as follows:

If the media element gets an "opaque" response that's HLS or DASH, treat it as a network error. This shouldn't really be possible with (C)ORB so maybe it can be an assert, except that https://tools.ietf.org/html/rfc8216#section-4 suggests HLS might do path extension matching(!). So I think it's possible for a resource to somehow make its way through https://github.com/annevk/orb and get identified as HLS, if implementations indeed do such a thing. There's such a path for DASH as well, but I think we should add application/dash+xml to the (C)ORB blocklist and if media elements require a MIME type for DASH it should then not be possible to get such an "opaque" response.
Furthermore, we should require that any fetches that HLS and DASH resources want the user agent to make use "cors" as request mode.
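A minimal sketch of the two conclusions above, written as plain decision functions. This is not spec text: the `Response` record, the `detected_format` field, and both function names are hypothetical stand-ins for the corresponding Fetch and media element concepts.

```python
from dataclasses import dataclass

# Formats that list further resources to fetch (per the discussion above).
MANIFEST_FORMATS = {"hls", "dash"}


@dataclass
class Response:
    type: str             # Fetch response type, e.g. "basic", "cors", "opaque"
    detected_format: str  # what the media element identified, e.g. "hls", "mp4"


def media_element_accepts(response: Response) -> bool:
    """Return False (treat as network error) for an opaque HLS/DASH manifest."""
    if response.type == "opaque" and response.detected_format in MANIFEST_FORMATS:
        return False
    return True


def request_mode_for_subresource(parent_format: str, element_default_mode: str) -> str:
    """Fetches made on behalf of an HLS/DASH manifest always use "cors"."""
    if parent_format in MANIFEST_FORMATS:
        return "cors"
    return element_default_mode
```

Under this sketch an opaque MP4 response is still accepted, which matches the existing no-cors behavior for ordinary media; only manifest-based formats are tightened.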

cc @whatwg/media @acolwell


sandersdan · 2021-10-20T20:38:41Z

Just some quick notes on Chromium Android's current HLS implementation for <video>:

We fetch the manifest (following redirects) once before handing off to the platform player. The mode depends on the crossorigin attribute but typically will be no-cors.
The platform player does its own content detection which does involve matching path components (not just extensions!). This is best (as in most safely) treated as an additional restriction rather than the primary detection method.

annevk · 2021-10-21T05:45:02Z

@sandersdan I was led to believe there is a same-origin restriction on Android due to annevk/orb#23 (comment) by @acolwell. Is that not correct?

sandersdan · 2021-10-21T18:12:44Z

It is not correct. We set android-allow-cross-domain-redirect: 0 on the platform player, but this is after fetching and following redirects the first time. There is no origin restriction on the initial manifest fetch.

annevk · 2021-10-22T05:37:10Z

So what does that mean exactly? Say the document is on origin A and the manifest is on origin B. Does that work? What origin can the resources listed in the manifest be from?

sandersdan · 2021-10-25T18:38:25Z

That should work, and the resources must be on origin B.
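The restriction described here (manifest may be cross-origin to the document, but listed resources must share the manifest's origin) can be sketched as an origin comparison. This is only an illustration of the rule as stated in this comment, not Chromium's code; the function names and the example URLs are made up.

```python
from urllib.parse import urljoin, urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return a (scheme, host, port) tuple, filling in default ports."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def segment_allowed(manifest_url: str, segment_ref: str) -> bool:
    """True when the (possibly relative) reference stays on the manifest's origin."""
    resolved = urljoin(manifest_url, segment_ref)
    return origin(resolved) == origin(manifest_url)
```

Note that the document's origin does not appear anywhere in the check, which is why a document on origin A can play a manifest hosted on origin B.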

It's worth noting explicitly that this is an incomplete implementation. A feature-complete HLS implementation would presumably support the full set of cross-origin media features and behaviors.

annevk · 2021-10-27T13:24:02Z

Okay, @anforowicz's https://crrev.com/c/3202583 has more context.

As you said this is a problem for Android/iOS (and for Safari on macOS) as per https://en.wikipedia.org/wiki/HTTP_Live_Streaming#Clients.

The same-origin-with-the-HLS-manifest limitation makes this hard to abuse for attackers. They would have to inject an HLS manifest into the origin of which they want to attack resources. Given the extremely lax resource type checking this is made easier, but it's still non-trivial.

For https://github.com/annevk/orb this presents a challenge. Thoughts:

We cannot blocklist "application/vnd.apple.mpegurl" or "audio/mpegurl". I.e., part of Block manifest-based media and WebVTT annevk/orb#26 needs to be reverted.
The ORB algorithm would have to be able to both detect and parse HLS in order to obtain the URLs the media element will want to fetch. It will need to allow responses to requests for those URLs to bypass ORB (similar to how certain range responses get a pass). It's not clear to me how well HLS detection and parsing are defined. (In the case where implementations hand off HLS to something else that they blindly trust with parsing and fetching we would only have to be able to detect HLS. It sounds like this might be the case in current HLS implementations of Chrome and Firefox at least.)

Both of these we could make conditional upon HLS support to reduce the attack surface when it's not supported.

@anforowicz @sandersdan what do you think?

sandersdan · 2021-10-27T19:02:26Z

A nit: there is also application/x-mpegurl and audio/x-mpegurl.

Parsing the manifests seems a bit extreme to me. The manifest resources are going to be one of (A) more manifests, (B) MPEG2 TS, or (C) MP4. I expect all of these would be treated as media resources and therefore allowed with no special handling.

I don't know offhand whether there are any issues with inaccurate content types on these resources.

Making the logic conditional on UA support seems reasonable. From the above list, Chromium Android should be able to block MPEG TS since it's not supported by the current <video> implementation.

anforowicz · 2021-10-27T20:24:52Z

I agree with #6468 (comment) that ORB cannot blocklist HLS MIME types. I think we should go even one step further and allowlist these MIME types (similarly to how image/svg+xml is explicitly included in the list of "opaque-safelisted MIME types").

I am not sure if we need to worry about requests (including range requests) for actual video resources covered by the manifest. AFAIU, these subsequent/non-manifest requests do not go through Chrome's network stack (they are handled by an external software component on Android + they need to be implemented via polyfill / fetch(...) on desktop). Therefore blocking such subsequent/non-manifest requests in ORB sounds okay (this would effectively require CORS for these requests, but AFAIU this requirement is already present for Chrome on desktop). In theory ORB could avoid blocking these kinds of requests because ORB can sniff and recognize video resources, but this doesn't work because (I assume) the resources behind HLS manifests can be consumed from the middle (via range requests), making them challenging to sniff.

@sandersdan, do you think it would be okay if:

HLS manifests would be allowed by ORB
Requests for videos referenced from HLS manifests would be blocked by ORB (i.e. blocked in no-cors mode; ORB does not apply to CORS requests).

sandersdan · 2021-10-27T20:47:19Z

That works fine for Chromium Android's current implementation. I can't guarantee that Chromium Android will never include its own HLS player implementation, though.

jernoble · 2021-10-27T22:03:53Z

FWIW,

Parsing the manifests seems a bit extreme to me. The manifest resources are going to be one of (A) more manifests, (B) MPEG2 TS, or (C) MP4.

(D) WebVTT files: https://datatracker.ietf.org/doc/html/draft-pantos-hls-rfc8216bis#section-3.1.4

anforowicz · 2021-10-27T22:09:19Z

Stepping back, the main problem is that ORB wants to avoid blocking video resources (which are allowed in no-cors mode), but might have trouble identifying if a given range request response contains a video fragment.

For non-HLS video, ORB assumes that the very first request will fetch the start (i.e. first bytes) of a video resource - this allows ORB to recognize the response as video (based on audio or video type pattern sniffing) and remember the URL (allowing future range requests). This approach was initially proposed in whatwg/fetch#721 (comment)
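The sniff-then-remember idea for non-HLS video can be sketched roughly as below. This is a simplification that leaves out the rest of the ORB algorithm; the `OpaqueMediaTracker` class is hypothetical, and the `sniff` callback stands in for the audio or video type pattern matching algorithm.

```python
from typing import Callable, Set


class OpaqueMediaTracker:
    """Remembers URLs whose first (start-of-resource) response sniffed as media."""

    def __init__(self, sniff: Callable[[bytes], bool]) -> None:
        self._sniff = sniff
        self._known_media: Set[str] = set()

    def allow(self, url: str, range_start: int, body_prefix: bytes) -> bool:
        if range_start == 0:
            # Start of the resource: sniffable, so decide now and remember.
            if self._sniff(body_prefix):
                self._known_media.add(url)
                return True
            return False
        # Mid-resource response: not sniffable, so rely on an earlier verdict.
        return url in self._known_media
```

The weakness discussed for HLS shows up directly: if the very first request for a URL has `range_start > 0`, the tracker has no earlier verdict and blocks the response.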

For HLS video, it seems that the very first request might already be for the middle of a video request (and when working on ORB we have so far assumed that such middle-of-a-request responses cannot be sniffed):

One way to ensure that ORB doesn't block such responses is to parse the HLS manifest and allowlist the URLs of the referenced video resources (i.e. having ORB mark these URLs as "surely this is a video or webvtt").
Another way might be to sniff - @sandersdan has pointed out in an offline conversation that maybe the very first request (even if it is a range request) might also be sniffable, because HLS is structured with segments, each fetched from start to end, and they may be easy to detect from the first few bytes.
- I don't know if this kind of sniffing would be covered by the audio or video sniffing spec referenced by ORB. Sniffing of WebVTT is definitely not covered, but the link from Require CORS for HLS and DASH media formats #6468 (comment) says that each WebVTT Segment must either start with a WebVTT header or have an EXT-X-MAP tag applied to it (making range request responses sniffable?).
- I don't know how robust this approach would be in practice (i.e. if it would be reasonable that all range requests align with the start of a segment).

annevk · 2021-10-28T10:55:11Z

@anforowicz while it works for Chromium, I'm not entirely sure it works for a standard. I suppose we could state something to the effect that the HLS manifest better be handled by a separate process that can do arbitrary network fetches, but only return a media stream, but it seems somewhat sketchy. And if arbitrary byte sequences end up decoding as media this might be a way to smuggle data into the attacker process. Maybe it's good enough for v1 though.

Anyway, I agree that the initial problem is determining whether something is HLS. I think it would make sense to cover it as part of https://mimesniff.spec.whatwg.org/#audio-or-video-type-pattern-matching-algorithm as the media element implementation needs it as well. If you or @sandersdan or @jernoble could work on that, that would be great.

anforowicz · 2021-11-01T15:23:48Z

One other alternative is to ask ORB to avoid reading/parsing/sniffing HLS/DASH responses (the manifest fetch + the subsequent video / captions fetches) altogether + require such responses to use accurate/strict Content-Type response headers (allowlisting relevant MIME types in ORB via annevk/orb#20 and/or annevk/orb#3 - I think that this would include application/dash+xml [and other non video/*-prefixed manifest types], text/vtt, and the MIME type of the actual videos [hopefully all prefixed with video/*])
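A rough sketch of what such a Content-Type-only check could look like. The set of manifest types is collected from the MIME types named elsewhere in this thread; the function names are hypothetical, and whether real-world servers send accurate enough headers for this to work is exactly the open question.

```python
from typing import Optional

# MIME types mentioned in this thread for manifests and captions.
MANIFEST_AND_CAPTION_TYPES = {
    "application/vnd.apple.mpegurl",
    "application/x-mpegurl",
    "audio/mpegurl",
    "audio/x-mpegurl",
    "application/dash+xml",
    "text/vtt",
}


def mime_essence(content_type: str) -> str:
    """Strip parameters such as charset and normalize case."""
    return content_type.split(";", 1)[0].strip().lower()


def allowed_without_sniffing(content_type: Optional[str]) -> bool:
    """Allow a response purely on its declared type, never reading the body."""
    if not content_type:
        return False
    essence = mime_essence(content_type)
    return essence in MANIFEST_AND_CAPTION_TYPES or essence.startswith("video/")
```

A response with a missing or generic Content-Type is rejected here, which is where web compatibility would be at risk.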

annevk · 2021-11-01T16:01:37Z

Given that media manifests are currently recognized by file extensions(!) and such as I understand it I somewhat doubt that would be web compatible, but I would obviously be in favor of it.

sandersdan · 2021-11-01T18:13:19Z

Given that media manifests are currently recognized by file extensions(!) and such as I understand it I somewhat doubt that would be web compatible

I share your suspicions about this; I'd be interested in eventually running an experiment to check whether Content-Type might be reliable enough.

maybe the very first request (even if it is a range request) might also be sniffable

To add some detail here, there is an EXT-X-BYTERANGE option in M3U8 that specifies request ranges for segments and is in common use.
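For reference, RFC 8216 defines the tag as `#EXT-X-BYTERANGE:<n>[@<o>]`, where `n` is the sub-range length and `o` the optional offset from the start of the resource. A minimal sketch of turning such a line into an HTTP Range header follows; the helper names are hypothetical, and `previous_end` (the byte following the previous segment's sub-range) is something the caller is assumed to track.

```python
from typing import Optional, Tuple

PREFIX = "#EXT-X-BYTERANGE:"


def parse_byterange(line: str, previous_end: Optional[int] = None) -> Tuple[int, int]:
    """Return (offset, length) for an `#EXT-X-BYTERANGE:<n>[@<o>]` line."""
    if not line.startswith(PREFIX):
        raise ValueError("not an EXT-X-BYTERANGE tag")
    length_text, _, offset_text = line[len(PREFIX):].strip().partition("@")
    length = int(length_text)
    if offset_text:
        offset = int(offset_text)
    elif previous_end is not None:
        # No explicit offset: continue right after the previous sub-range.
        offset = previous_end
    else:
        raise ValueError("offset omitted and no previous sub-range known")
    return offset, length


def to_range_header(offset: int, length: int) -> str:
    """HTTP byte ranges are inclusive on both ends."""
    return f"bytes={offset}-{offset + length - 1}"
```

Any segment with a non-zero offset produces a response that starts mid-resource, which is the case that makes first-byte sniffing unreliable.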

I think it would make sense to cover it as part of https://mimesniff.spec.whatwg.org/#audio-or-video-type-pattern-matching-algorithm as the media element implementation needs it as well.

I'm not currently in a position to allocate time to this effort, but some quick observations:

M3U8 files are text and start with #EXTM3U
WebVTT files are text and start with WEBVTT
MPEG2 TS segments are binary, but they do start with a constant sync byte (0x47). It's not a lot to go on but perhaps there are a few more bits we can rely on, potentially in the payload part since the codec selection is limited.
fMP4 segments are binary, but the ISOBMFF structure provides an immediate box header that may be sniffable. The issue is that there are many box types that could be valid, although moof is very likely at the start of an fMP4. In most cases I would expect the initialization segment to be retrieved from the same URL (earlier offset), in which case it is likely to contain an ftyp box and could be detected the same way as ordinary MP4 media.
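The four observations above could be combined into a first-bytes check along these lines. This is only a sketch, not a proposal for the mimesniff algorithm: the function name and return labels are made up, and the extra check for a second sync byte 188 bytes in (the standard MPEG-2 TS packet size) is an assumption added here, not something from the thread.

```python
from typing import Optional

TS_SYNC_BYTE = 0x47
TS_PACKET_SIZE = 188  # assumption: standard MPEG-2 TS packet length


def sniff_hls_related(prefix: bytes) -> Optional[str]:
    """Guess the resource kind from its leading bytes, or None if unknown."""
    if prefix.startswith(b"#EXTM3U"):
        return "m3u8"
    if prefix.startswith(b"WEBVTT"):
        return "webvtt"
    # ISOBMFF: 4-byte box size followed by a 4-byte box type.
    if len(prefix) >= 8:
        box_type = prefix[4:8]
        if box_type == b"ftyp":
            return "mp4"
        if box_type == b"moof":
            return "fmp4-fragment"
    if prefix[:1] == bytes([TS_SYNC_BYTE]):
        # A single byte is weak evidence; if enough data is available,
        # also require the sync byte at the start of the second packet.
        if len(prefix) > TS_PACKET_SIZE and prefix[TS_PACKET_SIZE] != TS_SYNC_BYTE:
            return None
        return "mpeg2-ts"
    return None
```

With a short prefix the TS branch still accepts on a single byte, which illustrates why "it's not a lot to go on".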

sandersdan · 2022-04-06T20:53:29Z

Just wanted to post an update based on Chrome's now in-progress HLS experiment. We still expect that manifest files will be best identified by the magic string above (#EXTM3U) as the Content-Type is not required to be accurate by the HLS specification.

We are unsure about detecting media files by sniffing, since HLS allows for sub-ranges to be requested. We're okay with using Content-Type alone to allow media files and will re-evaluate that based on in-the-field metrics. Even a relatively small fraction of blocked requests could be a problem for us on Android, but on Desktop HLS is currently unsupported so there is potentially more flexibility there.

@willcassella is handling the implementation in Chrome.

annevk · 2022-04-07T07:37:01Z

Can an HLS resource point to resources on another origin from the origin of the HLS resource itself? At least theoretically the ORB filter could, upon seeing an HLS resource, safelist the URLs mentioned inside it. (It would have to be able to parse HLS for that though.) But if those URLs can point to other origins, the HLS resource could be used for attacks.

Safelisting MIME types might be another approach, but if the HLS resource doesn't have one, why would the ranged media resources have one?

willcassella · 2022-04-07T20:10:51Z

It's totally possible (and I think likely, with interstitial ads) that an HLS playlist could refer to content on other origins.

Building an HLS url-extractor on top of the low-level bits I've already written wouldn't be too difficult, but it still wouldn't be totally straightforward since it'll need to be able to perform variable substitution and parse attribute-lists on tags in order to accurately extract URLs.
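To give a feel for the shape of such an extractor, here is a deliberately naive sketch. It is not the Chrome code referred to above: it skips variable substitution entirely, and it uses a regular expression for `URI="..."` attributes instead of a real attribute-list parser, so it would mis-handle some valid playlists. The function name and example URLs are hypothetical.

```python
import re
from typing import List
from urllib.parse import urljoin

# Naive match for URI="..." attributes on tags such as EXT-X-MAP or EXT-X-KEY.
_URI_ATTR = re.compile(r'URI="([^"]*)"')


def extract_urls(manifest_url: str, text: str) -> List[str]:
    """Collect absolute URLs referenced by a playlist, in document order."""
    urls: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Tags start with #EXT; other #-lines are comments and are ignored.
            if line.startswith("#EXT"):
                urls.extend(urljoin(manifest_url, u) for u in _URI_ATTR.findall(line))
            continue
        # Any other non-blank line is a URI line.
        urls.append(urljoin(manifest_url, line))
    return urls
```

Even this toy version shows the cross-origin concern: an absolute URL on a URI line resolves to whatever origin the playlist author chose.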

We are unsure about detecting media files by sniffing, since HLS allows for sub-ranges to be requested.

Upon intercepting a range-request that doesn't include the sniffable portion of the resource, the network process could hypothetically generate an additional range request for the same resource that does include those bits and block the actual request on that.
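A hypothetical sketch of that probe idea, with the network access abstracted behind a `fetch_range` callback so the logic can be read on its own. Neither the function name, the callback signature, nor the 512-byte window comes from any implementation; they are placeholders.

```python
from typing import Callable

SNIFF_WINDOW = 512  # hypothetical number of leading bytes to probe


def verdict_for_response(
    url: str,
    range_start: int,
    body: bytes,
    fetch_range: Callable[[str, int, int], bytes],
    sniff: Callable[[bytes], bool],
) -> bool:
    """Allow a response only if the start of its resource sniffs as media."""
    if range_start == 0:
        # The response itself already contains the sniffable bytes.
        return sniff(body)
    # Otherwise issue an extra request for the first bytes (inclusive range)
    # and hold the original response until the probe has been judged.
    probe = fetch_range(url, 0, SNIFF_WINDOW - 1)
    return sniff(probe)
```

The cost is one additional round trip per previously unseen URL, and the probe only validates the start of the resource rather than the bytes actually requested.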

annevk · 2022-04-21T12:54:56Z

It seems that if HLS allows for arbitrary origins and arbitrary ranges of resources on those origins you need to validate those resources in some manner as otherwise it's a pretty big hole in the whole setup, no? What would stop an attacker from using HLS to pull in (bits of) arbitrary resources across the web?

kevstal · 2022-04-21T19:29:31Z

(Just responding to Chris Needhams not to HLS group)
A common use case I’ve seen is full path redirects to things like ad providers
Presumably this triggers beacons in such systems etc - whilst the media playlist would include the redirect origin just pondering if the future redirect could change and a CORS check would then block maybe

kevleyski · 2022-04-21T19:38:40Z

(saying that any resource coming via a redirect from a 3rd party would be known ahead of time and CORS headers would be set up - so maybe it’s a moot point about redirect complications)

dalecurtis · 2024-04-10T14:14:09Z

@annevk did this ever reach a conclusion? Is this the plan for Safari HLS?

sefeng211 · 2024-05-02T13:57:17Z

@sandersdan Dan, I've been testing how this works in Chrome and I found something that seems unexpected.

In https://mozilla.seanfeng.dev/files/hls/, the last case works as the document is on mozilla.seanfeng.dev, manifest is on hls.seanfeng.dev, and manifest lists the resource on hls2.seanfeng.dev, and this video is playable in Chrome Android. Doesn't it violate the cross-origin restriction in Chrome?

dalecurtis · 2024-05-06T19:28:09Z

We (Chrome) are in the process of launching our HLS implementation, but it needs to be Safari compatible. It doesn't seem Safari has chosen to restrict cross origin responses (hence my Q above to @annevk), so we're still discussing what we want to do on our side.

annevk added topic: media topic: fetch labels Mar 9, 2021

annevk mentioned this issue Mar 9, 2021

Add application/dash+xml to opaque-safelisted MIME types. annevk/orb#23

Closed

annevk mentioned this issue Sep 3, 2021

Add a way to sniff for HLS playlist files (m3u8) whatwg/mimesniff#125

Open

3 tasks

annevk added a commit to annevk/orb that referenced this issue Oct 4, 2021

Block manifest-based media and WebVTT

34cf17e

These MIME types cannot be fetched without CORS. Closes #20 and closes #23. Follow-up: whatwg/html#6468.

annevk mentioned this issue Oct 4, 2021

Block manifest-based media and WebVTT annevk/orb#26

Merged

annevk added a commit to annevk/orb that referenced this issue Oct 5, 2021

Block manifest-based media and WebVTT

9c4bd7f

These MIME types cannot be fetched without CORS. Closes #20 and closes #23. Follow-up: whatwg/html#6468.

annevk mentioned this issue Oct 27, 2021

HLS manifest is fetched across origins annevk/orb#29

Open

annevk changed the title ~~Require CORS for HSL and DASH media formats~~ Require CORS for HLS and DASH media formats May 7, 2024
