Generalize ScriptTransform constructor to allow main-thread processing #89

alvestrand · 2021-03-19T11:36:59Z

The RTCRtpScriptTransform constructor takes a Worker argument, limiting the usage of this form of the transform to Workers.

The older createEncodedStreams() function was agnostic as to where the processing was going to take place; a number of existing demos and apps have been written that do processing on the main thread; some have even prototyped both worker-based and main-thread-based processing and deliberately chosen main-thread-based processing.

The normal use case should be worker - but other use cases should be possible.

Proposal: Change the argument type of the constructor from Worker to (Worker or MessagePort). Dispatch the event (which could then be a message) on either the worker's implicit port or the explicit MessagePort.

This allows all the use cases that the older API allowed, but ensures that the simplest code will be the one invoking a Worker.

alvestrand · 2021-03-19T11:39:43Z

Adopting this change would remove my objection to landing #64.

youennf · 2021-03-19T13:00:14Z

This allows all the use cases that the older API allowed

Can you give details on the applications/JavaScript code that cannot be migrated to RTCRtpScriptTransform?

youennf · 2021-03-19T13:01:35Z

Also, MessagePort can potentially be used for out-of-process message exchanges, for instance between window and service worker, or opener/opened-window.
DedicatedWorker was used here to limit the complexity of the implementation while retaining most of the functionality.

alvestrand · 2021-03-25T09:51:46Z

I first wanted to suggest Window as the alternate target, but Window's PostMessage() seems to have a different signature than workers' PostMessage (which is derived from MessagePort); I wanted to have a single API to relate to.

alvestrand · 2021-05-12T11:29:18Z

Given that streams are transferable, once we let them out, we don't really have any way to limit how much they spread by PostMessage(), so I don't see much implementation simplification by this restriction.

It seems useful to have only one mechanism for interprocess communication (PostMessage), not two (PostMessage and the special-case internal processing that creates an event).

This variant was also proposed as an API in the January interim (slide 20, https://docs.google.com/presentation/d/1crumgYj4eHkjo04faLktPTg0QoYJhTFoosEBudfJBuw/edit#slide=id.g7957cd038b_9_116), but wasn't further explored in the February interim (slide 34, https://docs.google.com/presentation/d/1-ZB5pjq0hTl4iNxGhIc3polH9wjCps2Uzqg7D4v1n14/edit#slide=id.gbdce051084_0_3)

The discussion in the linked issue ( #48 ) does not give any pro/con discussion of event versus message.

youennf · 2021-05-12T12:06:42Z

so I don't see much implementation simplification by this restriction.

By restricting to Worker, we do not open the door to out-of-process transforms, this simplify implementations a great deal.
This also makes using the API straightforward to understand and use (in-process, out-of-main-thread).

alvestrand · 2021-05-18T13:06:26Z

Still don't understand: How, with transferable streams, does restricting to Worker close the door on out-of-process transforms?

I am not necessarily saying that closing the door on out-of-process transforms is a Bad Thing, but I don't understand how the stated tool ("restricting to Worker") accomplishes the task ("do not open the door to out-of-process transforms").

alvestrand · 2021-05-18T13:12:07Z

Consider reimplementing https://webrtc.github.io/samples/src/content/insertable-streams/video-analyzer/ (the obvious example of an app where background processing doesn't matter) under the event-sending interface:

One can write a worker that receives the stream on an event, and then transfers the stream right back to the main thread, using an API that is part of the Streams API and not at all part of the webrtc API.
There is no simplification of implementation compared to a more generic, more common IPC API using PostMessage with a MessagePort target.

youennf · 2021-05-19T08:15:48Z

The MessagePort code path is much more complex than Worker.postMessage code path.
We keep things simple for WebRTC encoded transform implementation by staying in process and we support all cases we actually care.

Additional API (transferable streams) can be used for more complex cases on top of webrtc encoded transform.
That is great, this is composability a la Unix.

This does not mean we should bake into the simple WebRTC encoded transform API everything that WebRTC encoded transform+transferable streams allow.

youennf · 2021-05-19T08:22:02Z

If the goal is to be able to properly shim createEncodedTransform in JS, I think the following can be written:

Create a ReadableStream R1 and WritableStream W1 in window environment with JS sources SR1 and SW1
Create a worker and a script transform
Set the transform on the sender/receiver
In the worker, when receiving the readable/writable R2/W2, transfer them back to window (async operation).
When receiving R2 and W2 in window, use them to make SR1 and SW1 functional (async operation)
Return R1 and W1

alvestrand · 2021-05-19T09:19:26Z

Re #89 (comment) about complexity: Why is the MessagePort code path more complex than Worker.postMessage? For the user or for the browser implementor?
After all, DedicatedWorker is defined as "act as if it has an internal MessagePort" (https://html.spec.whatwg.org/multipage/workers.html#dedicated-workers-and-the-dedicatedworkerglobalscope-interface), so I assumed that the underlying implementation was OOM the same, since the API is identical.

Re workaround: Yes, that's exactly what I was suggesting in #89 (comment), and it shows that:

Requiring posting to a worker does not lead to decreased implementation complexity, since one still has to support processing the stream if it happens to be executing on the main thread (or some other thread)
Requiring posting to a worker increases the complexity for the user in the (few) cases where main thread processing makes sense.

youennf · 2021-05-19T09:39:53Z

Why is the MessagePort code path more complex than Worker.postMessage? For the user or for the browser implementor?

Both actually.
For the implementor, posting to a worker is easy as it is a destination that does not move (the implicit port is not transferred). Posting to a general MessagePort is harder since it might be transferred at will. In Safari, there is usually a central registry that does the routing and message management. Not sure about other implementations, let me know how Chrome is handling this.

As of user impact, using a MessagePort in the API will require the user to create a worker, create a MessageChannel, transfer one of the message port to the worker and use the other one in the script transform constructor.
For the user, it seems simpler and easier to understand to directly pass the Worker as a parameter, which covers all cases we know about.

Requiring posting to a worker does not lead to decreased implementation complexity, since one still has to support processing the stream if it happens to be executing on the main thread

Only if the UA supports transferable streams, which can be implemented in a fully orthogonal way. This orthogonality is decreasing complexity.
The only case where it would not decrease complexity would be if UA would want to optimise the script transform+transfer stream code path. I haven't heard of use cases that would make this needed.

If there are such use cases, I would first try to see how we could update the API and resort on specific optimization if API change is not feasible.

2. Requiring posting to a worker increases the complexity for the user in the (few) cases where main thread processing makes sense.

I am not clear about these few cases.

alvestrand · 2021-05-19T10:13:04Z

Replying in reverse order:

The use case I use as an example is https://webrtc.github.io/samples/src/content/insertable-streams/video-analyzer/, where the whole point of the processing is to get data that is presented on the page by code running on the main thread.

I mentioned this 21 hours ago in #89 (comment) too.

Using a (MessagePort or Worker) will not lead to any user increase in API user complexity for the worker case; that's what I suggested in #89 (comment) - I have not suggested removing Worker from that construct.

alvestrand · 2021-05-19T10:19:33Z

FWIW - MessagePort in Blink is implemented as a Mojo IPC operation, and postMessage() in DedicatedWorkerGlobalScope is implemetned on top of MessagePort. So for Chrome, the underlying mechanism is identical for the two cases.
If I've read the code correctly.

youennf · 2021-05-19T11:57:33Z

The use case I use as an example is https://webrtc.github.io/samples/src/content/insertable-streams/video-analyzer/, where the whole point of the processing is to get data that is presented on the page by code running on the main thread.

I have not looked precisely at what this page is doing but I think the usecase can be defined as:

Do the video rtc pipeline as efficiently as possible (as usual). In addition, analyse the data to gather stats/monitor the pipeline
Show the stats/analyze results in the web page
I would implement it as:
Use script transform to efficiently do the processing out of main thread. Gather the stats in the worker.
Send the stats back to the page.
Based on that, the current API supports that use case and would lead to more optimal results than doing the whole processing in main thread.

MessagePort in Blink is implemented as a Mojo IPC operation

In the previous message, I was referring to Worker.postMessage, not WorkerGlobalScope.postMessage (which might be the same for shared worker, service worker and dedicated worker in Chrome).
In any case, allowing a MessagePort would lead to require Mojo IPC operations. Currently, RTCRtpScriptTransform only requires posting a task on the worker thread, which is much simpler to implement.

fippo · 2021-10-28T10:17:43Z

WPT question: who volunteers to rewrite all the tests to use workers?

vitaly-castLabs · 2022-09-28T10:16:30Z

Just my two cents. I'm working on this project involving feeding RTCEncodedVideo/AudioFrame to MSE, and it all works nicely with Chrome's createEncodedStreams. However porting it to Safari / RTCRtpScriptTransform is not exactly trivial because I don't have access to the video element and can't create a MediaSource object from a worker. So having a non-worker alternative would be really appreciated

alvestrand · 2022-09-28T10:46:56Z

If Safari's implementation is the way they proposed in the spec, the RTCRtpScriptTransform is already executing in a worker (the constructor for RTCRtpScriptTransform takes a Worker argument). But I haven't tried coding on Safari.

https://w3c.github.io/webrtc-encoded-transform/#dom-rtcrtpscripttransform-rtcrtpscripttransform

youennf · 2022-09-28T12:33:50Z

The way to go is to get MSE in a worker which is specified and has consensus from all browser vendors IIRC.
This seems a superior approach, technically speaking and timing wise. Until such support lands, I would recommend post messaging data to window manually.

alvestrand · 2022-09-29T06:09:02Z

I had an acronym cache miss on MSE....

aboba · 2022-09-29T06:28:34Z

MSEv2 (like MSEv1) is a bit idiosyncratic with respect to "low latency" mode. So if the goal is low latency and there is no need to support DRM, it might be easier (and might save some containerization cycles) to use WebCodecs for decoding instead.

MSEv2 in Workers spec: https://w3c.github.io/media-source/
"Intent to Ship": https://groups.google.com/a/chromium.org/g/blink-dev/c/FRY3F1v6Two

vitaly-castLabs · 2022-09-29T07:42:51Z

The only reason I use MSE is DRM. Are there any benefits of going through all the trouble of extracting video frames and decoding and rendering them "manually" compared to doing nothing at all and letting webrtc/browser take care of it?

aboba · 2022-09-29T14:22:01Z

Currently, WebCodecs doesn't support DRM so MSE is your only alternative.

alvestrand · 2023-11-30T07:53:00Z

Revisiting this issue .... we do have a fundamental disagreement about the need for main-thread support. But it seems to me that if we do it, we could at least agree on a reasonable interface.

Abandoning the messageport idea - how would this look?

interface RTCRtpScriptTransform : EventTarget {
constructor(Worker? worker, optional any options, optional sequence transfer);
attribute EventHandler onrtctransform;
}

With the obvioius semantics that if worker is not specified, the event is fired at the RTCRtpScriptTransform object.

Having this API shape available would (I think) make it possible to support the RTCRtpScriptTransform-based API for the use cases that are supported by the current Chrome interface, and may offer a path to converging the implementations on a single API.

alvestrand · 2023-12-04T06:15:40Z

Seeking comment on #89 (comment) from @youennf and @jan-ivar

jan-ivar · 2023-12-04T22:15:33Z

@alvestrand thanks I appreciate its illustrative power of our gap. But we can polyfill that.

Providing a polyfill shaped like you propose would take me a bit longer (making the transform a bona fide EventTarget), but here's a fiddle showing main-thread access in today's API (modulo bug 1868223):

const bouncer = new Worker(`data:text/javascript,(${bounce.toString()})()`);
function bounce() {
  onrtctransform = async ({transformer: {readable, writable}}) =>
    self.postMessage({transformer: {readable, writable}}, [readable, writable]);
}

To use this bouncer:

sender.transform = new RTCRtpScriptTransform(bouncer, {});
const event = await new Promise(r => bouncer.onmessage = r);

jan-ivar · 2023-12-04T22:23:11Z

The issue is all browsers implement the WebRTC pipeline off main-thread, yet the RTCPeerConnection API that controls it is not exposed in workers, making main-thread the path of least resistance, which can lead to poor outcomes in interoperable performance. Exposing that pipeline in a worker-first API corrects this bias, by reversing the transfer burden.

alvestrand · 2023-12-05T07:48:22Z

The issue is all browsers implement the WebRTC pipeline off main-thread, yet the RTCPeerConnection API that controls it is not exposed in workers, making main-thread the path of least resistance, which can lead to poor outcomes in interoperable performance. Exposing that pipeline in a worker-first API corrects this bias, by reversing the transfer burden.

I don't buy that argument; making something harder because some other thing is hard (the need for RTCPeerConnection in worker has been recognized for years, but isn't being acted on) isn't acting for the benefit of the developers (far less the users).

To be clear: I'm offering this option as a means of unifying our approaches so that we can get to a single API that's supported across browsers. Making life more difficult for existing developers (who are using this feature to the tune of hundreds of thousands of instances per day) is not a feature for us.

jan-ivar · 2023-12-05T19:44:02Z

Making life more difficult for existing developers (who are using this feature to the tune of hundreds of thousands of instances per day) is not a feature for us.

Who are these developers with hundreds of thousands of instances per day, and why are they on main-thread?

Isn't this concerning, when even the Media WG acknowledged that: “There is a consensus and we agree that media processing in general should happen in a Worker context. Not all use cases require this, though, e.g., non-realtime transcoding.”?

Assuming these developers aren't doing non-realtime transcoding, would it be fair to ask why they haven't switched to workers?

A worker-first API doesn't have these problems.

youennf · 2023-12-06T12:02:54Z

FWIW, the web sites that are using this API I know of are all using workers, where the standard API is slightly simpler than Chrome's version.

we do have a fundamental disagreement about the need for main-thread support

Both Chrome API and the standard API are supporting main-thread AFAIK, the standard API is just requiring a few more lines to do main thread processing.
Is the goal to allow web developers to remove those lines, or is it something more important?
If the former, the priority of this issue seems rather low to me.

alvestrand · 2023-12-06T12:15:52Z

The reason given by the developers I and @guidou are in contact with is that the benefit of using a worker has been evaluated for their use case, and the performance / quality benefit is negligible or negative.

Naturally, our closest contacts are with the Google Meet folks.

youennf · 2023-12-06T16:36:38Z

negligible or negative.

Negligible is fine, negative is very different.
I would be glad to further dive in any performance issue that Google Meet folks might be facing when adopting Safari WebRTC encoded transform support.

FWIW, WebKit used to have RTC networking hitting main thread and we got evidence this was causing issues to actual users. Using a WebRTC encoded transform on main thread would typically reintroduce these issues that Safari users encountered. Why should we be doing this?

WebRTC engines like libwebrtc typically try to stay off the main thread in their networking path.
It makes sense to apply the same design principles to WebRTC encoded transforms.

jan-ivar · 2023-12-06T17:26:11Z

I would be glad to further dive in any performance issue that Google Meet folks might be facing when adopting Safari WebRTC encoded transform support.

I'd like to extend the same help to Google Meet folks with adopting Firefox's support of the same API (should they run into any differences from Safari).

Mozilla's findings remain that main-thread is a poor environment for realtime code due to the risk of jank. Workers leading to worse quality or worse performance contradict this and would be new information indeed.

guidou · 2023-12-06T22:25:55Z

negligible or negative.

Negligible is fine, negative is very different. I would be glad to further dive in any performance issue that Google Meet folks might be facing when adopting Safari WebRTC encoded transform support.

The negligible part was about performance. The negative part was the extra complexity/maintenance burden of introducing a worker in their architecture for no visible gain in performance.

jan-ivar · 2023-12-16T02:58:53Z

I'm glad to hear there was no negative performance impact.

The negative part was the extra complexity/maintenance burden of introducing a worker in their architecture ...

The spec API simplifies workers.

The use case I use as an example is https://webrtc.github.io/samples/src/content/insertable-streams/video-analyzer/, where the whole point of the processing is to get data that is presented on the page by code running on the main thread.

I've updated webrtc/samples#1646 to use the spec. It's only a few lines of difference (moving code to worker), and it's more performant, since there was no need to move processing to main thread in the first place.

Here's a working demo that works in all browsers. Hopefully this is helpful for those needing help with workers.

lgrahl · 2024-03-16T00:37:27Z

Hi everyone, long time no see.

If I understand correctly, forwarding the stream from the Worker to some other MessagePort (where ever that ends up) is allowed, so where is the benefit of making it harder for developers to do just that? There are complex Worker scenarios that would benefit from not receiving the stream at a Worker-global entrypoint.

I can understand that you want to incentivise the use of Workers in this API (although I'm torn on the chosen method) but I believe allowing for MessagePort in the constructor could be a sweet spot and would also allow main thread processing in a "clunky but not so clunky" way.

jan-ivar · 2024-03-18T17:15:32Z

If I understand correctly, forwarding the stream from the Worker to some other MessagePort (where ever that ends up) is allowed, so where is the benefit of making it harder for developers to do just that?

Things already being possible in JS generally counts against exposing new APIs, not for it.

Also, just because something is possible doesn't make it desirable or without cost. Transferable streams produce a "readable side in another realm" where the original realm feeding that readable in this case is the dedicated worker provided in the constructor.

There are complex Worker scenarios that would benefit from not receiving the stream at a Worker-global entrypoint.

What benefit? It seems hard to do anything in a worker without a Worker-global entrypoint e.g. self.onmessage.

I can understand that you want to incentivise the use of Workers in this API (although I'm torn on the chosen method) but I believe allowing for MessagePort in the constructor could be a sweet spot and would also allow main thread processing in a "clunky but not so clunky" way.

The MessagePort idea was abandoned up-thread so I won't rehash the arguments against it.

But a goal of the API is to avoid implicit main-thread, which requires a worker argument, and a MessagePort (another tunneling concept) is no substitute. whatwg/streams#1124 is selling this idea that it doesn't matter where readables are initially exposed, which is how we got into this mess, but it does, unless we want to have to rely on UA optimizations that break semantics.

jan-ivar · 2024-03-18T19:30:31Z

There are complex Worker scenarios that would benefit from not receiving the stream at a Worker-global entrypoint.

Doesn't the combination of event listener architecture and the constructor's options argument help decentralize things?

// main.js
sender.transform = new RTCRtpScriptTransform(worker, {message: "myApp123"});

// worker.js
class MyApp123 {
  constructor() {
    self.addEventListener("rtctransform", ({transformer}) => {
      if (transformer.options.message != "myApp123") return;
      this.transformer = transformer;
    });
  }
}

lgrahl · 2024-03-19T10:36:28Z

Okay, I think I understand the problem a bit better now you're tackling here. And your suggestion to attach and then immediately detach from the 'rtctransform' event when the matching transformer is received would work for the use cases I can think of. Thanks @jan-ivar!

jan-ivar · 2024-07-10T16:00:18Z

The TAG clarified its design principle last year, which I think applies here:

§ 10.2.1. Some APIs should only be exposed to dedicated workers

With that I propose we can close this issue.

A chromium shim to handle backwards compatibility exists in webrtcHacks/adapter#1145 and was demonstrated to work in https://blog.mozilla.org/webrtc/end-to-end-encrypt-webrtc-in-all-browsers. I'm open to improving it as needed.

fippo · 2024-07-11T04:10:04Z

I have clearly stated why I believe this is a harmful polyfill in that PR as well as https://webrtchacks.com/end-to-end-encryption-in-webrtc-4-years-later/. @jan-ivar you are free to publish your own shim and convince developers to adopt it

I will continue to abstain from this working group while "pointing fingers" at other participants as exemplified by this post is considered acceptable.

alvestrand changed the title ~~Generalize ScriptTransform constructor~~ Generalize ScriptTransform constructor to allow main-thread processing Jul 22, 2021

alvestrand mentioned this issue Dec 8, 2021

Guidance about exposing on Window vs Worker only (and other contexts) w3ctag/design-principles#325

Closed

alvestrand mentioned this issue Aug 24, 2023

Add description of an API for controlling SDP codec negotiation #186

Open

alvestrand mentioned this issue Sep 5, 2023

Update insertable streams section to WebRTC Encoded transforms webrtc/samples#1620

Open

4 tasks

foolip mentioned this issue Dec 1, 2023

WebRTC “end-to-end-encryption” web-platform-tests/interop#533

Closed

3 tasks

This was referenced Dec 18, 2023

Shim RTCRtpScriptTransform. webrtcHacks/adapter#1145

Open

Rename RTCRtpScriptTransform object #216

Open

Generalize ScriptTransform constructor to allow main-thread processing #89

Generalize ScriptTransform constructor to allow main-thread processing #89

Comments

alvestrand commented Mar 19, 2021

alvestrand commented Mar 19, 2021

youennf commented Mar 19, 2021

youennf commented Mar 19, 2021

alvestrand commented Mar 25, 2021

alvestrand commented May 12, 2021 • edited Loading

youennf commented May 12, 2021

alvestrand commented May 18, 2021

alvestrand commented May 18, 2021

youennf commented May 19, 2021

youennf commented May 19, 2021

alvestrand commented May 19, 2021

youennf commented May 19, 2021

alvestrand commented May 19, 2021

alvestrand commented May 19, 2021

youennf commented May 19, 2021

fippo commented Oct 28, 2021

vitaly-castLabs commented Sep 28, 2022

alvestrand commented Sep 28, 2022

youennf commented Sep 28, 2022

alvestrand commented Sep 29, 2022

aboba commented Sep 29, 2022

vitaly-castLabs commented Sep 29, 2022

aboba commented Sep 29, 2022

alvestrand commented Nov 30, 2023

alvestrand commented Dec 4, 2023

jan-ivar commented Dec 4, 2023 • edited Loading

jan-ivar commented Dec 4, 2023

alvestrand commented Dec 5, 2023

jan-ivar commented Dec 5, 2023 • edited Loading

youennf commented Dec 6, 2023

alvestrand commented Dec 6, 2023

youennf commented Dec 6, 2023

jan-ivar commented Dec 6, 2023

guidou commented Dec 6, 2023

jan-ivar commented Dec 16, 2023

lgrahl commented Mar 16, 2024 • edited Loading

jan-ivar commented Mar 18, 2024

jan-ivar commented Mar 18, 2024

lgrahl commented Mar 19, 2024

jan-ivar commented Jul 10, 2024

fippo commented Jul 11, 2024 • edited Loading

alvestrand commented May 12, 2021 •

edited

Loading

jan-ivar commented Dec 4, 2023 •

edited

Loading

jan-ivar commented Dec 5, 2023 •

edited

Loading

lgrahl commented Mar 16, 2024 •

edited

Loading

fippo commented Jul 11, 2024 •

edited

Loading