Generalize ScriptTransform constructor to allow main-thread processing #89
Comments
Adopting this change would remove my objection to landing #64. |
Can you give details on the applications/JavaScript code that cannot be migrated to RTCRtpScriptTransform? |
Also, MessagePort can potentially be used for out-of-process message exchanges, for instance between window and service worker, or opener/opened-window. |
I first wanted to suggest Window as the alternate target, but Window's postMessage() seems to have a different signature than workers' postMessage() (which is derived from MessagePort); I wanted to have a single API to relate to. |
Given that streams are transferable, once we let them out, we don't really have any way to limit how far they spread via postMessage(), so I don't see much implementation simplification from this restriction. It seems useful to have only one mechanism for interprocess communication (postMessage), not two (postMessage and the special-case internal processing that creates an event). This variant was also proposed as an API in the January interim (slide 20, https://docs.google.com/presentation/d/1crumgYj4eHkjo04faLktPTg0QoYJhTFoosEBudfJBuw/edit#slide=id.g7957cd038b_9_116), but wasn't further explored in the February interim (slide 34, https://docs.google.com/presentation/d/1-ZB5pjq0hTl4iNxGhIc3polH9wjCps2Uzqg7D4v1n14/edit#slide=id.gbdce051084_0_3). The linked issue (#48) does not weigh the pros and cons of event versus message. |
By restricting to Worker, we do not open the door to out-of-process transforms, which simplifies implementations a great deal. |
Still don't understand: How, with transferable streams, does restricting to Worker close the door on out-of-process transforms? I am not necessarily saying that closing the door on out-of-process transforms is a Bad Thing, but I don't understand how the stated tool ("restricting to Worker") accomplishes the task ("do not open the door to out-of-process transforms"). |
Consider reimplementing https://webrtc.github.io/samples/src/content/insertable-streams/video-analyzer/ (the obvious example of an app where background processing doesn't matter) under the event-sending interface: One can write a worker that receives the stream on an event, and then transfers the stream right back to the main thread, using an API that is part of the Streams API and not at all part of the webrtc API. |
The MessagePort code path is much more complex than the Worker.postMessage code path. Additional APIs (transferable streams) can be used for more complex cases on top of WebRTC encoded transform. This does not mean we should bake into the simple WebRTC encoded transform API everything that WebRTC encoded transform plus transferable streams allows. |
If the goal is to be able to properly shim createEncodedTransform in JS, I think the following can be written:
|
Re #89 (comment) about complexity: Why is the MessagePort code path more complex than Worker.postMessage? For the user or for the browser implementor? Re workaround: Yes, that's exactly what I was suggesting in #89 (comment), and it shows that:
|
Both actually. As for user impact, using a MessagePort in the API will require the user to create a worker, create a MessageChannel, transfer one of the message ports to the worker and use the other one in the script transform constructor.
Only if the UA supports transferable streams, which can be implemented in a fully orthogonal way. This orthogonality decreases complexity. If there are such use cases, I would first try to see how we could update the API, and resort to specific optimizations only if an API change is not feasible.
I am not clear about these few cases. |
Replying in reverse order: The use case I use as an example is https://webrtc.github.io/samples/src/content/insertable-streams/video-analyzer/, where the whole point of the processing is to get data that is presented on the page by code running on the main thread. I mentioned this 21 hours ago in #89 (comment) too. Using a (MessagePort or Worker) will not lead to any increase in API complexity for the user in the worker case; that's what I suggested in #89 (comment) - I have not suggested removing Worker from that construct. |
FWIW - MessagePort in Blink is implemented as a Mojo IPC operation, and postMessage() in DedicatedWorkerGlobalScope is implemented on top of MessagePort. So for Chrome, the underlying mechanism is identical for the two cases. |
I have not looked precisely at what this page is doing but I think the usecase can be defined as:
In the previous message, I was referring to Worker.postMessage, not WorkerGlobalScope.postMessage (which might be the same for shared worker, service worker and dedicated worker in Chrome). |
WPT question: who volunteers to rewrite all the tests to use workers? |
Just my two cents. I'm working on this project involving feeding RTCEncodedVideo/AudioFrame to MSE, and it all works nicely with Chrome's createEncodedStreams. However porting it to Safari / RTCRtpScriptTransform is not exactly trivial because I don't have access to the video element and can't create a MediaSource object from a worker. So having a non-worker alternative would be really appreciated |
If Safari's implementation is the way they proposed in the spec, the RTCRtpScriptTransform is already executing in a worker (the constructor for RTCRtpScriptTransform takes a Worker argument). But I haven't tried coding on Safari. https://w3c.github.io/webrtc-encoded-transform/#dom-rtcrtpscripttransform-rtcrtpscripttransform |
The way to go is to get MSE in a worker which is specified and has consensus from all browser vendors IIRC. |
I had an acronym cache miss on MSE.... |
MSEv2 (like MSEv1) is a bit idiosyncratic with respect to "low latency" mode. So if the goal is low latency and there is no need to support DRM, it might be easier (and might save some containerization cycles) to use WebCodecs for decoding instead. MSEv2 in Workers spec: https://w3c.github.io/media-source/ |
The only reason I use MSE is DRM. Are there any benefits of going through all the trouble of extracting video frames and decoding and rendering them "manually" compared to doing nothing at all and letting webrtc/browser take care of it? |
Currently, WebCodecs doesn't support DRM so MSE is your only alternative. |
Revisiting this issue .... we do have a fundamental disagreement about the need for main-thread support. But it seems to me that if we do it, we could at least agree on a reasonable interface. Abandoning the MessagePort idea - how would this look? interface RTCRtpScriptTransform : EventTarget { With the obvious semantics that if worker is not specified, the event is fired at the RTCRtpScriptTransform object. Having this API shape available would (I think) make it possible to support the RTCRtpScriptTransform-based API for the use cases that are supported by the current Chrome interface, and may offer a path to converging the implementations on a single API. |
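The "fire at the transform if no worker is given" semantics can be modelled in plain JavaScript. This is only a sketch under stated assumptions: `SketchScriptTransform` and its `deliver()` method are hypothetical names, a plain `EventTarget` stands in for a real `Worker`, and nothing here is the actual RTCRtpScriptTransform interface.

```javascript
// Hypothetical model of the proposed shape; not a real browser API.
// If a worker-like target is supplied, the event is fired there;
// otherwise it is fired at the transform object itself.
class SketchScriptTransform extends EventTarget {
  constructor(worker = null, options = {}) {
    super();
    this.worker = worker;
    this.options = options;
  }

  // Stand-in for the moment the user agent hands over the streams.
  deliver(transformer) {
    const event = new Event("rtctransform");
    event.transformer = { ...transformer, options: this.options };
    const target = this.worker ?? this;
    target.dispatchEvent(event);
    return target;
  }
}

// No worker given: the listener is attached to the transform itself.
const mainThreadTransform = new SketchScriptTransform(null, { name: "analyzer" });
mainThreadTransform.addEventListener("rtctransform", (e) => {
  console.log("options seen by listener:", e.transformer.options.name);
});
mainThreadTransform.deliver({ readable: null, writable: null });
```

In this model the worker case is unchanged, and the main-thread case differs only in where the listener is attached.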
Seeking comment on #89 (comment) from @youennf and @jan-ivar |
@alvestrand thanks, I appreciate its illustrative power of our gap. But we can polyfill that. Providing a polyfill shaped like you propose would take me a bit longer (making the transform a bona fide EventTarget), but here's a fiddle showing main-thread access in today's API (modulo bug 1868223):
const bouncer = new Worker(`data:text/javascript,(${bounce.toString()})()`);
function bounce() {
  onrtctransform = async ({transformer: {readable, writable}}) =>
    self.postMessage({transformer: {readable, writable}}, [readable, writable]);
}
To use this bouncer:
sender.transform = new RTCRtpScriptTransform(bouncer, {});
const event = await new Promise(r => bouncer.onmessage = r); |
The issue is all browsers implement the WebRTC pipeline off main-thread, yet the |
I don't buy that argument; making something harder because some other thing is hard (the need for RTCPeerConnection in worker has been recognized for years, but isn't being acted on) isn't acting for the benefit of the developers (far less the users). To be clear: I'm offering this option as a means of unifying our approaches so that we can get to a single API that's supported across browsers. Making life more difficult for existing developers (who are using this feature to the tune of hundreds of thousands of instances per day) is not a feature for us. |
Who are these developers with hundreds of thousands of instances per day, and why are they on main-thread? Isn't this concerning, when even the Media WG acknowledged that: “There is a consensus and we agree that media processing in general should happen in a Worker context. Not all use cases require this, though, e.g., non-realtime transcoding.”? Assuming these developers aren't doing non-realtime transcoding, would it be fair to ask why they haven't switched to workers? A worker-first API doesn't have these problems. |
FWIW, the web sites that are using this API I know of are all using workers, where the standard API is slightly simpler than Chrome's version.
Both Chrome's API and the standard API support main-thread processing AFAIK; the standard API just requires a few more lines to do it. |
The reason given by the developers @guidou and I are in contact with is that the benefit of using a worker has been evaluated for their use case, and the performance / quality benefit is negligible or negative. Naturally, our closest contacts are with the Google Meet folks. |
Negligible is fine, negative is very different. FWIW, WebKit used to have RTC networking hitting main thread and we got evidence this was causing issues to actual users. Using a WebRTC encoded transform on main thread would typically reintroduce these issues that Safari users encountered. Why should we be doing this? WebRTC engines like libwebrtc typically try to stay off the main thread in their networking path. |
I'd like to extend the same help to Google Meet folks with adopting Firefox's support of the same API (should they run into any differences from Safari). Mozilla's findings remain that main-thread is a poor environment for realtime code due to the risk of jank. Workers leading to worse quality or worse performance contradict this and would be new information indeed. |
The negligible part was about performance. The negative part was the extra complexity/maintenance burden of introducing a worker in their architecture for no visible gain in performance. |
I'm glad to hear there was no negative performance impact.
The spec API simplifies workers.
I've updated webrtc/samples#1646 to use the spec. It's only a few lines of difference (moving code to worker), and it's more performant, since there was no need to move processing to main thread in the first place. Here's a working demo that works in all browsers. Hopefully this is helpful for those needing help with workers. |
Hi everyone, long time no see. If I understand correctly, forwarding the stream from the Worker to some other I can understand that you want to incentivise the use of Workers in this API (although I'm torn on the chosen method) but I believe allowing for |
Things already being possible in JS generally counts against exposing new APIs, not for it. Also, just because something is possible doesn't make it desirable or without cost. Transferable streams produce a "readable side in another realm" where the original realm feeding that readable in this case is the dedicated worker provided in the constructor.
What benefit? It seems hard to do anything in a worker without a Worker-global entrypoint e.g.
The But a goal of the API is to avoid implicit main-thread, which requires a |
Doesn't the combination of event listener architecture and the constructor's options argument help decentralize things?
// main.js
sender.transform = new RTCRtpScriptTransform(worker, {message: "myApp123"});
// worker.js
class MyApp123 {
  constructor() {
    self.addEventListener("rtctransform", ({transformer}) => {
      if (transformer.options.message != "myApp123") return;
      this.transformer = transformer;
    });
  }
}
|
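To see how the options-based filtering lets several independent components share one worker, here is a minimal sketch that runs outside a browser. It assumes a plain `EventTarget` as a stand-in for the worker's global scope, and `TransformApp`, `fireTransform`, and the tag strings are hypothetical names used only for illustration.

```javascript
// A plain EventTarget stands in for the worker's global scope (`self`),
// so the routing logic can run outside a browser. All names are hypothetical.
const fakeSelf = new EventTarget();

class TransformApp {
  constructor(scope, tag) {
    this.tag = tag;
    this.transformer = null;
    scope.addEventListener("rtctransform", ({ transformer }) => {
      // Ignore transforms that were constructed for some other component.
      if (transformer.options.message !== tag) return;
      this.transformer = transformer;
    });
  }
}

const encryptor = new TransformApp(fakeSelf, "encryptor");
const analyzer = new TransformApp(fakeSelf, "analyzer");

// Stand-in for the event the user agent would fire after the main thread runs
// `new RTCRtpScriptTransform(worker, {message: "analyzer"})`.
function fireTransform(scope, options) {
  const event = new Event("rtctransform");
  event.transformer = { options, readable: null, writable: null };
  scope.dispatchEvent(event);
}

fireTransform(fakeSelf, { message: "analyzer" });
console.log(analyzer.transformer !== null, encryptor.transformer === null); // prints "true true"
```

Each component registers its own listener and claims only the transforms tagged for it, so no central dispatcher is needed in the worker.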
Okay, I think I understand the problem a bit better now you're tackling here. And your suggestion to attach and then immediately detach from the |
The TAG clarified its design principle last year, which I think applies here: With that I propose we can close this issue. A chromium shim to handle backwards compatibility exists in webrtcHacks/adapter#1145 and was demonstrated to work in https://blog.mozilla.org/webrtc/end-to-end-encrypt-webrtc-in-all-browsers. I'm open to improving it as needed. |
I have clearly stated why I believe this is a harmful polyfill in that PR as well as https://webrtchacks.com/end-to-end-encryption-in-webrtc-4-years-later/. @jan-ivar you are free to publish your own shim and convince developers to adopt it. I will continue to abstain from this working group while "pointing fingers" at other participants, as exemplified by this post, is considered acceptable. |
The RTCRtpScriptTransform constructor takes a Worker argument, limiting the usage of this form of the transform to Workers.
The older createEncodedStreams() function was agnostic as to where the processing was going to take place; a number of existing demos and apps have been written that do processing on the main thread; some have even prototyped both worker-based and main-thread-based processing and deliberately chosen main-thread-based processing.
The normal use case should be worker - but other use cases should be possible.
Proposal: Change the argument type of the constructor from Worker to (Worker or MessagePort). Dispatch the event (which could then be a message) on either the worker's implicit port or the explicit MessagePort.
This allows all the use cases that the older API allowed, but ensures that the simplest code will be the one invoking a Worker.
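One way to picture the proposal: both Worker and MessagePort expose `postMessage(message, transferList)`, so a single delivery path could serve either target. The sketch below is an assumption about how that might look, not spec text or browser code; `deliverTransformer` and the message shape `{type, transformer}` are hypothetical.

```javascript
// Hypothetical helper: not part of any spec or browser API.
// Models the proposed constructor argument type (Worker or MessagePort):
// both expose postMessage(message, transferList), so one code path serves either.
function deliverTransformer(target, transformer) {
  if (!target || typeof target.postMessage !== "function") {
    throw new TypeError("target must be a Worker or a MessagePort");
  }
  const { readable, writable } = transformer;
  // The streams are listed as transferables so ownership moves with the message.
  target.postMessage({ type: "rtctransform", transformer }, [readable, writable]);
}

// Usage with a recording stub in place of a real Worker or MessagePort.
const sent = [];
const stubPort = {
  postMessage: (message, transfer) => sent.push({ message, transfer }),
};
deliverTransformer(stubPort, { readable: "r", writable: "w" });
console.log(sent.length, sent[0].transfer.length); // prints "1 2"
```

In a browser, the caller would pass either a `Worker` or one port of a `MessageChannel`; the simplest code remains the one that passes a Worker.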