Need for modifying metadata #135
Comments
Another attribute that would be very nice to have is some kind of id that can be matched to unencoded frames processed upstream using the insertable streams api. It seems one reasonable candidate for an id (if we don't want to add a new one) is VideoFrame.timestamp (https://developer.mozilla.org/en-US/docs/Web/API/VideoFrame). It would be nice if this value was propagated through the encoder and attached to the RTCEncodedVideoFrame or RTCEncodedVideoFrameMetadata. Example use: Updating csrc list (part of RTCEncodedVideoFrameMetadata) to match any mixing/compositing done when processing this particular frame upstream to the encoding. |
Hmm, maybe I was confused; I thought we only had the RTP timestamp in RTCEncodedVideoFrame, but it also has this timestamp field: https://w3c.github.io/webrtc-encoded-transform/#dom-rtcencodedvideoframe-timestamp It's poorly documented, but if it's specified to equal the VideoFrame.timestamp of the corresponding unencoded frame, it should be good enough, I think. |
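To make the matching idea concrete, here is a minimal sketch of correlating upstream processing results with encoded frames by timestamp. It assumes the encoded frame's timestamp equals VideoFrame.timestamp of the source frame, which is exactly the point that is unclear above. `FrameAnnotations`, `remember`, `take`, and the plain objects standing in for the frames are hypothetical names for illustration, not part of any spec.

```javascript
// Hypothetical helper: remembers per-frame notes keyed by VideoFrame.timestamp.
class FrameAnnotations {
  constructor(maxEntries = 64) {
    this.maxEntries = maxEntries;
    this.byTimestamp = new Map(); // a Map iterates in insertion order
  }

  // Call on the raw-frame side, before the frame reaches the encoder.
  remember(timestamp, annotation) {
    this.byTimestamp.set(timestamp, annotation);
    // Evict the oldest entries so frames dropped by the encoder do not leak.
    while (this.byTimestamp.size > this.maxEntries) {
      const oldest = this.byTimestamp.keys().next().value;
      this.byTimestamp.delete(oldest);
    }
  }

  // Call on the encoded-frame side; returns undefined if nothing matches.
  take(timestamp) {
    const annotation = this.byTimestamp.get(timestamp);
    this.byTimestamp.delete(timestamp);
    return annotation;
  }
}

// Plain objects stand in for a VideoFrame and its encoded counterpart.
const annotations = new FrameAnnotations();
const rawFrame = { timestamp: 33366 };
annotations.remember(rawFrame.timestamp, { contributingSources: [1111, 2222] });

// Assumption: the encoder propagated the timestamp unchanged.
const encodedFrame = { timestamp: 33366 };
const matched = annotations.take(encodedFrame.timestamp);
```

If the timestamps turn out not to match on some platform, `take` simply returns undefined, so a caller can fall back to leaving the frame's metadata alone.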
have you seen https://chromium-review.googlesource.com/c/chromium/src/+/3654171 which clarified things a bit? |
Confusing indeed. Seems this was discussed previously on #116. Exposing capture time should be reasonably easy on the send side (prototype CL to do that for video: https://webrtc-review.googlesource.com/c/src/+/265403). I don't really understand the issues in getting that kind of timestamp on the receive side. If it can't easily be set to a useful value on the receive side, making it optional would be an improvement for the use cases I have in mind. |
we can't change the definition of timestamp now :-( |
So what if we add two new attributes to RTCEncodedVideoFrameMetadata:
Does that make sense? We could then consider deprecating the current timestamp attribute which isn't in the "metadata". |
I'd go for captureTime -- similar to how it works in requestVideoFrameCallback - @drkron can explain the details. The downside is that it is best effort and requires SR or RRTR for clock offset estimation, so it won't be available on every frame. |
I created a pull request, see #137. It seems I may have to create some w3c account, to get it on track for the process? |
indeed @nisse-work although in practice the high-level bit will be your employment situation and its impact on IPR. Please get in touch at [email protected] if you want more details. |
or talk to @alvestrand ;-) |
Back to setMetadata. I can make a pull request to add that, but I need some guidance about how it should work.
|
|
Implementing setMetadata requires going deep into the webrtc stack and dealing with a number of objects living in different threads. Might be doable synchronously (as getMetadata is), but it would be prudent to try to sketch an implementation before committing to a sync interface. |
To sketch this API, we need to identify many use cases, especially if we go with setMetadata. As for sync vs. async, if we have to hop threads, we usually go async. |
I had a quick look at the webrtc implementation of getMetadata, and it appears to return a reference to internal state, and it appeared to make the assumption that it's immutable (and then I suppose chromium makes a copy when packaging it for js access). Making it mutable and properly synchronized looked a bit tricky at first look. So maybe we shouldn't require synchronous operation. |
The current attributes look non-optional to me, so that they can be changed but not deleted. But I guess it's not unlikely that optional attributes will be added later, so good to take that use case into account. |
Synchronous = SetMetadata() returns immediately, having affected the metadata. I also imagine that the only time SetMetadata makes sense is after a frame has been created or received, and before enqueueing it to its destination; modifying it at any other time would be a positive invitation to a race condition. WebCodecs (https://w3c.github.io/webcodecs/#encodedvideochunk-interface) seems to have ignored frame-attached metadata. Instead, it chose to emit EncodedVideoChunkMetadata on its callback interface, alongside the video frame itself: https://w3c.github.io/webcodecs/#encoded-video-chunk-metadata One of the pieces of their metadata is the "VideoDecoderConfig". |
Regarding sync or async: I think it is fine to have the success/failure indication be asynchronous. But it would be nice if setMetadata followed by enqueuing of the frame guaranteed that the modification takes effect before the frame goes out on the wire, without the JavaScript having to explicitly wait for setMetadata completion. |
Would be nice if we could regard an encodedFrame as a pure data object, which would make it easy to say that operations that modify it are synchronous. |
I came up with another use case. In the context of a RED encoder I'd like to modify the payload type. Currently it requires me to play a shell game with SDP, switching between the opus and red payload types, but if I could modify the payload type that would not be necessary. |
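As a rough illustration of the "pure data object" idea, the sketch below models a frame whose metadata can be changed synchronously until it is handed off, after which changes are rejected. Everything here is hypothetical: `SketchEncodedFrame`, `markEnqueued`, and the payload type values are made up for the example and say nothing about how a real implementation would cross threads.

```javascript
// Hypothetical model of an encoded frame treated as plain local data.
class SketchEncodedFrame {
  #metadata;
  #enqueued = false;

  constructor(data, metadata) {
    this.data = data;
    this.#metadata = { ...metadata };
  }

  // Returns a copy so callers cannot mutate internal state by accident.
  getMetadata() {
    return { ...this.#metadata };
  }

  // Synchronous: the change is visible as soon as this call returns.
  setMetadata(changes) {
    if (this.#enqueued) {
      throw new Error("frame already enqueued; metadata is frozen");
    }
    this.#metadata = { ...this.#metadata, ...changes };
  }

  // Stand-in for the hand-off to the sink: snapshots the metadata and
  // locks the frame, so later writes cannot race with sending.
  markEnqueued() {
    this.#enqueued = true;
    return this.getMetadata();
  }
}

const frame = new SketchEncodedFrame(new ArrayBuffer(0), { payloadType: 111 });
frame.setMetadata({ payloadType: 63 });
const sent = frame.markEnqueued(); // snapshot taken at hand-off time
```

With this shape, setMetadata followed directly by the hand-off always sends the modified values, and a late write fails loudly instead of racing.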
I agree that a sync setMetadata call here makes most sense, as one is only mutating local state which is later passed down into the peerconnection and handed off across threads, as Youenn described above. The RED example does seem like a good concrete instance of a class of usecases where the encoded transform drastically changes the payload such that the existing metadata no longer matches it. I've not been able to find any good way for apps to handle this without such a setMetadata() method. |
In the context of the use case of "take an incoming frame and send it over another PC", we've found the need for setting:
|
I worry this direction will interfere with #141. EncodedVideoChunk and EncodedAudioChunk are immutable by design, which simplifies serialization and avoids TOCTOU issues. Instead of making the metadata of an existing RTCEncodedVideoFrame mutable, we could note that EncodedVideoChunk has a constructor that makes it easy and efficient to copy/transfer-and-modify. Perhaps we could follow a similar pattern here? E.g.
const newFrame = new RTCEncodedVideoFrame({
type: frame.type,
data: frame.data,
metadata: frame.getMetadata(),
transfer: [frame.data]
}); |
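The copy-and-modify pattern suggested above could cover the payload type use case roughly as follows. This is only a sketch: `buildFrameInit` is a hypothetical helper, the dictionary shape is taken from the snippet in this comment rather than from any shipped API, and a plain object stands in for the source frame so the example runs outside a browser.

```javascript
// Hypothetical helper: builds the init dictionary for the proposed constructor,
// copying everything from `frame` except the metadata fields being overridden.
function buildFrameInit(frame, metadataOverrides) {
  return {
    type: frame.type,
    data: frame.data,
    // The spread creates a new object, so the source frame's metadata is untouched.
    metadata: { ...frame.getMetadata(), ...metadataOverrides },
    transfer: [frame.data],
  };
}

// Plain object standing in for an RTCEncodedVideoFrame; the values are made up.
const frame = {
  type: "key",
  data: new ArrayBuffer(4),
  getMetadata() {
    return { payloadType: 96, synchronizationSource: 1234, contributingSources: [] };
  },
};

const init = buildFrameInit(frame, { payloadType: 97 });
// In a browser that implemented the proposal, the next step would be:
//   const newFrame = new RTCEncodedVideoFrame(init);
```

Because the original frame is never mutated, it stays safe to serialize or transfer while the modified copy is being built.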
We agreed on the constructor approach in the December meeting and sent a PR. |
In some applications there is the need to modify the metadata, but the spec only allows a GetMetadata operation.
This is inconvenient if the transform operation means that the metadata really should change.
A PutMetadata function would make it simple.