Layer drop/add #4

Open
aboba opened this issue Apr 10, 2019 · 9 comments

Comments

@aboba
Contributor

aboba commented Apr 10, 2019

If an application wants to drop or add a layer, how can this be accomplished? For example, if there are 3 spatial scalability layers and the application wants to drop the highest resolution layer, how is this accomplished?

In the ORTC API, this would be accomplished by setting active to false for the highest resolution layer.

In the WebRTC-SVC extension we could call setParameters with a new scalabilityMode value. For example, if scalabilityMode is "L3T3", calling setParameters with scalabilityMode set to "L2T3" drops the highest resolution scalability layer.
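A minimal sketch of the setParameters() approach just described, assuming a sender with a single encoding whose scalabilityMode was set as in the WebRTC-SVC proposal. The helper name changeScalabilityMode is hypothetical; getParameters() and setParameters() are the existing RTCRtpSender methods.

```javascript
// Hypothetical helper: switch the first encoding of an RTCRtpSender to a new
// scalabilityMode (for example "L3T3" -> "L2T3").
async function changeScalabilityMode(sender, mode) {
  // Parameters passed to setParameters() must come from a fresh getParameters() call.
  const params = sender.getParameters();
  if (!params.encodings || params.encodings.length === 0) {
    throw new Error("sender has no encodings");
  }
  params.encodings[0].scalabilityMode = mode;
  await sender.setParameters(params);
  return params.encodings[0].scalabilityMode;
}
```

Whether the browser accepts the new mode without renegotiation is implementation dependent, so callers should be prepared for the returned promise to reject.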

However, this is less efficient than the ORTC active approach (and may result in artifacts), because dropping a layer seamlessly may require calling applyConstraints() to change the input track characteristics in addition to changing scalabilityMode.

Example: Assume scalabilityMode is "L3T3" and input track is 1280 x 960 @ 60 fps

S2 = 1280 x 960, S1 = 640 x 480, S0 = 320 x 240
T2 = 60 fps, T1 = 30 fps, T0 = 15 fps

Seamless transition from "L3T3" -> "L2T3" (dropping highest resolution layer) requires changing the input track from 1280 x 960 @ 60 fps to 640 x 480 @ 60 fps

"L3T3" -> "L3T2" (dropping highest temporal layer) requires changing the input track from 1280 x 960 @ 60 fps to 1280 x 960 @ 30 fps

scalabilityMode values and characteristics:
"L3T3": 320 x 240, 640 x 480, 1280 x 960 at 15, 30, 60 fps (with an input track of 1280 x 960 @ 60 fps)
"L2T3": 320 x 240, 640 x 480 at 15, 30, 60 fps (with an input track of 640 x 480 @ 60 fps)
"L3T2": 320 x 240, 640 x 480, 1280 x 960 at 15, 30 fps (with an input track of 1280 x 960 @ 30 fps)
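The layer arithmetic in this example can be sketched as a small function. It assumes, as the example does, that each lower spatial layer halves width and height and each lower temporal layer halves the frame rate. The function name describeLayers is made up for illustration, and only plain "LxTy" modes are handled.

```javascript
// Derive per-layer resolutions and frame rates from a scalabilityMode string
// and the input track settings. Assumption: 2:1 scaling between adjacent
// spatial layers and 2:1 frame rate ratio between adjacent temporal layers.
function describeLayers(mode, width, height, fps) {
  const match = /^L(\d)T(\d)$/.exec(mode);
  if (!match) throw new Error(`unsupported scalabilityMode: ${mode}`);
  const spatialCount = Number(match[1]);
  const temporalCount = Number(match[2]);

  const spatial = [];
  for (let s = 0; s < spatialCount; s++) {
    // The top spatial layer always matches the input track resolution.
    const divisor = 2 ** (spatialCount - 1 - s);
    spatial.push({ id: `S${s}`, width: width / divisor, height: height / divisor });
  }
  const temporal = [];
  for (let t = 0; t < temporalCount; t++) {
    temporal.push({ id: `T${t}`, fps: fps / 2 ** (temporalCount - 1 - t) });
  }
  return { spatial, temporal };
}
```

With a 1280 x 960 @ 60 fps input, describeLayers("L2T3", 1280, 960, 60) yields S0 = 640 x 480 and S1 = 1280 x 960, while a 640 x 480 input yields S0 = 320 x 240 and S1 = 640 x 480, which is why the input track has to change as well.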

@aboba aboba self-assigned this Apr 10, 2019
@aboba aboba added the question Further information is requested label Apr 10, 2019
@ibc

ibc commented Jul 17, 2019

Seamless transition from "L3T3" -> "L2T3" (dropping highest resolution layer) requires changing the input track from 1280 x 960 @ 60 fps to 640 x 480 @ 60 fps

Why? Why is any change to the input track required at all? The encoder just needs to encode spatial layers 0 and 1 (or also encode spatial layer 2 and not send it). Am I missing something?

NOTE: Of course, calling track.applyConstraints() makes sense in this scenario to save the CPU spent on a layer that is not going to be sent to the remote. But why is it a requirement at all?

@aboba
Contributor Author

aboba commented Jul 17, 2019

It would be desirable to encode all the spatial layers and just not send spatial layer 2 - but the API proposal does not currently provide a way to do that. The example is just trying to point that out.

For example, if setParameters() were used to change scalabilityMode from "L3T3" to "L2T3" with no other change, and the input track remains the same (1280 x 960 @ 60 fps), the new configuration would be:

S1 = 1280 x 960, S0 = 640 x 480
T2 = 60 fps, T1 = 30 fps, T0 = 15 fps

This is not what we wanted - in effect we have dropped the lowest resolution layer, not the highest resolution layer. We could change the input track from 1280 x 960 @ 60 fps to 640 x 480 @ 60 fps, which would then give us:

S1 = 640 x 480, S0 = 320 x 240
T2 = 60 fps, T1 = 30 fps, T0 = 15 fps

However, while the eventual configuration is what we want, there may be side effects, such as a visible "glitch" while track.applyConstraints() and the new scalabilityMode setting take effect. So that is not really ideal either.
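For reference, the two-step workaround described above might look roughly like the sketch below. The helper name and the shape of the target argument are hypothetical; applyConstraints(), getParameters() and setParameters() are existing methods. The two awaited calls are not atomic, which is where the glitch can come from.

```javascript
// Hypothetical helper: shrink the input track, then switch scalabilityMode.
// Frames encoded between the two steps may use a mix of old and new settings.
async function shrinkInputAndChangeMode(track, sender, target) {
  // Step 1: ask the source for the smaller resolution (treated as ideal values).
  await track.applyConstraints({
    width: target.width,
    height: target.height,
    frameRate: target.frameRate,
  });
  // Step 2: switch the encoder to the mode with fewer spatial layers.
  const params = sender.getParameters();
  params.encodings[0].scalabilityMode = target.scalabilityMode;
  await sender.setParameters(params);
}
```

For the example in this thread, target would be { width: 640, height: 480, frameRate: 60, scalabilityMode: "L2T3" }.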

Note that I have opened a similar issue against the AV1 RTP payload specification: AOMediaCodec/av1-rtp-spec#52

A proposed fix is being discussed, which includes both a change to the AV1 bitstream (Metadata OBU) and the AV1 Descriptor. So this issue is about enabling the API to make use of that functionality when it becomes available.

@ibc

ibc commented Jul 17, 2019

I understand it now, thanks. The problem is that, with the current webrtc-svc proposal, we no longer have access to existing "encodings" in the RtpSender. IMHO the app SHOULD NOT be able to change scalabilityMode since it has tons of encoding implications as you pointed out above. scalabilityMode should be an initial and non-updatable setting (similar to the number of entries in the encodings array when simulcast is enabled).

Assuming we don't want to let the app dynamically limit the number of temporal layers to be sent (what for?) I think we need some new API like the following:

rtpSender.setMaxSpatialLayer(idx)

which may work for both simulcast and SVC.

@aboba aboba added enhancement New feature or request and removed question Further information is requested labels Jul 17, 2019
@ibc

ibc commented Jul 17, 2019

Well, there is a difference: in simulcast we can (and it's feasible today) switch off any stream (low, medium or high) since they are not dependent. However, in SVC we cannot switch off spatial layer 0 (not sure if possible in K-SVC, but that's another story), so it's hard to share a common API to manage spatial layers in both simulcast and SVC.

Perhaps the API I proposed above should be for SVC only (in fact it has "spatial layer" in its name, and spatial layers only exist in SVC).
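To make the proposed semantics concrete, here is a rough model. It is not a real API: setMaxSpatialLayer does not exist on RTCRtpSender, and the class name SpatialLayerLimiter is made up. It assumes the encoder keeps producing every layer and only layers above the limit are withheld, and it never allows spatial layer 0 to be excluded.

```javascript
// Hypothetical model of a "max spatial layer" control for SVC.
class SpatialLayerLimiter {
  constructor(spatialLayerCount) {
    this.spatialLayerCount = spatialLayerCount;
    this.maxSpatialLayer = spatialLayerCount - 1; // send everything by default
  }

  // idx is the highest spatial layer that may be sent; 0 keeps only the base layer.
  setMaxSpatialLayer(idx) {
    if (!Number.isInteger(idx) || idx < 0 || idx >= this.spatialLayerCount) {
      throw new RangeError(`spatial layer index out of range: ${idx}`);
    }
    this.maxSpatialLayer = idx;
  }

  // True if a frame belonging to the given spatial layer should be sent.
  shouldSend(spatialLayerId) {
    return spatialLayerId <= this.maxSpatialLayer;
  }
}
```

Because the limit is a single index rather than a per-layer flag, the base layer is always included, which matches the SVC dependency constraint noted above.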

@aboba aboba removed the enhancement New feature or request label Jul 21, 2019
@aboba
Contributor Author

aboba commented Jul 21, 2019

Since the existing WebRTC API enables simulcast layers to be made active/inactive, I agree that layer drop is only an API issue for SVC.
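For comparison, a sketch of how a simulcast layer is made active or inactive with the existing API. The helper name is hypothetical, and the rid values used to identify encodings are whatever the application chose when it added the transceiver.

```javascript
// Toggle one simulcast encoding, identified by its rid, on an RTCRtpSender.
async function setSimulcastLayerActive(sender, rid, active) {
  const params = sender.getParameters();
  const encoding = params.encodings.find((e) => e.rid === rid);
  if (!encoding) throw new Error(`no encoding with rid "${rid}"`);
  encoding.active = active;
  await sender.setParameters(params);
}
```

There is no equivalent per-layer switch inside a single SVC encoding, which is the gap this issue is about.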

The Video Layers Allocation (VLA) RTP header extension makes it possible to indicate whether a layer has been dropped.

@aboba aboba added the enhancement New feature or request label Jul 21, 2019
@alvestrand
Contributor

If we have "enable=false" for a layer (which implies "enable=false" for all dependent layers), I don't see a great reason to be able to add/drop layers.
The app can just start with the mode corresponding to the maximum number of layers it wants to handle, and disable the ones it doesn't want at the moment.
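A sketch of the implied-dependency rule, under the assumption that every spatial layer depends on the one directly below it (full SVC, not K-SVC or simulcast-style modes). The per-layer enable flag and the function name are hypothetical; no such control exists for SVC layers today.

```javascript
// Given one enable flag per spatial layer (index 0 = base layer), return which
// layers would actually be sent. Disabling a layer disables all layers above it.
function effectiveSpatialLayers(enabled) {
  const sent = [];
  let chainIntact = true;
  for (const flag of enabled) {
    chainIntact = chainIntact && Boolean(flag);
    sent.push(chainIntact);
  }
  return sent;
}
```

For example, effectiveSpatialLayers([true, false, true]) returns [true, false, false]: layer 2 stays configured but is not sent until layer 1 is enabled again.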

@aboba
Contributor Author

aboba commented Sep 21, 2022

enable = "false" currently will disable a stream, not a layer within a stream. So you can disable a simulcast layer, but not drop a temporal layer. WebCodecs gives an application that level of control (by encoding but not sending a layer), but WebRTC-SVC does not.
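A sketch of the WebCodecs approach, assuming the encoder was configured with a temporal scalabilityMode such as "L1T3" so that the output metadata carries svc.temporalLayerId. The helper name and the send callback are hypothetical stand-ins for the application's own transport code.

```javascript
// Build a VideoEncoder output callback that forwards only chunks whose
// temporal layer is at or below a limit that can be changed at any time.
// All layers are still encoded; higher ones are simply not sent.
function makeTemporalLayerFilter(send, initialMaxTemporalLayer) {
  let maxTemporalLayer = initialMaxTemporalLayer;
  return {
    setMaxTemporalLayer(id) {
      maxTemporalLayer = id;
    },
    // Pass this function as the `output` callback of a WebCodecs VideoEncoder.
    output(chunk, metadata) {
      // metadata.svc is absent when no scalabilityMode was configured;
      // treat such chunks as belonging to temporal layer 0.
      const tid = metadata?.svc?.temporalLayerId ?? 0;
      if (tid <= maxTemporalLayer) send(chunk, tid);
    },
  };
}
```

Lowering the limit drops a temporal layer without reconfiguring the encoder, and raising it again re-adds the layer on the next chunk that carries it.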

@alvestrand
Contributor

Yes, "if we have" because we currently don't have it for SVC; I was suggesting that if we add an "enable" control for a layer, the result would be the same (no frames are sent for that layer or any dependent layer). The sender would still see all the layers as existing, and be able to turn them on again; the receiver would see the layer as dropped until resumed.

(This is assuming that the VLA RTP header extension allows you to re-add layers, not just drop them. If not, it might need to gain that capability.)

@aboba
Contributor Author

aboba commented Oct 4, 2022

Related to Issue #14, which makes it possible to enable/disable spatial layers.


3 participants