Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for helper threads to VideoRoom #3067

Merged
merged 10 commits into from
Apr 10, 2024
Merged

Conversation

lminiero
Copy link
Member

This patch adds support for the so-called "helper threads" to the VideoRoom. If you're familiar with the Streaming plugin, in that context helper threads allow mountpoints to leverage some threads to distribute the incoming RTP packets to the indended audience, rather than take care of the process in the same thread that receives the packets themselves: that feature was added a few years back in #1316, and helped the Streaming plugin greatly increase performance in the presence of a lot of subscribers.

The VideoRoom plugin, instead, never got the same treatment: as such, there's always been a critical path when handling incoming media packets from a publisher to relay to subscribers. More precisely, this was the path:

  1. the publisher's ICE handle loop receives a packet, in janus_ice_cb_nice_recv;
  2. from there, incoming_rtp is called on the VideoRoom plugin for that publisher;
  3. that function, among other things, iterates on all subscribers to perform a relay_rtp of the packet for each of them;
  4. for each subscriber, relay_rtp queues the packet for delivery in the Janus core.

All four steps happen within the context of the same thread (the one handling publisher's core loop). In case the publisher has too many subscribers, that sequence may take longer than it takes for the following packets to arrive, causing the thread to struggle. This was exactly the main cause of bottlenecks in the Streaming plugin, which led us to add helper threads: the moment helper threads are added, the only entities incoming_rtp must pass packets to are the helpers themselves, which will then take care of steps 3. and 4. on their own. As a consequence, the publisher's loop is never kept busy, and can keep on processing incoming packets in a timely way.

As it did for the Streaming plugin, this patch is supposed to increase the performance of the VideoRoom plugin in case a single publisher has a lot of subscribers. Notice that this doesn't magically allow infinite subscribers for a publishers: it just makes the process more efficient, allowing to better use the available CPU resources; leveraging multiple Janus servers (e.g., via #3014) is still needed in case you want to increase an audience beyond the capabilities of a single machine.

In this first iteration, I kept the configuration very simple: if you add a threads: <number> property when creating a room, then all publishers that connect to the room will automatically spawn that number of threads. This happens the moment a publisher joins, not when they publish, and they stay there until the publisher handle is closed: this means it's probably suboptimal as it is, because if you take into account users that will only subscribe, if they create a publisher handle to receive signalling events (as they should) helper threads will be created for them too, needlessly. I'll probably revisit this later on, before or after this is merged (possibly allowing only for specific publishers to have threads, e.g., those that will actually need them), but for the time being that's good enough to experiment with this feature.

I tested this briefly and it seems to be doing its job, but I haven't really stressed it out. I don't plan to merge really soon, so in case you're interested in increasing the performance of the VideoRoom because you'll have many users receiving media, please do take the time to test this and report problems.

@lminiero lminiero added the multistream Related to Janus 1.x label Sep 16, 2022
@lminiero lminiero marked this pull request as draft September 16, 2022 10:50
@atoppi
Copy link
Member

atoppi commented Sep 16, 2022

I am not totally sure that applying the same threading model of the streaming plugin is the best approach here.
We are basically assuming that a sfu room is equivalent to a group of 1-to-many streams and that is not correct in my opinion.

  1. The number of sfu helper threads depends on the users behavior (how many publishers join), whereas the mountpoint helper threads are spawned when mountpoints are allocated (usually through a management API).
  2. This model might work if the number of pubs M is low and << than the number of subscribers N, but when the number of publishers grows I suspect that the current implementation might perform better

The strategy to adopt should not address a single use case, but fit all the possibile patterns.
I'd rather aim at a model with a fixed number of helper threads per room, not per participant.

@lminiero
Copy link
Member Author

I changed the way helper threads work in the VideoRoom, by moving them to rooms instead of participants as it was before. So now, when you create a room, you can specify a number of helper threads to assist with relaying (and free the source from having to block and queue outgoing packets itself). These threads are then shared among all publishers in the room, meaning subscriptions are distributed on the available helper threads indipendently of the publisher source. This should help address Alessandro's comment above, and make the feature "controllable".

I'm keeping the draft status of the PR, since I haven't tested extensively, something that will probably need to be done considering the context of helpers changed, and so some race conditions may be hiding in the bushes (e.g., when participants leave, or rooms are destroyed). If you care about making the VideoRoom plugin more scalable, please do test and provide feedback.

@lminiero
Copy link
Member Author

Merging.

@lminiero lminiero merged commit 6f5d3ea into master Apr 10, 2024
8 checks passed
@lminiero lminiero deleted the videoroom-helper-threads branch April 10, 2024 09:21
@lminiero lminiero mentioned this pull request Oct 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
multistream Related to Janus 1.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants