Add insert of concealed packets #3308
janus_mutex_lock(&participant->qmutex);
for(int i=1; i < lost_packets; i++) {
I'm thinking of getting rid of the loop and producing the PLC data in a single call, with a frame_size of lost_packets * OPUS_SAMPLES, which shouldn't exceed the BUFFER_SAMPLES size.
In our production I've seen losses of 20+ packets in a row, so PLC for 20 packets won't fit in BUFFER_SAMPLES, and I'm not sure how to behave in that case.
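One possible way to handle a burst that doesn't fit is to split the concealment into several chunks, each no larger than the buffer. The following is only a sketch under assumptions: `plc_chunk_plan` is a hypothetical helper, and the two constants are illustrative values rather than the plugin's actual definitions. It computes the frame sizes only; each chunk would then be passed as frame_size to a separate PLC decode call.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; the plugin defines its own constants */
#define SKETCH_OPUS_SAMPLES 960     /* 20 ms at 48 kHz */
#define SKETCH_BUFFER_SAMPLES 5760  /* 120 ms at 48 kHz */

/* Hypothetical helper: fill chunks[] with per-call frame sizes (samples per
 * channel) that together cover lost_packets frames of frame_samples each,
 * with no chunk larger than max_samples. Returns the number of chunks
 * written, or -1 if the arguments are invalid or chunks[] is too small. */
static int plc_chunk_plan(int lost_packets, int frame_samples, int max_samples,
        int *chunks, int max_chunks) {
    assert(chunks != NULL);
    if(lost_packets < 0 || frame_samples <= 0 || max_samples < frame_samples)
        return -1;
    /* Largest whole number of frames that fits in the buffer */
    int per_chunk = (max_samples / frame_samples) * frame_samples;
    long remaining = (long)lost_packets * frame_samples;
    int n = 0;
    while(remaining > 0) {
        if(n == max_chunks)
            return -1;
        int size = remaining > per_chunk ? per_chunk : (int)remaining;
        chunks[n++] = size;
        remaining -= size;
    }
    return n;
}
```

With these assumed values, a burst of 20 lost packets (19200 samples) would be covered by three chunks of 5760 samples and one of 1920.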
It's actually the whole thing that perplexes me. The way it's done right now, you're basically breaking FEC, as it would never be done. You're preserving the use_fec = true bit, but it will be useless, since later on we'll try to get redundant information from a packet that doesn't have any (the PLC data you've added). I think PLC and FEC should be handled separately: if the gap is 1, then do what was done before; if it's higher than 1, do PLC (new code, without use_fec=true). Or do you have a reason for implementing it this way?
As a side note, the LOG_ERR there is probably overly verbose, and would clog the logs any time there are bursts of lost packets, which may happen often. A LOG_VERB or even LOG_HUGE would probably be a better choice (but that's something that can be discussed, it may still be useful to have info in the logs when this happens).
This code generates PLC for the part of the gap that can't be recovered using FEC. The last packet of the gap is generated using the FEC data of the RTP packet that arrived, if possible (opus_decode will return PLC if FEC data isn't available). For example:
- a gap of 1 will be covered using FEC, since the loop condition i < lost_packets won't match
- a gap of 2 will generate 1 PLC packet, and 1 packet will be recovered from the FEC information in the RTP payload
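The split described in these two examples can be written down as a small function. This is just a sketch of the counting logic, not the patch itself: `gap_plan` and `plan_gap` are hypothetical names, and whether the final FEC decode actually yields redundant data depends on the arrived packet.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of how a gap of lost_packets is covered */
typedef struct gap_plan {
    int plc_packets;  /* Packets generated by concealment only */
    bool use_fec;     /* Whether the last lost packet is decoded with fec=1 */
} gap_plan;

static gap_plan plan_gap(int lost_packets) {
    assert(lost_packets >= 0);
    gap_plan plan = { 0, false };
    if(lost_packets == 0)
        return plan;
    /* Mirrors for(int i=1; i < lost_packets; i++): one PLC packet per
     * iteration, so lost_packets - 1 in total */
    plan.plc_packets = lost_packets - 1;
    /* The last missing packet is attempted from the arrived packet's FEC data */
    plan.use_fec = true;
    return plan;
}
```

For instance, `plan_gap(1)` gives 0 PLC packets plus FEC, while `plan_gap(20)` gives 19 PLC packets plus FEC.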
I'm afraid of the following problem now: the frame size can differ from packet to packet, and the decoder has to adapt accordingly, since the encoder may change frame_size. It looks like the Chrome encoder changes the frame size in case of losses: during bursts of losses, opus_packet_get_nb_frames periodically reports a 10ms frame size for the RTP packet that follows the gap. The Chromium implementation of FEC decoding uses opus_packet_get_nb_frames to determine the FEC frame_size (https://chromium.googlesource.com/external/webrtc/+/refs/heads/main/modules/audio_coding/codecs/opus/opus_interface.cc#681).
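To make the frame size point concrete: the duration of an Opus packet can be read from its TOC byte, as laid out in RFC 6716 section 3.1. libopus exposes this through opus_packet_get_nb_frames and opus_packet_get_samples_per_frame; the sketch below re-implements the parsing with hypothetical function names only so that it stays standalone, and it should not be taken as what the patch or Chromium does.

```c
#include <assert.h>
#include <stddef.h>

/* Samples per frame encoded in the TOC byte (RFC 6716, section 3.1) */
static int toc_samples_per_frame(unsigned char toc, int sample_rate) {
    assert(sample_rate > 0);
    if(toc & 0x80) {
        /* CELT-only: 2.5, 5, 10 or 20 ms */
        int shift = (toc >> 3) & 0x3;
        return (sample_rate << shift) / 400;
    }
    if((toc & 0x60) == 0x60) {
        /* Hybrid: 10 or 20 ms */
        return (toc & 0x08) ? sample_rate / 50 : sample_rate / 100;
    }
    /* SILK-only: 10, 20, 40 or 60 ms */
    int shift = (toc >> 3) & 0x3;
    if(shift == 3)
        return sample_rate * 60 / 1000;
    return (sample_rate << shift) / 100;
}

/* Number of frames in the packet, or -1 if it is too short to tell */
static int packet_frame_count(const unsigned char *packet, size_t len) {
    if(packet == NULL || len < 1)
        return -1;
    switch(packet[0] & 0x3) {
        case 0:
            return 1;
        case 1:
        case 2:
            return 2;
        default:
            /* Code 3: the frame count is in the second byte */
            if(len < 2)
                return -1;
            return packet[1] & 0x3F;
    }
}

/* Total samples per channel carried by the packet, or -1 if invalid */
static int packet_samples(const unsigned char *packet, size_t len, int sample_rate) {
    int frames = packet_frame_count(packet, len);
    if(frames <= 0)
        return -1;
    int samples = frames * toc_samples_per_frame(packet[0], sample_rate);
    /* A packet can't carry more than 120 ms of audio */
    if(samples > sample_rate / 1000 * 120)
        return -1;
    return samples;
}
```

A value computed this way from the packet after the gap could serve as the frame_size of the FEC decode, instead of a fixed 960.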
I left LOG_ERR for debugging purposes; I'm going to change it to LOG_VERB or LOG_HUGE for production.
Thanks for the PR! I've added a few notes inline with some considerations that we can discuss. I know @atoppi had some comments of his own too, so I'll leave it to him to add more (and possibly correct me if I was wrong somewhere).
@@ -1699,6 +1699,7 @@ typedef struct janus_audiobridge_participant {
 uint16_t expected_seq; /* Expected sequence number */
 uint16_t probation; /* Used to determine new ssrc validity */
 uint32_t last_timestamp; /* Last in seq timestamp */
+uint16_t last_seq; /* Last sequence number */
Code style: indentation of comment looks broken here.
@@ -2135,7 +2136,7 @@ static int janus_audiobridge_resample(int16_t *input, int input_num, int input_r
 #define JITTER_BUFFER_MIN_PACKETS 2
 #define JITTER_BUFFER_MAX_PACKETS 40
 #define JITTER_BUFFER_CHECK_USECS 1*G_USEC_PER_SEC
-#define QUEUE_IN_MAX_PACKETS 4
+#define QUEUE_IN_MAX_PACKETS 20
I personally don't like this: it will cause a permanent delay increase for the audio of this participant after a burst of lost packets has caused the queue to grow because of PLC, even well after the problem was solved. I'd argue that a burst of lost packets should indeed cause artifacts (like a speed up), and that if not 4, then a much smaller number should be more than enough to absorb shorter bursts.
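To put rough numbers on the delay concern, here is a back-of-the-envelope sketch. It assumes each queued packet holds one 20 ms Opus frame, which is an assumption of this example; `queue_delay_ms` is a hypothetical helper.

```c
#include <assert.h>

/* Assumption for illustration: each queued packet holds one 20 ms frame */
#define SKETCH_FRAME_MS 20

/* Worst-case extra delay (in ms) when the queue is allowed to hold
 * queued_packets decoded packets */
static int queue_delay_ms(int queued_packets) {
    assert(queued_packets >= 0);
    return queued_packets * SKETCH_FRAME_MS;
}
```

Under that assumption, a limit of 4 packets bounds the extra delay at 80 ms, while a limit of 20 lets it grow up to 400 ms.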
I've experimented with it, and a low value caused pops/cracks for small bursts.
janus_mutex_lock(&participant->qmutex);
for(int i=1; i < lost_packets; i++) {
pkt->data = g_malloc0(OPUS_SAMPLES * (participant->stereo ? 2 : 1) * sizeof(opus_int16));
pkt->ssrc = 0;
pkt->timestamp = participant->last_timestamp + 960 * i;
pkt->seq_number = participant->last_seq + 1 * i;
I'm not 100% sure the RTP info here is correct, since last_timestamp and last_seq may come from out-of-order packets. But it's probably not that relevant, since it has a limited use anyway in the code.
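For reference, this is one way the gap and the RTP info of the concealed packets could be derived while accounting for 16-bit sequence number wraparound and old packets. It is only a sketch: `seq_gap`, `sketch_rtp_info` and `concealed_rtp_info` are hypothetical names, and the 960 samples per packet is taken from the hardcoded value in the hunk above.

```c
#include <assert.h>
#include <stdint.h>

/* Samples per concealed packet, as hardcoded in the hunk (960 * i) */
#define SKETCH_FRAME_SAMPLES 960

/* Number of packets missing between the last in-order sequence number and
 * a newly arrived one: 0 if consecutive, -1 if the new packet is a
 * duplicate or older than last_seq (out of order) */
static int seq_gap(uint16_t last_seq, uint16_t seq) {
    uint16_t diff = (uint16_t)(seq - last_seq);
    if(diff == 0 || diff >= 0x8000)
        return -1;
    return diff - 1;
}

typedef struct sketch_rtp_info {
    uint16_t seq;
    uint32_t timestamp;
} sketch_rtp_info;

/* RTP info for the i-th concealed packet after last_seq (i starts at 1);
 * both fields wrap around naturally thanks to the unsigned types */
static sketch_rtp_info concealed_rtp_info(uint16_t last_seq, uint32_t last_timestamp, int i) {
    assert(i >= 1);
    sketch_rtp_info info;
    info.seq = (uint16_t)(last_seq + i);
    info.timestamp = last_timestamp + (uint32_t)SKETCH_FRAME_SAMPLES * (uint32_t)i;
    return info;
}
```

A negative result from `seq_gap` would mean that no concealment should be generated for that packet at all.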
I thought the jitter buffer takes care of reordering out-of-order packets and of duplicates? At least I've experimented with out-of-order packets and duplicates on output, and everything still worked fine.
/* This is a redundant packet, so we can't parse any extension info */
pkt->silence = FALSE;
/* Decode the lost packet using fec=1 */
I'm not sure why you completely changed the way we handle FEC? The existing FEC management should remain the same, if not broken. See comments above.
The FEC management that was implemented earlier didn't work for gaps > 1. I didn't change much: I just adapted it to use the info from the latest arrived RTP packet, instead of the expected value based on the last decoded packet.
pkt->length = opus_decode(participant->decoder, payload, plen, (opus_int16 *)pkt->data, output_samples, 1);
JANUS_LOG(LOG_ERR, "[%d] packet fec decoded [%d] pkt->length, timestamp: [%d]\n",
pkt->seq_number, pkt->length, pkt->timestamp);
See above on LOG_ERR usage.
@@ -8738,7 +8768,7 @@ static void *janus_audiobridge_participant_thread(void *data) {
 locked = TRUE;
 /* Do not let queue-in grow too much */
 guint count = g_list_length(participant->inbuf);
-if(count > QUEUE_IN_MAX_PACKETS) {
+if((int) count > (QUEUE_IN_MAX_PACKETS + lost_packets)) {
Once you increase QUEUE_IN_MAX_PACKETS, then there's no reason to make it even less strict like you're doing now IMHO. It will make the internal mixer queue for the participant grow beyond control and cause huge and unrecoverable latencies.
@spscream have you tried what happens if, instead of ingesting
Thanks @spscream for the effort! My objection to the implementation is related to the timing of PLC packet generation. FEC needs packet N+1 to decode packet N (so it will naturally introduce/leverage a playout delay), whereas PLC is only calculated from already decoded samples. Given this, the generation of a PLC packet should be triggered when the bridge encoder "needs something from a participant but there's nothing available", and not after the arrival of a random packet that isn't adjacent to the previous one. Reusing the FEC approach will inevitably lead to issues like unwanted delay, discarding of good packets, uncontrollable queue growth etc. I still have to think about a better approach in detail, but one idea is to trigger PLC whenever a participant's queue-in is empty (i.e., a participant underflow is detected). That means that a loss burst has exhausted both the jitter buffer and the participant's queue (decoded samples), hence the mixer has no other choice but to introduce concealment for the participant.
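The underflow idea can be illustrated with a toy queue. This is a sketch under assumptions, not plugin code: the fixed-size ring buffer and all names (`sketch_queue`, `sketch_next_frame`, and so on) are hypothetical, frames are represented by integer ids, and the question of which thread should actually run the decoder is left out. It only shows the trigger condition: concealment is requested when the consumer finds the queue empty, not when a non-adjacent packet arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_QUEUE_CAP 8

/* Toy stand-in for a participant's queue-in of decoded frames */
typedef struct sketch_queue {
    int frames[SKETCH_QUEUE_CAP];
    int head;
    int count;
} sketch_queue;

typedef enum frame_source {
    FRAME_DECODED,   /* A real decoded frame was available */
    FRAME_CONCEALED  /* Queue was empty: PLC should fill this slot */
} frame_source;

/* Append a decoded frame; returns false if the queue is full */
static bool sketch_queue_push(sketch_queue *q, int frame_id) {
    assert(q != NULL);
    if(q->count == SKETCH_QUEUE_CAP)
        return false;
    q->frames[(q->head + q->count) % SKETCH_QUEUE_CAP] = frame_id;
    q->count++;
    return true;
}

/* Called once per mixing tick: pops the oldest frame if there is one,
 * otherwise reports an underflow so that concealment can be generated */
static frame_source sketch_next_frame(sketch_queue *q, int *frame_id) {
    assert(q != NULL && frame_id != NULL);
    if(q->count == 0) {
        *frame_id = -1;
        return FRAME_CONCEALED;
    }
    *frame_id = q->frames[q->head];
    q->head = (q->head + 1) % SKETCH_QUEUE_CAP;
    q->count--;
    return FRAME_DECODED;
}
```

With this trigger the queue never grows because of concealment, since a concealed frame is only produced for a tick that would otherwise have nothing to play.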
Yes, I tried it, and the sound cracks.
I can insert n packets of maximum size instead of the number of lost packets, but according to the docs PLC should cover exactly the duration of the lost burst: "Number of samples per channel of available space in pcm. If this is less than the maximum packet duration (120ms; 5760 for 48kHz), this function will not be capable of decoding some packets. In the case of PLC (data==NULL) or FEC (decode_fec=1), then frame_size needs to be exactly the duration of audio that is missing, otherwise the decoder will not be in the optimal state to decode the next incoming packet. For the PLC and FEC cases, frame_size must be a multiple of 2.5 ms."
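The 2.5 ms constraint from the quoted documentation can be checked before calling the decoder. A minimal sketch, with hypothetical helper names and assuming a sample rate that is a multiple of 400 (as is the case for 48000):

```c
#include <assert.h>
#include <stdbool.h>

/* True if frame_size (samples per channel) is a positive multiple of 2.5 ms */
static bool plc_frame_size_ok(int frame_size, int sample_rate) {
    assert(sample_rate > 0 && sample_rate % 400 == 0);
    int unit = sample_rate / 400;  /* Samples in 2.5 ms */
    return frame_size > 0 && frame_size % unit == 0;
}

/* Largest multiple of 2.5 ms that does not exceed missing_samples */
static int plc_frame_size_round_down(int missing_samples, int sample_rate) {
    assert(sample_rate > 0 && sample_rate % 400 == 0);
    int unit = sample_rate / 400;
    if(missing_samples <= 0)
        return 0;
    return (missing_samples / unit) * unit;
}
```

At 48 kHz the unit is 120 samples, so 960 is a valid frame_size while 1000 is not and would round down to 960.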
This buffer underrun can explain why we still have these cracks/pops for our production users. But I'm afraid I can't rewrite it to use such an approach myself, or at least not this year.
I agree with @atoppi that it's the road we should explore, even though it does come with its own challenges: for one, the mixer is the one who'd read from that queue, and it uses a different thread, meaning I'd prefer it not to be the one that should perform that PLC decode. As such, we'll have to better think of how to do that in the participant thread instead. This is something we can definitely discuss together after the vacations, no need to hurry and no need for you to actually implement it yourself either, if we manage to replicate the problem and come up with a potential fix.
@atoppi By the way, I also discovered a possibly incorrect usage of JITTER_BUFFER_SET_MARGIN in the current implementation: its value should be in the same unit as the
I don't think so. Besides, we're changing what actually ends in the buffer: not the payload anymore, but a pointer to a duplicate of the
I get the following situation with buffering if
if I multiply
@spscream yeah you're probably right, the comments in the speexdsp code are probably outdated/misleading.
I've tried another approach here, and it works in our case even with huge losses: master...spscream:janus-gateway:fix/am/audiobridge-cracks-fix-rnnoise#diff-476cfd26c102e4b3311b3b8e8f848bfeba283c1712a5f3f117254e0593a986f4R8959 @atoppi could you please take a look at it? If it is ok, I could refactor it and make another PR for you. The idea is the following:
I also tried to insert FEC, but it seems it also leads to cracks.
@spscream the diff is huge and includes unrelated changes, can you please provide a more readable patch?
@atoppi f5361e7#diff-476cfd26c102e4b3311b3b8e8f848bfeba283c1712a5f3f117254e0593a986f4R8745 sure, here is the commit with the added changes
@atoppi hi! any thoughts?
If it is ok, I could make a new PR with my latest changes, if that would be more convenient
@spscream If you're able to propose a cleaner version of the patch it would greatly help, thanks.
These changes were created to address #3297 for our environment.
I'm not sure if it is fully correct, but it mostly fixes our problems.