G.711 20ms packetized TTS streams on outbound channels are chopped into 10ms on Asterisk 13 #37
Comments
Thanks for such a detailed report. I have been extremely busy these past few days and unfortunately could not find the time to investigate the problem, but here are some comments in case they are of any help. Apparently, something changed internally in Asterisk between versions 1.8 and 13 in the way audio data is processed. MPF callbacks in UniMRCP always carry audio data in 10 ms frames, irrespective of the RTP ptime. In other words, if ptime is 20 ms, two frames are required to send one RTP packet out; conversely, a received RTP packet of 20 ms results in two frames of 10 ms. Given this logic, Asterisk is handed 10 ms frames. It is then up to Asterisk to compose the RTP stream for the incoming SIP leg based on the negotiated parameters, including codec, ptime, etc., which, based on your observations, does not seem to happen properly. If you change the internal definition of CODEC_FRAME_TIME_BASE in mpf_codec_descriptor.h from 10 to 20, that would probably make a difference. However, this would not be a proper solution in general.
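The frame arithmetic described above can be sketched in a few lines of C. This is only an illustration of the numbers involved, not code from UniMRCP or Asterisk: FRAME_TIME_BASE_MS and the three helper functions are hypothetical names chosen for this example, and the 10 ms value simply mirrors the frame base mentioned in the comment.

```c
#include <stddef.h>

/* Assumption: mirrors the 10 ms frame base described in the comment above.
 * The name is illustrative and is not taken from the UniMRCP headers. */
enum { FRAME_TIME_BASE_MS = 10 };

/* Number of samples in a frame of frame_ms milliseconds at the given rate. */
static size_t samples_per_frame(unsigned frame_ms, unsigned sampling_rate_hz)
{
    return (size_t)frame_ms * sampling_rate_hz / 1000;
}

/* Number of base frames needed to fill one RTP packet of ptime_ms.
 * Returns 0 when ptime_ms is zero or not a multiple of the frame base. */
static unsigned frames_per_packet(unsigned ptime_ms)
{
    if (ptime_ms == 0 || ptime_ms % FRAME_TIME_BASE_MS != 0)
        return 0;
    return ptime_ms / FRAME_TIME_BASE_MS;
}

/* G.711 encodes one byte per sample, so the payload size in bytes
 * equals the number of samples covered by the packet. */
static size_t g711_payload_bytes(unsigned ptime_ms, unsigned sampling_rate_hz)
{
    return samples_per_frame(ptime_ms, sampling_rate_hz);
}
```

With 8 kHz G.711, a 10 ms frame holds 80 samples, and a 20 ms packet needs two such frames for a 160-byte payload.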
@achaloyan No need to apologize... that's excellent news -- my preliminary debugging was leaning me to the same conclusion and it's a great help to have some validation in that direction. I noticed that mpf_codec_frame_samples_calculate() is calculated based on CODEC_FRAME_TIME_BASE. I will explore whether I can update mpf_codec_frame_samples_calculate() (and related functions) to be driven by the current
It's looking like there are 2 potential fixes here. I feel like both should eventually be implemented, but solving either one will likely resolve our immediate issue.
I plan to propose these as 2 distinct PRs, so we can discuss whether one or both fixes are appropriate.
Anyway, I am certainly open to discussing any suggestions you may have.
I've updated the description of this defect to note that it only occurs if you are using Asterisk's deprecated chan_sip channel driver; switching to chan_pjsip is one way to resolve the issue.
Well, thanks for the note.
Synopsis
Any currently available version of asterisk-unimrcp that can be installed with Asterisk 13 produces RTP that departs from the 20ms packetization expected for G.711 (the default packetization interval for PCMU/PCMA in RFC 3551) when TTS is sent on an outbound stream.
Asterisk/asterisk-unimrcp takes valid 20ms audio packets received from the TTS server and transcodes them into 10ms packets before sending them out.
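For contrast with the behavior reported here, the following is a minimal sketch of what regrouping 10ms frames into 20ms payloads could look like. It is a generic illustration only: repacketizer_t, repacketizer_init and repacketizer_push are hypothetical names, and nothing here reflects how Asterisk or asterisk-unimrcp actually buffers audio.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound: room for 60 ms of 8 kHz G.711 (1 byte/sample). */
enum { MAX_PAYLOAD = 480 };

typedef struct {
    unsigned char buf[MAX_PAYLOAD];
    size_t fill;          /* bytes currently buffered */
    size_t packet_bytes;  /* bytes per outgoing packet, e.g. 160 for 20 ms */
} repacketizer_t;

/* Prepare the buffer for a given outgoing payload size. Returns 0 on success. */
static int repacketizer_init(repacketizer_t *rp, size_t packet_bytes)
{
    if (packet_bytes == 0 || packet_bytes > MAX_PAYLOAD)
        return -1;
    rp->fill = 0;
    rp->packet_bytes = packet_bytes;
    return 0;
}

/* Append one incoming frame. Returns 1 and copies a full packet into out
 * once enough audio is buffered, 0 if more frames are needed, and -1 if
 * the frame would overflow the buffer. out must hold packet_bytes bytes. */
static int repacketizer_push(repacketizer_t *rp, const unsigned char *frame,
                             size_t frame_bytes, unsigned char *out)
{
    if (rp->fill + frame_bytes > MAX_PAYLOAD)
        return -1;
    memcpy(rp->buf + rp->fill, frame, frame_bytes);
    rp->fill += frame_bytes;
    if (rp->fill < rp->packet_bytes)
        return 0;
    memcpy(out, rp->buf, rp->packet_bytes);
    rp->fill -= rp->packet_bytes;
    /* Shift any leftover audio to the front for the next packet. */
    memmove(rp->buf, rp->buf + rp->packet_bytes, rp->fill);
    return 1;
}
```

Pushing two 80-byte frames into a repacketizer initialized with 160 yields nothing on the first call and one 160-byte payload on the second, which is the 20ms cadence the outbound stream is expected to keep.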
Versions Tested
Requirements to Reproduce
TTS Server -> Asterisk/asterisk-unimrcp Audio Stream
Above: TTS Server -> Asterisk/asterisk-unimrcp Audio Stream with proper 20ms packets and negligible sub-millisecond jitter.
Same Audio Stream transcoded and sent-out by Asterisk/asterisk-unimrcp
Above: Asterisk/asterisk-unimrcp -> Outbound Audio Stream transitioning to improper 10ms packets and high jitter when TTS is sent out.
Configuration
mrcp.conf
Note: I've also tried with the following settings appended to the [tts-mrcp1] section, with no change in behavior
extensions.conf
Inbound calls are configured to route to [corrupted-audio-reproducible-case]
Notes
With verbose RTP debugging enabled in Asterisk (sudo asterisk -rx 'rtp set debug on'), it is clear that Asterisk transitions from ptime 20ms packets (size 160) to ptime 10ms packets (size 80) while proxying/transcoding TTS.
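The link between the payload sizes in the RTP debug output and the packetization time follows from G.711 using one byte per sample. A small sketch of that conversion, assuming the usual 8000 Hz sampling rate (g711_ptime_ms is a hypothetical helper written for this note, not an Asterisk function):

```c
#include <stddef.h>

/* Packetization time in milliseconds implied by a G.711 payload size.
 * G.711 carries one byte per sample, so bytes / rate gives the duration.
 * Returns 0 if the sampling rate is 0. */
static unsigned g711_ptime_ms(size_t payload_bytes, unsigned sampling_rate_hz)
{
    if (sampling_rate_hz == 0)
        return 0;
    return (unsigned)(payload_bytes * 1000 / sampling_rate_hz);
}
```

At 8000 Hz, a payload of 160 bytes corresponds to 20ms and a payload of 80 bytes to 10ms, matching the sizes quoted above.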