PaddleSpeech r1.0.0
Highlight
- Release PP-ASR: a streaming ASR system with timestamps and punctuation restoration, built on the WenetSpeech streaming Conformer and DeepSpeech2 ASR models.
- Release PP-TTS: Streaming TTS system for industrial application.
- Release PP-VPR: Industrial Voiceprint Recognition system and ECAPA-TDNN model.
- Custom ASR demo: applying for transportation expense reimbursement.
- Support MDTC KWS model
More
ASR
- DeepSpeech2 streaming model, aishell CER: 6.66%
- DeepSpeech2 streaming model, wenetspeech CER: 15.2% (test_net, w/o LM), 24.17% (test_meeting, w/o LM), 5.3% (aishell, w/ LM)
- Conformer, aishell CER: 4.64%
- Conformer streaming model, aishell CER: 5.44%
- Conformer streaming model, wenetspeech CER: 11.0% (test_net), 18.79% (test_meeting)
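For readers unfamiliar with the metric, the CER (character error rate) figures above are conventionally the character-level edit distance between the reference and the recognized text, divided by the number of reference characters. The sketch below is illustrative only: `edit_distance` and `cer` are hypothetical helper names, not part of PaddleSpeech, and this is not the scoring script used to produce the numbers in this release.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference char missing from hyp)
                curr[j - 1] + 1,     # insertion (extra char in hyp)
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate of a single utterance (hypothetical helper)."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One substituted character out of four reference characters.
    print(f"{cer('abcd', 'abxd'):.2%}")
```

Over a whole test set, the rate is usually computed as total edits divided by total reference characters rather than as a mean of per-utterance rates.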
Speechx
- [SpeechX] DeepSpeech2 streaming with WFST in streaming asr example
- [SpeechX] Add websocket example
- [SpeechX] Custom ASR demo: applying for transportation expense reimbursement
KWS
- [KWS] Add kws example on HeySnips dataset. by @KPatr1ck in #1558
- [KWS] Update KWS example. by @KPatr1ck in #1783
Audio
- [Audio] rename paddleaudio to audio, since it conflicts with the pkg name by @zh794390558 in #1758
- [Audio] Fix mcd issue. by @KPatr1ck in #1658
- [Audio] Remove mcd. by @KPatr1ck in #1659
- [Audio] Add VoxCeleb dataset for speaker recognition.
- [Audio] Add HeySnips dataset for keyword spotting.
What's Changed
- [R1.0][asr][server]add vector server by @honei in #1845
- [R1.0][asr][server]join streaming asr and punc server by @honei in #1846
- [R1.0]asr streaming server add time stamp by @honei in #1850
- [R1.0][tts][server] update readme by @lym0302 in #1852
- [R1.0] update cli by @Jackwaterveg in #1854
- [r1.0] update version to r1.0.0 by @zh794390558 in #1857
- [R1.0] Add doc for wenetspeech model (ds2 online, conformer online) by @Jackwaterveg in #1862
- [R1.0][server] improve server code by @lym0302 in #1866
- [R1.0][asr][server]update the streaming asr readme by @honei in #1871
- [R1.0] Update released model info (Wenetspeech ds2 online, conformer online) by @Jackwaterveg in #1869
- [R1.0]fix server doc and decode_method by @Jackwaterveg in #1889
- [speechx] add custom_streaming_asr by @SmileGoat in #1891
- [speechx] speedup ngram building by @zh794390558 in #1729
- [speechx] refactor egs and more egs for TLG wfst graph build by @zh794390558 in #1715
- [speechx] add aishell test script & json parser & no db norm linear feature & json2kaldi type cmvn by @SmileGoat in #1676
- [speechx] Add websocket & make it work by @SmileGoat in #1720
- [speechx] Frontend refactor by @SmileGoat in #1640
- [speechx] add tlg decoder by @SmileGoat in #1599
Full Changelog: r1.0.0a...r1.0.0