Highlight

Release streaming ASR and streaming TTS systems for industrial applications.
Support KWS (keyword spotting) models.
DeepSpeech2 streaming model: AISHELL CER 6.66%
Conformer: AISHELL CER 4.64%
Conformer streaming model: AISHELL CER 5.44%
SpeechX DeepSpeech2 streaming with WFST
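
For readers unfamiliar with the CER figures above: character error rate is usually defined as the character-level edit distance between the recognized text and the reference, divided by the number of reference characters. The sketch below illustrates that definition only. It is not the CER tool shipped in this release, and the function names `edit_distance` and `cer` are made up for the example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, using a rolling DP row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: total edit distance / total reference characters.

    Hypothetical helper for illustration; assumes the two lists are aligned
    and that text normalization has already been applied.
    """
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars
```

Under this definition, a CER of 4.64% corresponds to roughly 46 character edits per 1000 reference characters.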

What's Changed

[speechx] refactor audio/data/feature cache by @zh794390558 in #1638
[speechx] Frontend refactor by @zh794390558 in #1640
[speechx] fix nnet itf header by @zh794390558 in #1641
[TTS]add license and reference for some models by @yt605155624 in #1642
[Doc] supplement note by @Jackwaterveg in #1643
[vec][search] update search demo README by @qingen in #1644
[speechx]refactor linear feature:unify vector & remove redundant function & add remained_wav cache shift wav by @SmileGoat in #1649
[Audio] Fix mcd issue. by @KPatr1ck in #1658
[Audio] Remove mcd. by @KPatr1ck in #1659
[vec]update the speaker verification model by @honei in #1663
[ASR] update ds2 online model by @Jackwaterveg in #1668
[TTS]fix preprocess bug, test=tts by @yt605155624 in #1660
update README, test=doc by @iftaken in #1672
[Punc] Update RESULTS.md. by @KPatr1ck in #1675
[CLI] update ds2 online model in cli by @Jackwaterveg in #1674
[CLI] ASR: Add duration limitation for asr by @Jackwaterveg in #1666
[vec]add speaker verification score method by @honei in #1646
[TTS]add onnx inference for fastspeech2 + hifigan/mb_melgan by @yt605155624 in #1665
[doc]update readme by @yt605155624 in #1680
[WebSocket] fixed online model md5 error , test=doc by @WilliamZhang06 in #1682
[speechx]add aishell test script & json parser & no db norm linear feature & json2kaldi type cmvn by @SmileGoat in #1676
[server] add stream tts server by @lym0302 in #1652
[speechx]remove mutable in audio_cache by @SmileGoat in #1687
[Doc] update readme for aishell/asr0 by @Jackwaterveg in #1677
[vec] add speaker diarization pipeline by @ccrrong in #1651
[vec]voxceleb convert dataset format to paddlespeech by @honei in #1630
[Speechx] add tlg decoder by @SmileGoat in #1599
[vec]add vector necessary note, test=doc by @honei in #1690
Revert "[WebSocket] fixed online model md5 error , test=doc" by @zh794390558 in #1691
[WebSocket] added online web client, test=doc by @WilliamZhang06 in #1692
Fix misspelling of the word "speech" in the example/aishell directory by @buchongyu2 in #1694
Fix misspelling of the word "hack" by @buchongyu2 in #1697
[TTS]change NLC to NCL in speedyspeech, test=tts by @yt605155624 in #1693
[doc]fix typo, test=doc by @yt605155624 in #1698
[doc]add pwgan onnx model, test=doc by @yt605155624 in #1700
[WebSocket] added online asr doc and online asr command line, test=doc by @WilliamZhang06 in #1701
[vec][server] vpr demo support by @qingen in #1696
[speechx] refactor speech egs by @zh794390558 in #1707
[asr]add wer tools by @zh794390558 in #1709
[asr][websocket]fix the ws send bug, cache buffer, text=doc by @honei in #1710
[TTS]add fastspeech2 cnndecoder onnx model by @yt605155624 in #1712
[speechx] refactor egs and more egs for TLG wfst graph build by @zh794390558 in #1715
[vec][score] add plda model by @qingen in #1681
[CLI]update cli, test=doc by @yt605155624 in #1716
[server] add streaming am infer by @lym0302 in #1713
[speechx] Add websocket & make it work by @SmileGoat in #1720
[asr][websocket] add asr conformer websocket server by @honei in #1704
[vec][loss] add NCE Loss from RNNLM by @qingen in #1719
[vec][loss] add FocalLoss to deal with class imbalances by @qingen in #1722
[TTS]restructure syn_utils.py, test=tts by @yt605155624 in #1723
[TTS]add paddle device set for ort and inference by @yt605155624 in #1727
[vec] add GRL to domain adaptation by @qingen in #1725
[speechx] speedup ngram building by @zh794390558 in #1729
[asr] Add new cer tools by @Jackwaterveg in #1673
[speechx]add websocket lib by @SmileGoat in #1732
[speechx]update speechx install doc by @zh794390558 in #1736
[Doc] perfect the packing scripts by @Jackwaterveg in #1735
[Doc]renew the released mode by @Jackwaterveg in #1739
[asr][websocket]add streaming asr demo by @honei in #1737
[speechx] fix nnet input and output name by @zh794390558 in #1740
[ASR] remove redundant log by @Jackwaterveg in #1741
[speechx] update wfst graph by @zh794390558 in #1742
[speechx] Add recognizer_test_main script by @SmileGoat in #1743
[vec][doc]update the voxceleb readme.md, test=doc by @honei in #1744
[ASR] fix CER tools by @Jackwaterveg in #1747
[Doc] Fix release_model info by @Jackwaterveg in #1746
[Doc] Update released model info by @Jackwaterveg in #1748
Update released model info by @Jackwaterveg in #1749
[speechx] fix model params path name by @zh794390558 in #1750
[speechx] fix linear-spectrogram-wo-db-norm-ol read feature issue by @SmileGoat in #1751
[TTS]fix wavernn white noise bug for paddle develop(2.3) by @yt605155624 in #1752
[server] add onnx tts engine by @lym0302 in #1733
[TTS]Update paddle2onnx by @yt605155624 in #1754
[Setup] to r1.0.0a by @Jackwaterveg in #1759
[audio] rename paddleaudio to audio, since it conflicts with the pkg name by @zh794390558 in #1758
[speechx] to_float32, fix shell script by @zh794390558 in #1757
[vec] bug fix to adapt VUE by @qingen in #1760
[asr][websocket]fix the streaming asr server bug, server client by @honei in #1761
[speechx] fbank and mfcc by @zh794390558 in #1765
format code by @zh794390558 in #1764
[CLI] Add conformer_aishell, conformer_online_aishell by @Jackwaterveg in #1767
[speechx]make cmvn global in run.sh by @SmileGoat in #1768
[ASR] ds2: add log_interval and fix lr problem when resume training by @Jackwaterveg in #1766
[speechx] set nnet param by flags by @zh794390558 in #1769
[server] add streaming tts demos by @lym0302 in #1771
[server] fix tts streaming server by @lym0302 in #1774
[KWS]Add kws example on HeySnips dataset. by @KPatr1ck in #1558
[text][server]add text punc server by @honei in #1772
[ASR] fix asr cli infer by @Jackwaterveg in #1770
[vec] add GE2E to support unlabeled data training by @qingen in #1731
[ASR] fix time restriction in test_cli.sh by @Jackwaterveg in #1777
[ASR] Replace fbank by @Jackwaterveg in #1776
[CLI] add color for test_cli by @Jackwaterveg in #1778
[speechx] add success log in run.sh by @SmileGoat in #1779
[KWS]Update KWS example. by @KPatr1ck in #1783
[server] update readme by @lym0302 in #1782
[Doc] Update ds2online model info by @Jackwaterveg in #1781
[CLI] renew ds2 online model by @Jackwaterveg in #1786
[speechx] fix speechx ws server to return dummy partial result by @zh794390558 in #1787
[asr][server]asr client add punctuation server by @honei in #1784
[asr] patch func to var by @zh794390558 in #1788
[asr][server]fix client parse the asr result bug by @honei in #1789
[Bug fix] fix test_cli by @Jackwaterveg in #1794
[vec] update readme by @qingen in #1796
[R1.0]update the streaming output and punc default ip, port by @honei in #1800
Renew ds2 online model [cer 6.66%] by @Jackwaterveg in #1802
[R1.0] update the streaming asr server readme by @honei in #1810
[R1.0] Renew ds2 online doc info by @Jackwaterveg in #1809
[server] update streaming demos readme by @lym0302 in #1806
[R1.0]update the paddlespeech_client asr_online cli by @honei in #1818
[r1.0][doc] fix readme by @zh794390558 in #1825

New Contributors

@iftaken made their first contribution in #1672
@ccrrong made their first contribution in #1651
@buchongyu2 made their first contribution in #1694

Acknowledgements

Special thanks to @zh794390558 @honei @Jackwaterveg @lym0302 @qingen @GT-ZhangAcer @yt605155624 @WilliamZhang06 @SmileGoat @ccrrong

Full Changelog: r0.2.0...r1.0.0a
