PaddleSpeech r1.0.0a
Jackwaterveg released this on 28 Apr 04:59 · 1567 commits to develop since this release
Highlight
- Release streaming ASR and streaming TTS systems for industrial applications.
- Support KWS (keyword spotting) models.
- DeepSpeech2 streaming model: AISHELL CER 6.66%
- Conformer: AISHELL CER 4.64%
- Conformer streaming model: AISHELL CER 5.44%
- SpeechX: DeepSpeech2 streaming with WFST
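The CER (character error rate) figures above are commonly defined as the character-level edit distance between reference and hypothesis transcripts, divided by the total number of reference characters. The following is a minimal sketch of that definition, not the scoring tool used for this release; the function names `edit_distance` and `cer` are hypothetical, and real evaluations usually also normalize text (punctuation, whitespace) before scoring.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert/delete/substitute = 1)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: total edits divided by total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    return total_edits / total_chars


if __name__ == "__main__":
    # Illustrative strings only; these are not taken from AISHELL.
    print(f"{cer(['abcd', 'hello'], ['abxd', 'hello']):.2%}")
```

Under this definition, a CER of 6.66% means roughly 6 to 7 character edits per 100 reference characters.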
What's Changed
- [speechx] refactor audio/data/feature cache by @zh794390558 in #1638
- [speechx] Frontend refactor by @zh794390558 in #1640
- [speechx] fix nnet itf header by @zh794390558 in #1641
- [TTS]add license and reference for some models by @yt605155624 in #1642
- [Doc] supplement note by @Jackwaterveg in #1643
- [vec][search] update search demo README by @qingen in #1644
- [speechx]refactor linear feature:unify vector & remove redundant function & add remained_wav cache shift wav by @SmileGoat in #1649
- [Audio] Fix mcd issue. by @KPatr1ck in #1658
- [Audio] Remove mcd. by @KPatr1ck in #1659
- [vec]update the speaker verification model by @honei in #1663
- [ASR] update ds2 online model by @Jackwaterveg in #1668
- [TTS]fix preprocess bug, test=tts by @yt605155624 in #1660
- update README, test=doc by @iftaken in #1672
- [Punc] Update RESULTS.md. by @KPatr1ck in #1675
- [CLI] update ds2 online model in cli by @Jackwaterveg in #1674
- [CLI] ASR: Add duration limitation for asr by @Jackwaterveg in #1666
- [vec]add speaker verification score method by @honei in #1646
- [TTS]add onnx inference for fastspeech2 + hifigan/mb_melgan by @yt605155624 in #1665
- [doc]update readme by @yt605155624 in #1680
- [WebSocket] fixed online model md5 error , test=doc by @WilliamZhang06 in #1682
- [speechx]add aishell test script & json parser & no db norm linear feature & json2kaldi type cmvn by @SmileGoat in #1676
- [server] add stream tts server by @lym0302 in #1652
- [speechx]remove mutable in audio_cache by @SmileGoat in #1687
- [Doc] update README for aishell/asr0 by @Jackwaterveg in #1677
- [vec] add speaker diarization pipeline by @ccrrong in #1651
- [vec]voxceleb convert dataset format to paddlespeech by @honei in #1630
- [Speechx] add tlg decoder by @SmileGoat in #1599
- [vec]add vector necessary note, test=doc by @honei in #1690
- Revert "[WebSocket] fixed online model md5 error , test=doc" by @zh794390558 in #1691
- [WebSocket] added online web client, test=doc by @WilliamZhang06 in #1692
- Fix misspelling of the word "speech" in the example/aishell directory by @buchongyu2 in #1694
- Fix misspelling of the word "hack" by @buchongyu2 in #1697
- [TTS]change NLC to NCL in speedyspeech, test=tts by @yt605155624 in #1693
- [doc]fix typo, test=doc by @yt605155624 in #1698
- [doc]add pwgan onnx model, test=doc by @yt605155624 in #1700
- [WebSocket] added online asr doc and online asr command line, test=doc by @WilliamZhang06 in #1701
- [vec][server] vpr demo support by @qingen in #1696
- [speechx] refactor speech egs by @zh794390558 in #1707
- [asr]add wer tools by @zh794390558 in #1709
- [asr][websocket]fix the ws send bug, cache buffer, test=doc by @honei in #1710
- [TTS]add fastspeech2 cnndecoder onnx model by @yt605155624 in #1712
- [speechx] refactor egs and more egs for TLG wfst graph build by @zh794390558 in #1715
- [vec][score] add plda model by @qingen in #1681
- [CLI]update cli, test=doc by @yt605155624 in #1716
- [server] add streaming am infer by @lym0302 in #1713
- [speechx] Add websocket & make it work by @SmileGoat in #1720
- [asr][websocket] add asr conformer websocket server by @honei in #1704
- [vec][loss] add NCE Loss from RNNLM by @qingen in #1719
- [vec][loss] add FocalLoss to deal with class imbalances by @qingen in #1722
- [TTS]restructure syn_utils.py, test=tts by @yt605155624 in #1723
- [TTS]add paddle device set for ort and inference by @yt605155624 in #1727
- [vec] add GRL to domain adaptation by @qingen in #1725
- [speechx] speedup ngram building by @zh794390558 in #1729
- [asr] Add new cer tools by @Jackwaterveg in #1673
- [speechx]add websocket lib by @SmileGoat in #1732
- [speechx]update speechx install doc by @zh794390558 in #1736
- [Doc] perfect the packing scripts by @Jackwaterveg in #1735
- [Doc]renew the released mode by @Jackwaterveg in #1739
- [asr][websocket]add streaming asr demo by @honei in #1737
- [speechx] fix nnet input and output name by @zh794390558 in #1740
- [ASR] remove redundant log by @Jackwaterveg in #1741
- [speechx] update wfst graph by @zh794390558 in #1742
- [speechx] Add recognizer_test_main script by @SmileGoat in #1743
- [vec][doc]update the voxceleb readme.md, test=doc by @honei in #1744
- [ASR] fix CER tools by @Jackwaterveg in #1747
- [Doc] Fix release_model info by @Jackwaterveg in #1746
- [Doc] Update released model info by @Jackwaterveg in #1748
- Update released model info by @Jackwaterveg in #1749
- [speechx] fix model params path name by @zh794390558 in #1750
- [speechx] fix linear-spectrogram-wo-db-norm-ol read feature issue by @SmileGoat in #1751
- [TTS]fix wavernn white noise bug for paddle develop(2.3) by @yt605155624 in #1752
- [server] add onnx tts engine by @lym0302 in #1733
- [TTS]Update paddle2onnx by @yt605155624 in #1754
- [Setup] to r1.0.0a by @Jackwaterveg in #1759
- [audio] rename paddleaudio to audio, since it conflicts with pkg name by @zh794390558 in #1758
- [speechx] to_float32, fix shell script by @zh794390558 in #1757
- [vec] bug fix to adapt VUE by @qingen in #1760
- [asr][websocket]fix the streaming asr server bug, server client by @honei in #1761
- [speechx] fbank and mfcc by @zh794390558 in #1765
- format code by @zh794390558 in #1764
- [CLI] Add conformer_aishell, conformer_online_aishell by @Jackwaterveg in #1767
- [speechx]make cmvn global in run.sh by @SmileGoat in #1768
- [ASR] ds2: add log_interval and fix lr problem when resume training by @Jackwaterveg in #1766
- [speechx] set nnet param by flags by @zh794390558 in #1769
- [server] add streaming tts demos by @lym0302 in #1771
- [server] fix tts streaming server by @lym0302 in #1774
- [KWS]Add kws example on HeySnips dataset. by @KPatr1ck in #1558
- [text][server]add text punc server by @honei in #1772
- [ASR] fix asr cli infer by @Jackwaterveg in #1770
- [vec] add GE2E to support unlabeled data training by @qingen in #1731
- [ASR] fix time restriction in test_cli.sh by @Jackwaterveg in #1777
- [ASR] Replace fbank by @Jackwaterveg in #1776
- [CLI] add color for test_cli by @Jackwaterveg in #1778
- [speechx] add success log in run.sh by @SmileGoat in #1779
- [KWS]Update KWS example. by @KPatr1ck in #1783
- [server] update readme by @lym0302 in #1782
- [Doc] Update ds2online model info by @Jackwaterveg in #1781
- [CLI] renew ds2 online model by @Jackwaterveg in #1786
- [speechx] fix speechx ws server to return dummy partial result by @zh794390558 in #1787
- [asr][server]asr client add punctuation server by @honei in #1784
- [asr] patch func to var by @zh794390558 in #1788
- [asr][server]fix client parse the asr result bug by @honei in #1789
- [Bug fix] fix test_cli by @Jackwaterveg in #1794
- [vec] update readme by @qingen in #1796
- [R1.0]update the streaming output and punc default ip, port by @honei in #1800
- Renew ds2 online model [cer 6.66%] by @Jackwaterveg in #1802
- [R1.0] update the streaming asr server readme by @honei in #1810
- [R1.0] Renew ds2 online doc info by @Jackwaterveg in #1809
- [server] update streaming demos readme by @lym0302 in #1806
- [R1.0]update the paddlespeech_client asr_online cli by @honei in #1818
- [r1.0][doc] fix readme by @zh794390558 in #1825
New Contributors
- @iftaken made their first contribution in #1672
- @ccrrong made their first contribution in #1651
- @buchongyu2 made their first contribution in #1694
Acknowledgements
Special thanks to @zh794390558 @honei @Jackwaterveg @lym0302 @qingen @GT-ZhangAcer @yt605155624 @WilliamZhang06 @SmileGoat @ccrrong
Full Changelog: r0.2.0...r1.0.0a