PaddleSpeech r1.1.0

yt605155624 released this 19 Aug 10:58

S2T

Add WER tools. #1709
Add optimized attention cache; use 0-dim tensor for model export. #2124
Fix CNN cache dy2st shape. #2168
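
For readers unfamiliar with the metric behind the WER tools entry, word error rate is the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The following is a minimal sketch of that definition only; the function name `word_error_rate` is hypothetical and this is not the code added in #1709.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Illustrative WER: word-level Levenshtein distance / reference length.

    Hypothetical helper for explanation only, not the PaddleSpeech tool.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.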

TTS

Fix random speaker embedding bug in voice clone. #1828 by @jerryuhoo
Add VITS model. #1855 #1957 #2040
Add Kunlun support for SpeedySpeech. #1879 by @QingshuChen
Normalize wav max value to 1 in preprocess. #1887 by @jerryuhoo
Remove fluid dependence in TTS. #1940
Add ONNX models for aishell3/ljspeech/vctk's tts3/voc1/voc5. #2068
Add TTS static/ONNX models in pretrained_models.py. #2074
Add ERNIE-SAT model. #2052 #2117
Add Chinese-English mixed TTS frontend. #2143
Add Chinese-English mixed TTS example. #2234
Fix English text frontend bug. #2235 by @david-95
Add g2pW to Chinese frontend. #2230 by @BarryKCL
Fix text frontend bugs. #1912 #2250 #2254 #2255 #2272
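
The preprocessing entry above (normalizing the wav max value to 1) refers to peak normalization: scaling a waveform so its largest absolute sample equals 1. A minimal sketch of the idea, assuming samples are given as a plain list of floats; the helper name `normalize_peak` and the `eps` guard are hypothetical and not taken from #1887.

```python
def normalize_peak(samples, eps=1e-12):
    """Scale samples so the largest absolute value becomes 1.0.

    Hypothetical helper for explanation only. A clip whose peak is
    below eps (effectively silence) is returned unscaled to avoid
    dividing by zero.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak < eps:
        return list(samples)
    return [s / peak for s in samples]
```

For example, `normalize_peak([0.25, -0.5])` scales both samples by 2 so that the peak magnitude is 1.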

Speechx

Add custom ASR script. #1946
Refactor frontend. #2003
Convert DeepSpeech2 to ONNX. #2034
Refactor audio/data/feature cache. #1638
Refactor frontend. #1640
Fix nnet itf header. #1641
Refactor speech egs. #1707
Refactor egs and more egs for TLG wfst graph build. #1715
Speed up n-gram building. #1729
Update speechx install doc. #1736
Fix nnet input and output name. #1740
Update wfst graph. #1742
Fix model params path name. #1750
Remove fluid tools for ONNX export. #2116

Audio

Refactor paddleaudio to paddlespeech.audio. #2007
Add webdataset in paddlespeech.audio. #2062

Server

Remove extra logs. #2111 #2113
Change streaming TTS servers' sample rate (fs) from a fixed 24k to the model's fs. #2121
Fix bug in engine_warmup. #2171 by @Betterman-qs
Change the default vocoder in the server to mb_melgan. #2214
Fix bug in streaming_asr_server with punctuation restoration. #2244
Rename time_s and time_ns to time_b and time_nb. #2133
Make decoding more accurate. #2128

CLI

Add paddlespeech.resource module. #1917
Add dynamic CLI command registration. #1959
Fix unnecessary download. #2103
Remove extra logs. #2084 #2085 #2107
Add Chinese-English mixed TTS CLI. #2249
Add ONNX Runtime inference for CLI. #2222

Demo

Add speech web demo. #2039 #2080
Add KWS CLI and demo. #2063
Use paddle web for streaming ASR. #2105
Add custom ASR script. #1946
Add more CLI commands for speech demos. #2138

Doc

Add API doc. #2075
Format TTS docstrings for Read the Docs. #2115

Others

Fix CPU Dockerfile. #2172 by @BrightXiaoHan
Add PaddleSpeech Dockerfile for hard mode of installation. #2127 by @buchongyu2

Acknowledgements

Special thanks to @buchongyu2 @BrightXiaoHan @BarryKCL @Betterman-qs @david-95 @jerryuhoo @QingshuChen @iftaken @zh794390558 @Jackwaterveg @lym0302 @SmileGoat @yt605155624

New Contributors

@QingshuChen made their first contribution in #1879
@Zhangjingyu06 made their first contribution in #1951
@ryanrussell made their first contribution in #1976
@freeliuzc made their first contribution in #2044
@vpegasus made their first contribution in #2043
@dependabot made their first contribution in #2061
@raycool made their first contribution in #2109
@YDX-2147483647 made their first contribution in #2125
@chenkui164 made their first contribution in #2130
@0x45f made their first contribution in #2162
@Doubledongli made their first contribution in #2167
@Betterman-qs made their first contribution in #2171
@BrightXiaoHan made their first contribution in #2172
@THUzyt21 made their first contribution in #2202
@david-95 made their first contribution in #2235
@BarryKCL made their first contribution in #2230

Full Changelog: r1.0.0...r1.1.0