Releases: coqui-ai/TTS
v0.0.15.1
v0.0.15
🐸 v0.0.15
🐞Bug Fixes
- Fix tb_logger init for rank > 0 processes in distributed training.
💾 Code updates
- Refactoring and optimization in the speaker encoder module. (:crown: @Edresson )
- Replacing `unidecode` with `anyascii`.
- Japanese text to phoneme conversion. (:crown: @kaiidams)
- Japanese `tts` recipe to train Tacotron2-DDC on the Kokoro dataset. (:crown: @kaiidams)
🚶♀️ Operational Updates
- Start using `pylint == 2.8.3`.
- Reorg `tests` files.
- Upload to PyPI automatically on release.
- Move the `VERSION` file under the `TTS` folder.
🏅 Model implementations
- New Speaker Encoder implementation based on https://arxiv.org/abs/2009.14153 (:crown: @Edresson )
🚀 New Pre-Trained Model Releases
- Japanese Tacotron model (:crown: @kaiidams)
💡 All the models below are available through the `tts` or `tts-server` CLI endpoints, as explained here.
v0.0.14
🐸 v0.0.14
🐞Bug Fixes
- Remove breaking line from Tacotron models. (👑 @a-froghyar)
💾 Code updates
- BREAKING: Coqpit integration for config management, and the first 🐸TTS recipe, for LJSpeech. Check #476.
Every model is now tied to a Python class that defines its configuration scheme. It provides a better interface and makes the default values, expected value types, and mandatory fields explicit.
Specific model configs are defined under `TTS/tts/configs` and `TTS/vocoder/configs`. `TTS/config/shared_configs.py` hosts configs shared by all 🐸TTS models. Configs shared by `tts` models are hosted under `TTS/tts/configs/shared_configs.py`, and those shared by `vocoder` models under `TTS/vocoder/configs/shared_config.py`.
For example, `TacotronConfig` follows the inheritance chain `BaseTrainingConfig -> BaseTTSConfig -> TacotronConfig` (a rough sketch of this pattern follows at the end of this section).
- BREAKING: Remove `phonemizer` support due to a license conflict.
This effectively deprecates support for all models that use phonemes as input. Feel free to suggest drop-in alternatives if you are affected by this change.
- Start hosting 👩🍳 recipes under 🐸 TTS. The first recipe is for Tacotron2-DDC with the LJSpeech dataset, under `TTS/recipes/`.
Please check here for more details.
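To illustrate the inheritance pattern above, here is a minimal, hypothetical sketch of a Coqpit-style dataclass config. The class names, fields, and defaults below are illustrative only and are not the exact 🐸TTS definitions; see `TTS/config/shared_configs.py` and `TTS/tts/configs` for the real classes.

```python
# Illustrative sketch of the dataclass-based config hierarchy
# (hypothetical names and fields, not the actual 🐸TTS classes).
from dataclasses import dataclass

from coqpit import Coqpit  # the config library 🐸TTS integrates


@dataclass
class MyBaseTrainingConfig(Coqpit):
    # fields shared by every model: batch size, epochs, output paths, ...
    batch_size: int = 32
    epochs: int = 1000
    output_path: str = ""


@dataclass
class MyBaseTTSConfig(MyBaseTrainingConfig):
    # fields shared by all tts models: audio and text processing settings.
    sample_rate: int = 22050
    use_phonemes: bool = False


@dataclass
class MyTacotronConfig(MyBaseTTSConfig):
    # model-specific fields with their defaults and expected types.
    r: int = 2                    # decoder reduction factor
    prenet_dropout: bool = True


config = MyTacotronConfig(batch_size=16)
# Coqpit adds dict/JSON (de)serialization and type checking on top of the
# plain dataclass fields, so the config can round-trip through config.json.
```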
v0.0.13
🐸 v0.0.13
🐞Bug Fixes
💾 Code updates
- `SpeakerManager` class for handling multi-speaker model management and interfacing with the `speaker.json` file.
- Enabling multi-speaker models with the `tts` and `tts-server` endpoints. (:crown: @kirianguiller)
- Allow choosing a different `noise scale` for Glow-TTS at inference (see the sketch after this list).
- Glow-TTS updates to import SC-Glow models.
- Fixing Windows support. (:crown: @WeberJulian)
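For context, the noise scale controls the standard deviation of the latent noise sampled during flow-based inference: lower values give flatter, more stable prosody, higher values give more variation. Below is a minimal, generic sketch of the idea (an illustration, not the actual Glow-TTS code):

```python
# Generic sketch of a noise-scale parameter in flow-based TTS inference.
# Not the actual Glow-TTS implementation.
import torch


def sample_latent(mean: torch.Tensor, log_std: torch.Tensor, noise_scale: float = 0.667) -> torch.Tensor:
    """Sample the latent z around the predicted prior; `noise_scale` shrinks or
    stretches the sampling noise before the inverse flow decodes it."""
    eps = torch.randn_like(mean)
    return mean + torch.exp(log_std) * eps * noise_scale


# z = sample_latent(prior_mean, prior_log_std, noise_scale=0.33)
# z is then passed through the inverse flow (decoder) to produce the spectrogram.
```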
🚶♀️ Operational Updates
- Refactoring 🐸 TTS installation and allowing selection of different installation scopes (`all`, `tf`, `notebooks`) depending on specific needs.
🏅 Model implementations
🚀 New Pre-Trained Model Releases
- SC-GlowTTS multi-speaker English model from our work https://arxiv.org/abs/2104.05557 (:crown: @Edresson )
- HiFiGAN vocoder finetuned for the above model.
- Tacotron DDC Non-Binary English model using Accenture's Sam dataset.
- HiFiGAN vocoder trained for the models above.
Released Models
💡 All the models below are available through the `tts` or `tts-server` CLI endpoints, as explained here.
Models with ✨️ below are new with this release.
- SC-GlowTTS model is from our latest paper in a collaboration with @Edresson and @mueller91.
- The new non-binary TTS model is trained using the SAM dataset from Accenture Labs. Check out their blog post
Language | Dataset | Model Name | Model Type | TTS version | Download |
---|---|---|---|---|---|
✨ English (non-binary) | sam (accenture) | Tacotron2-DDC | tts | 😄 v0.0.13 | 💾 |
✨ English (multi-speaker) | VCTK | SC-GlowTTS | tts | 😄 v0.0.13 | 💾 |
English | LJSpeech | Tacotron-DDC | tts | v0.0.12 | 💾 |
German | Thorsten-DE | Tacotron-DCA | tts | v0.0.11 | 💾 |
German | Thorsten-DE | Wavegrad | vocoder | v0.0.11 | 💾 |
English | LJSpeech | SpeedySpeech | tts | v0.0.10 | 💾 |
English | EK1 | Tacotron2 | tts | v0.0.10 | 💾 |
Dutch | MAI | TacotronDDC | tts | v0.0.10 | 💾 |
Chinese | Baker | TacotronDDC-GST | tts | v0.0.10 | 💾 |
English | LJSpeech | TacotronDCA | tts | v0.0.9 | 💾 |
English | LJSpeech | Glow-TTS | tts | v0.0.9 | 💾 |
Spanish | M-AILabs | TacotronDDC | tts | v0.0.9 | 💾 |
French | M-AILabs | TacotronDDC | tts | v0.0.9 | 💾 |
✨ English | sam (accenture) | HiFiGAN | vocoder | 😄 v0.0.13 | 💾 |
✨ English | VCTK | HiFiGAN | vocoder | 😄 v0.0.13 | 💾 |
English | LJSpeech | HiFiGAN | vocoder | v0.0.12 | 💾 |
English | EK1 | WaveGrad | vocoder | v0.0.10 | 💾 |
Dutch | MAI | ParallelWaveGAN | vocoder | v0.0.10 | 💾 |
English | LJSpeech | MB-MelGAN | vocoder | v0.0.9 | 💾 |
🌍 Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | v0.0.9 | 💾 |
🌍 Multi-Lang | LibriTTS | WaveGrad | vocoder | v0.0.9 | 💾 |
Update Jun 7, 2021: The Ruslan (Russian) model has been removed due to a license conflict.
v0.0.12
🐸 v0.0.12
🐞Bug Fixes
💾 Code updates
- Enable logging the model config.json on Tensorboard. #418
- Update code style standards and use a `Makefile` to ease regular tasks. #423
- Enable using `Tacotron.prenet.dropout` at inference time. This leads to better quality with some models (see the sketch after this list).
- Update the default `tts` model to LJSpeech TacotronDDC.
- Show the real waveform on Tensorboard in GAN vocoder training.
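As a rough illustration of what "dropout at inference" means here (a hypothetical module, not the actual 🐸TTS Tacotron code), the prenet keeps its dropout active even after `model.eval()`:

```python
# Illustrative sketch: keeping prenet dropout active at inference time.
# Hypothetical module; layer sizes and the flag name are arbitrary.
import torch
from torch import nn
from torch.nn import functional as F


class Prenet(nn.Module):
    def __init__(self, in_dim: int = 80, hidden: int = 256, dropout_at_inference: bool = False):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, hidden)
        self.dropout_at_inference = dropout_at_inference

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # `training=True` keeps dropout on even in eval mode, which injects noise
        # into the autoregressive decoder input and can improve output quality.
        keep = self.training or self.dropout_at_inference
        x = F.dropout(F.relu(self.fc1(x)), p=0.5, training=keep)
        x = F.dropout(F.relu(self.fc2(x)), p=0.5, training=keep)
        return x
```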
🚶♀️ Operational Updates
🏅 Model implementations
- initial HiFiGAN implementation (:crown: @rishikksh20 @erogol) #422
🚀 New Pre-Trained Model Releases
- Universal HiFiGAN model (postponed to the next version for 👑 @Edresson's updated model).
- LJSpeech Tacotron2 Double Decoder Consistency v2 model.
Check our blog post to learn more about Double Decoder Consistency.
- LJSpeech HiFiGAN model.
Released Models
💡 All the models below are available through the `tts` endpoint, as explained here.
Language | Dataset | Model Name | Model Type | TTS version | Download |
---|---|---|---|---|---|
✨ English | LJSpeech | Tacotron-DDC | tts | 😃 v0.0.12 | 💾 |
German | Thorsten-DE | Tacotron-DCA | tts | v0.0.11 | 💾 |
German | Thorsten-DE | Wavegrad | vocoder | v0.0.11 | 💾 |
English | LJSpeech | SpeedySpeech | tts | v0.0.10 | 💾 |
English | EK1 | Tacotron2 | tts | v0.0.10 | 💾 |
Dutch | MAI | TacotronDDC | tts | v0.0.10 | 💾 |
Chinese | Baker | TacotronDDC-GST | tts | v0.0.10 | 💾 |
English | LJSpeech | TacotronDCA | tts | v0.0.9 | 💾 |
English | LJSpeech | Glow-TTS | tts | v0.0.9 | 💾 |
Spanish | M-AILabs | TacotronDDC | tts | v0.0.9 | 💾 |
French | M-AILabs | TacotronDDC | tts | v0.0.9 | 💾 |
✨ English | LJSpeech | HiFiGAN | vocoder | 😃 v0.0.12 | 💾 |
English | EK1 | WaveGrad | vocoder | v0.0.10 | 💾 |
Dutch | MAI | ParallelWaveGAN | vocoder | v0.0.10 | 💾 |
English | LJSpeech | MB-MelGAN | vocoder | v0.0.9 | 💾 |
🌍 Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | v0.0.9 | 💾 |
🌍 Multi-Lang | LibriTTS | WaveGrad | vocoder | v0.0.9 | 💾 |
v0.0.11
🐸 v0.0.11
🐞Bug Fixes
- Fixed #374. (Thx for reporting @a-froghyar )
💾 Code updates
- `/bin/resample.py` to resample wav files (see the sketch after this list). (:crown: @WeberJulian)
- Some updates for Windows compatibility. (:crown: @GuyPaddock)
- Fixing the `CheckSpectrogram` notebook. (:crown: @GuyPaddock)
- Fix #392
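If you want to do the same thing by hand, here is a minimal resampling sketch using `librosa` and `soundfile`. It is only an illustration, not the actual `TTS/bin/resample.py` script, and the paths are placeholders.

```python
# Minimal sketch of resampling a wav file to a target sample rate.
# Illustration only; not the actual TTS/bin/resample.py script.
import librosa
import soundfile as sf


def resample_wav(in_path: str, out_path: str, target_sr: int = 22050) -> None:
    # librosa resamples on load when an explicit sr is requested
    wav, _ = librosa.load(in_path, sr=target_sr)
    sf.write(out_path, wav, target_sr)


# resample_wav("dataset/wavs/utt1.wav", "dataset_16k/wavs/utt1.wav", target_sr=16000)
```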
🚶♀️ Operational Updates
🏅 Model implementations
- initial AlignTTS implementation. (#398)
- initial HiFiGAN implementation (:crown: @rishikksh20) (postponed to the next release)
🚀 New Pre-Trained Model Releases
- German - Tacotron2-DCA trained with thorsten_dataset. (:crown: @thorstenMueller )
- German - Wavegrad vocoder with thorsten_dataset. (:crown: @thorstenMueller)
Released Models
💡 All the models below are available through the `tts` endpoint, as explained here.
Language | Dataset | Model Name | Model Type | TTS version | Download |
---|---|---|---|---|---|
✨ German | Thorsten-DE | Tacotron-DCA | tts | 😃 v0.0.11 | 💾 |
✨ German | Thorsten-DE | Wavegrad | vocoder | 😃 v0.0.11 | 💾 |
English | LJSpeech | SpeedySpeech | tts | v0.0.10 | 💾 |
English | EK1 | Tacotron2 | tts | v0.0.10 | 💾 |
Dutch | MAI | TacotronDDC | tts | v0.0.10 | 💾 |
Chinese | Baker | TacotronDDC-GST | tts | v0.0.10 | 💾 |
English | LJSpeech | TacotronDCA | tts | v0.0.9 | 💾 |
English | LJSpeech | Glow-TTS | tts | v0.0.9 | 💾 |
Spanish | M-AILabs | TacotronDDC | tts | v0.0.9 | 💾 |
French | M-AILabs | TacotronDDC | tts | v0.0.9 | 💾 |
English | EK1 | WaveGrad | vocoder | v0.0.10 | 💾 |
Dutch | MAI | ParallelWaveGAN | vocoder | v0.0.10 | 💾 |
English | LJSpeech | MB-MelGAN | vocoder | v0.0.9 | 💾 |
🌍 Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | v0.0.9 | 💾 |
🌍 Multi-Lang | LibriTTS | WaveGrad | vocoder | v0.0.9 | 💾 |
v0.0.10
🐸 v0.0.10
🐞Bug Fixes
- Make `synthesizer.py` save the output audio with the vocoder sampling rate. This is necessary when the sampling rates of the tts and vocoder models differ and interpolation is applied to the tts model output before running the vocoder (see the sketch after this list). Practically, it fixes the Spanish and French voices generated by `tts` or `tts-server` on the terminal.
- Handling UTF-8 on Windows. (by @adonispujols)
- Fix loading the last model when `--continue_training` is used. It was loading the best_model regardless.
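The gist of the fix, as a hypothetical sketch (not the actual `synthesizer.py` code): scale the tts model's spectrogram to the vocoder's frame rate, then write the waveform with the vocoder's sampling rate rather than the tts model's.

```python
# Hypothetical sketch of synthesizing with mismatched sampling rates.
# Not the actual synthesizer.py code; `vocoder` is any callable mel -> waveform.
import soundfile as sf
import torch
import torch.nn.functional as F


def vocode_and_save(mel: torch.Tensor, vocoder, tts_sr: int, vocoder_sr: int, out_path: str) -> None:
    # mel: [1, n_mels, T] spectrogram produced by the tts model
    if tts_sr != vocoder_sr:
        # stretch the time axis so the vocoder sees frames at its expected rate
        mel = F.interpolate(mel, scale_factor=vocoder_sr / tts_sr, mode="linear", align_corners=False)
    wav = vocoder(mel).squeeze().detach().cpu().numpy()
    # write with the vocoder's rate, not the tts model's, so pitch and speed are correct
    sf.write(out_path, wav, vocoder_sr)
```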
💾 Code updates
- Breaking Change: Update the default set of characters in `symbols.py`. This might require you to set your character set in `config.json` if you want to use this version with models trained on the previous version.
- Chinese backend for text processing. (#654 by @kirianguiller)
- Enable torch.hub integration for the released models.
- First GitHub release.
- Dependency version fixes. Using numpy > 1.17.5 breaks some tests.
- WaveRNN fix. (by @gerazov)
- Big refactoring of the training scripts to share the init part of the code. (by @gerazov)
- Enable ModelManager to download models from GitHub releases.
- Add a test for `compute_statistics.py`.
- Light-touch updates in the `tts` and `tts-server` entry points. (thanks @thorstenMueller)
- Define default vocoder models for each tts model in `.models.json`. The `tts` and `tts-server` entry points use the default vocoder if the user does not specify one.
- `find_unique_chars.py` to find all the unique characters in a dataset (see the sketch after this list).
- A better way of handling best models through training. (thx @gerazov)
- Pass the used characters to the model config.json at the beginning of training. This prevents later code updates from affecting the trained models.
- Migration to GitHub Actions for CI.
- Deprecate wheel-based use of tts-server for the sake of the new design.
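For reference, here is a minimal sketch of what finding the unique characters in a dataset looks like. It is a hypothetical stand-in for `find_unique_chars.py`, assuming an LJSpeech-style `metadata.csv` with pipe-separated `id|text|normalized_text` rows.

```python
# Minimal sketch of collecting the unique characters from a dataset's transcripts.
# Hypothetical stand-in for TTS/bin/find_unique_chars.py.
def find_unique_chars(metadata_path: str) -> str:
    chars = set()
    with open(metadata_path, encoding="utf-8") as f:
        for line in f:
            text = line.strip().split("|")[-1]  # take the last (normalized text) column
            chars.update(text)
    return "".join(sorted(chars))


# print(find_unique_chars("LJSpeech-1.1/metadata.csv"))
```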
🚶♀️ Operational Updates
- Move released models to GitHub Releases and deprecate GDrive as the primary option.
🏅 Model implementations
- No updates 😓
🚀 New Pre-Trained Model Releases
- English ek1 - Tacotron2 model and WaveGrad vocoder under `.models.json`. (huge THX!! to @nmstoker)
- Russian Ruslan - Tacotron2-DDC model.
- Dutch model. (huge THX!! to @r-dh)
- Chinese Tacotron2 model. (huge THX!! to @kirianguiller)
- English LJSpeech - SpeedySpeech with WaveNet decoder.
Released Models
💡 All the models below are available through the `tts` endpoint, as explained here.
Language | Dataset | Model Name | Model Type | TTS version | Download |
---|---|---|---|---|---|
English | LJSpeech | SpeedySpeech | tts | 😃 v0.0.10 | 💾 |
English | EK1 | Tacotron2 | tts | 😃 v0.0.10 | 💾 |
Dutch | MAI | TacotronDDC | tts | 😃 v0.0.10 | 💾 |
Chinese | Baker | TacotronDDC-GST | tts | 😃 v0.0.10 | 💾 |
English | LJSpeech | TacotronDCA | tts | v0.0.9 | 💾 |
English | LJSpeech | Glow-TTS | tts | v0.0.9 | 💾 |
Spanish | M-AILabs | TacotronDDC | tts | v0.0.9 | 💾 |
French | M-AILabs | TacotronDDC | tts | v0.0.9 | 💾 |
English | EK1 | WaveGrad | vocoder | 😃 v0.0.10 | 💾 |
Dutch | MAI | ParallelWaveGAN | vocoder | 😃 v0.0.10 | 💾 |
English | LJSpeech | MB-MelGAN | vocoder | v0.0.9 | 💾 |
🌍 Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | v0.0.9 | 💾 |
🌍 Multi-Lang | LibriTTS | WaveGrad | vocoder | v0.0.9 | 💾 |
v0.0.9
🐸 TTS v0.0.9 - the first release 🎉
This is the first release of 🐸TTS, versioned v0.0.9.
🐸TTS is still an evolving project and any upcoming release might be significantly different and not backward compatible.
In this release, we provide the following models.
Language | Dataset | Model Name | Model Type | Download |
---|---|---|---|---|
English | LJSpeech | TacotronDCA | tts | 💾 |
English | LJSpeech | Glow-TTS | tts | 💾 |
Spanish | M-AILabs | TacotronDDC | tts | 💾 |
French | M-AILabs | TacotronDDC | tts | 💾 |
English | LJSpeech | MB-MelGAN | vocoder | 💾 |
🌍 Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | 💾 |
🌍 Multi-Lang | LibriTTS | WaveGrad | vocoder | 💾 |
Notes
- Multi-Lang vocoder models are intended for non-English models.
- Vocoder models are trained independently of the tts models, possibly with different sampling rates. Therefore, the performance is not optimal.
- All models are trained with phonemes generated by the espeak backend (not espeak-ng).
- This release has been tested under Python 3.6, 3.7, and 3.8. It is strongly suggested to use `conda` to install the dependencies and set up the environment.
Edit (22.03.2021): The Fullband Universal Vocoder has been corrected with the right model files. Previously, we released the wrong model under that name.