How can I synthesize my own text to speech? #11
Comments
I was wondering this too. I've successfully trained it and got good samples, but how do I do TTS using this output?
Neural text-to-speech is usually done in two steps: feature prediction and voice synthesis. MelGAN is a synthesizer (a vocoder), so to go from text to speech you need to combine it with a model that converts text into mel-spectrograms. One such model is Tacotron 2; have a look at: https://github.com/NVIDIA/tacotron2
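To make the two-step structure concrete, here is a minimal sketch of how the pieces fit together. It is not code from this repository or from Tacotron 2: `synthesize`, `dummy_text_to_mel`, and `dummy_mel_to_wav` are hypothetical names, and the values of `N_MELS` and `HOP_LENGTH` are assumptions that you should replace with whatever your trained models actually use. The dummy functions only stand in for a real feature predictor and a real vocoder so that the sketch runs on its own.

```python
from typing import Callable, List

Mel = List[List[float]]  # one inner list of mel bins per frame
Wave = List[float]       # raw audio samples

# Assumed settings; both models must agree on these values.
N_MELS = 80
HOP_LENGTH = 256


def synthesize(
    text: str,
    text_to_mel: Callable[[str], Mel],
    mel_to_wav: Callable[[Mel], Wave],
) -> Wave:
    """Run feature prediction, then voice synthesis."""
    mel = text_to_mel(text)  # step 1: text -> mel-spectrogram
    # Catch the most obvious mismatch between the two models early.
    if any(len(frame) != N_MELS for frame in mel):
        raise ValueError(f"vocoder expects {N_MELS} mel bins per frame")
    return mel_to_wav(mel)   # step 2: mel-spectrogram -> waveform


def dummy_text_to_mel(text: str) -> Mel:
    """Hypothetical stand-in for a Tacotron-2-style model: one frame per character."""
    return [[0.0] * N_MELS for _ in text]


def dummy_mel_to_wav(mel: Mel) -> Wave:
    """Hypothetical stand-in for a MelGAN-style vocoder: HOP_LENGTH samples per frame."""
    return [0.0] * (len(mel) * HOP_LENGTH)


if __name__ == "__main__":
    wav = synthesize("hello", dummy_text_to_mel, dummy_mel_to_wav)
    print(len(wav))  # 5 frames * 256 samples per frame = 1280
```

In a real setup you would swap the two dummy functions for wrappers around your trained models. The frame check is there because the feature predictor has to produce mel-spectrograms in the same format (number of mel bins, hop length, sample rate, normalization) that the vocoder was trained on.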
Hi @ViktorIgeland,
Hi @Wenqikry,
@ViktorIgeland,
So how can we use MelNet with the same performance, i.e. how can we reproduce the results of the paper? Do you know if this is possible? And can it then be extended to custom audio files? Do we have any information on how these mel-scale spectrograms are generated, something we can reproduce and use in MelNet?
@Wenqikry did you figure out a good way to produce mel spectrograms?
@casperbh96 Sorry, I haven't found one yet.
@Wenqikry Have you tried https://github.com/Rayhane-mamah/Tacotron-2 or https://github.com/NVIDIA/tacotron2 to train a model that produces log-mels, and then combining it with MelGAN?
@Liujingxiu23 Sorry, I haven't tried...
I trained the model well on a dataset. Can anyone help?
@Mariaa98 If you figure it out, let me know. I have tried with 3 different data scientists and none of them could get a functional TTS script from this. We ended up going with a different model.
I have got some results with Tacotron2 and MelGAN. I can make out what the wav says, but it's not as good as the demos.