Reproducing the new StyleTTS2 demo #229

Open
notnodnod opened this issue Dec 8, 2024 · 1 comment

notnodnod commented Dec 8, 2024

Hi! I saw you added StyleTTS2 support yesterday, awesome! The demo in the readme looks really impressive! 😄

I would love to get that running locally as it seems much better for my use case than XTTSv2. I see you've committed the code for the demo here: https://github.com/KoljaB/RealtimeTTS/blob/master/tests/style_test.py

Could you please share how you got that script running? I've searched around on Google but can't find the "Nicole" model config, checkpoint, and reference audio anywhere. The original StyleTTS2 repo does have two different pretrained models, but they're split up into LJSpeech and LibriTTS. Is one of these suitable for use with RealtimeTTS as well?

Lastly, the readme doesn't specifically mention StyleTTS2 under Custom Installation or Engine Requirements, so am I right in thinking that realtimetts[core]>=0.4.19 is all that's needed?

Would highly appreciate any guidance you can offer! 🙏

Thanks again!

Owner

KoljaB commented Dec 8, 2024

Hey there,

I'll share a detailed explanation of how I set up the demo soon; I just have too much to do right now. I haven’t uploaded the Nicole checkpoint to Hugging Face yet, but that’s coming soon too. As for the LJSpeech and LibriTTS models, they aren’t the best options. You’ll likely get much better results by training your own model. Here’s an excellent tutorial on fine-tuning StyleTTS2: YouTube Tutorial.

pip install "realtimetts>=0.4.19" is all you need on the RealtimeTTS side (the quotes keep the shell from treating > as a redirect). You’ll also need to grab the StyleTTS2 repo and install all of its dependencies (especially espeak-ng). If you’re on Windows, you’ll also need a compatible phonemizer; I think I used pip install espeak_phonemizer_windows for that.

Then set these two environment variables:

  • PHONEMIZER_ESPEAK_PATH should point to the .exe file.
  • PHONEMIZER_ESPEAK_LIBRARY should point to the libespeak-ng.dll.

Both should be full (absolute) paths.
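For anyone who prefers to set these from Python rather than in the system settings, here is a minimal sketch. The helper name configure_espeak and the install paths mentioned below are assumptions for illustration, not part of RealtimeTTS or StyleTTS2; only the two variable names come from the comment above.

```python
import os
from pathlib import Path


def configure_espeak(exe_path, library_path, env=None):
    """Hypothetical helper: validate both paths, then set the two variables.

    Raises ValueError for a relative path and FileNotFoundError for a
    missing file, so a typo fails here instead of deep inside the phonemizer.
    """
    env = os.environ if env is None else env
    resolved = {}
    for name, raw in (
        ("PHONEMIZER_ESPEAK_PATH", exe_path),
        ("PHONEMIZER_ESPEAK_LIBRARY", library_path),
    ):
        path = Path(raw)
        if not path.is_absolute():
            raise ValueError(f"{name} must be a full path, got: {raw}")
        if not path.is_file():
            raise FileNotFoundError(f"{name} points to a missing file: {raw}")
        env[name] = str(path)
        resolved[name] = str(path)
    return resolved
```

To be safe, call it before anything imports the phonemizer, for example configure_espeak(r"C:\Program Files\eSpeak NG\espeak-ng.exe", r"C:\Program Files\eSpeak NG\libespeak-ng.dll"). Those two paths are only a guess at a typical install location; adjust them to wherever espeak-ng lives on your machine.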

Hope this helps for now! 😊
