getting a list of all speakers and the P#? #206

vastator69 · 2024-02-06T19:23:11Z

Is there a way to list all the possible speaker P#s and maybe hear what they sound like? I tried the utility that creates a bunch of sample WAVs with names, but when I try to use a speaker name it never works. Using a P# works every time, but I wanted a list of what is possible. I looked around but did not find any docs on this, or most likely I overlooked them. TIA

aedocw · 2024-02-06T20:58:22Z

Maintainer

This is a great question. You did not miss anything, and it's not at all obvious unless you get into the docs for Coqui TTS. In fact I think it's not even that clear how speakers and voice models are connected.

First, there are basically two models available. Technically you can use any model that works with Coqui TTS, but I only ever test/validate with the two I find to be the best: VITS and XTTSv2. epub2tts defaults to the model tts_models/en/vctk/vits if no XTTS-specific options are specified. In that case, the speaker names take the form p###.

You can find all available speaker IDs for VITS from the command line with tts --model_name "tts_models/en/vctk/vits" --list_speaker_idxs

I added the following script to the utils directory that will generate samples for vits speakers:

import torch
from TTS.api import TTS

# Use the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

tts = TTS("tts_models/en/vctk/vits").to(device)

# Keep only the p### speaker IDs.
speakers = [name for name in tts.speakers if name.startswith("p")]

# Write one sample WAV per speaker, named after its ID.
for speaker in speakers:
    text = f"My name is {speaker}. It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent."
    output = f"speaker-{speaker}.wav"
    print(f"Generating {output}")
    tts.tts_to_file(text=text, speaker=speaker, file_path=output)

To use the speakers that have actual names (these were Coqui Studio voices), you need to be using XTTS, and the command would look like epub2tts mybook.epub --engine xtts --speaker "Damien Black"

You can run the script generate-speaker-samples.py to get samples, or look at that script to see how they are generated.
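
The split between the two kinds of speaker names can be sketched in a few lines of Python. This is only an illustration, not part of epub2tts: `build_command` and `VCTK_ID` are made-up names, and it assumes that `--speaker` is also how a p### ID is passed when the default VITS model is used.

```python
import re
import shlex

# VCTK speaker IDs in this thread look like "p" followed by digits (e.g. p307).
VCTK_ID = re.compile(r"^p\d+$")


def build_command(book: str, speaker: str) -> list[str]:
    """Return an epub2tts argument list suited to the kind of speaker given."""
    cmd = ["epub2tts", book]
    if not VCTK_ID.match(speaker):
        # Named voices (the former Coqui Studio voices) need the XTTS engine.
        cmd += ["--engine", "xtts"]
    # Assumption: --speaker is also how a p### ID is passed to the default model.
    cmd += ["--speaker", speaker]
    return cmd


if __name__ == "__main__":
    for name in ("p307", "Damien Black"):
        print(shlex.join(build_command("mybook.epub", name)))
```

A named voice passed without `--engine xtts` would be looked up in the VITS model, which only knows p### IDs, and that would explain why the speaker names from the sample files did not work.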

Hope this helps, let me know if you still have questions or are running into any problems.

larry77 · 2024-02-22T21:25:55Z

Hello, and thanks for this wonderful project! I hope someone will fork and continue the development of TTS. I have just started to use it and I love it. One newbie question: how should the command

epub2tts mybook.txt --engine xtts --speaker "Damien Black" --cover cover-image.jpg --sayparts

be modified to handle text in Italian and in French? Thanks!

aedocw · 2024-02-22T22:12:50Z

Maintainer

You can just add --language it or --language fr to that line.

Be aware, though, that non-English languages do not always turn out great. A few bugs have been opened here about issues with other languages (#153, for instance). There is more work to be done on this, especially around limiting the length of the sentences that get sent for TTS.
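
As a small sketch of the `--language` suggestion, the snippet below takes the command from the question and appends the flag for Italian and French. `BASE` and `with_language` are illustrative names only, and nothing here checks which language codes epub2tts accepts.

```python
import shlex

# The command from the question above, split into arguments.
BASE = [
    "epub2tts", "mybook.txt",
    "--engine", "xtts",
    "--speaker", "Damien Black",
    "--cover", "cover-image.jpg",
    "--sayparts",
]


def with_language(args: list[str], code: str) -> list[str]:
    """Return a copy of args with --language <code> appended."""
    return [*args, "--language", code]


if __name__ == "__main__":
    for code in ("it", "fr"):
        # shlex.join quotes "Damien Black" so the line can be pasted into a shell.
        print(shlex.join(with_language(BASE, code)))
```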

1 reply

larry77 Feb 23, 2024

Thanks. I had a look at the discussion you mentioned. I have another question, and I can open a separate discussion if you want. The installation on Debian stable was not straightforward because of conflicts between the requirements of the packages installed with pip. How do I update epub2tts? Do I need to re-clone the repo, or is there a script for that somewhere?

aedocw · 2024-02-23T14:41:53Z

Maintainer

I have not tried an installation on Debian, but ideally, if the installation happens in a virtual environment, there shouldn't be any conflicts. I am planning to update the installation instructions to use pipenv, which should be a huge improvement in terms of ease of installation.

I also realize the README doesn't really say how to update, other than in the Windows instructions. I can't believe I missed that!

To update, activate your virtual environment, run git pull in the repo, and then run pip install . --upgrade. That will get you an updated version of epub2tts. I'll add an issue to make sure I add that to the instructions, thanks for pointing that out.
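
The update steps can also be scripted. The following is a minimal sketch and not a script that ships with epub2tts: the names `update_commands` and `update` are invented for illustration, and it assumes it is run with the virtual environment already active, from a machine where the repo has been cloned.

```python
import subprocess
import sys
from pathlib import Path


def update_commands() -> list[list[str]]:
    """Return the update commands in the order they should run."""
    return [
        ["git", "pull"],
        # When this script is run with the virtual environment active,
        # sys.executable is that environment's Python, so its pip is used.
        [sys.executable, "-m", "pip", "install", ".", "--upgrade"],
    ]


def update(repo_dir: str, dry_run: bool = True) -> None:
    """Print each command, and run it inside repo_dir unless dry_run is set."""
    repo = Path(repo_dir).expanduser()
    for cmd in update_commands():
        print("$", " ".join(cmd))
        if not dry_run:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, cwd=repo, check=True)
```

Calling `update("path/to/your/clone", dry_run=False)` would run both steps in the clone; the default `dry_run=True` only prints them, which is a safer way to check the commands first.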

2 replies

larry77 Feb 23, 2024

Much appreciated. If I may ask, please document step by step what needs to be done to manage the virtual environment for pip on Linux -- I always get into trouble there and I cannot be the only one. Keep up the good work.

aedocw Feb 23, 2024
Maintainer

Feel free to add any comments to #212 :)

caosegueda · 2024-05-20T03:59:24Z

I hope someone finds this useful: samples of the P# voices (VITS encoding trained on the VCTK dataset) appear to be available at this site. At least they sound the same to me :-).
https://aimodels.org/ai-models/text-to-speech-synthesis/english-tts-model-109-voices-vits-encoding-trained-on-vctk-dataset-at-22050hz/
