From de946cc184456e73dcd4ad869dbbe49538b06bb3 Mon Sep 17 00:00:00 2001
From: Dapwner <46859435+Dapwner@users.noreply.github.com>
Date: Wed, 5 Jun 2024 15:52:49 +0800
Subject: [PATCH] Update README.md

---
 README.md | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 7c900cf..f6af4e6 100644
--- a/README.md
+++ b/README.md
@@ -10,12 +10,20 @@ Samples link: https://amaai-lab.github.io/Accented-TTS-MLVAE-ADV/
 This code is built upon Comprehensive-TTS: https://github.com/keonlee9420/Comprehensive-Transformer-TTS
 
 ## Training
-First download your dataset and preprocess the audio data into mel spectrogram `.npy` arrays with the `preprocess.py script`. We used L2CMU in this paper, which stands for a combination of L2Arctic (24 speakers) and CMUArctic (4 speakers). Then run ``CUDA_VISIBLE_DEVICES=X python train.py --dataset L2CMU``
+First download your dataset (L2Arctic and CMUArctic) and preprocess the audio data into mel spectrogram `.npy` arrays with the `preprocess.py` script. We used L2CMU in this paper, which stands for a combination of L2Arctic (24 speakers) and CMUArctic (4 speakers). Then run ``CUDA_VISIBLE_DEVICES=X python train.py --dataset L2CMU``
 
 ## Inference
 Once trained, you can run `extract_stats.py` to retrieve the accent and speaker embeddings of your evaluation set and store them. Then, you can synthesize with one of the synth scripts. :-)
 
-Once trained, you can run ``CUDA_VISIBLE_DEVICES=X python synthesize.py --dataset L2Arctic --restore_step [N] --mode [batch/single] --text [TXT] --speaker_id [SPID] --accent [ACC]``
+Once trained, to generate (accent-converted or non-converted) speech, you can run
+```bash
+CUDA_VISIBLE_DEVICES=X python synthesize.py --dataset L2Arctic --restore_step [N] --mode [batch/single] --text [TXT] --speaker_id [SPID] --accent [ACC]
+```
+SPID = ABA, ASI, NCC, ... (speaker IDs from the L2Arctic dataset)
+
+ACC = Arabic, Chinese, Hindi, Korean, Spanish, Vietnamese (accents from L2Arctic)
+
+Unfortunately, we do not provide a trained model at this time.
 
 ## BibTeX citation
 ```
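
The patched README describes a four-step workflow (preprocess, train, extract embeddings, synthesize). Below is a minimal sketch stringing those steps together: the `train.py` and `synthesize.py` invocations follow the README, while the `preprocess.py` and `extract_stats.py` flags and the concrete checkpoint, speaker, and accent values are assumptions for illustration, not documented options.

```bash
# Minimal end-to-end sketch. Only the train.py and synthesize.py commands come
# from the README; the preprocess.py / extract_stats.py flags and the concrete
# step, speaker, and accent values below are assumed examples.
CUDA_VISIBLE_DEVICES=0 python preprocess.py --dataset L2CMU      # mel-spectrogram .npy extraction (flags assumed)
CUDA_VISIBLE_DEVICES=0 python train.py --dataset L2CMU           # training, as in the README
CUDA_VISIBLE_DEVICES=0 python extract_stats.py --dataset L2CMU   # store accent/speaker embeddings (flags assumed)
CUDA_VISIBLE_DEVICES=0 python synthesize.py --dataset L2Arctic \
    --restore_step 100000 --mode single --text "Hello world" \
    --speaker_id ABA --accent Hindi                              # ABA / Hindi are example values listed in the README
```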