XiaoiceSing2

Source code for the paper XiaoiceSing2 (Interspeech 2023).

Demo page

Notice

I am currently busy with job hunting. I will update the other modules, including HiFi-WaveGAN, after I have made my final decision.

Implementation (in development)

FastSpeech2-based generator
discriminator group, including segment discriminators and detail discriminators
ConvFFT block

Dataset and preparation

Kaldi style preparation

wav.scp
utt2spk
spk2utt
text
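
In the usual Kaldi convention, each of these files is plain text with one entry per line: `wav.scp` maps an utterance ID to an audio path, `utt2spk` maps an utterance ID to a speaker ID, `spk2utt` lists the utterances of each speaker, and `text` maps an utterance ID to its transcript. As a minimal sketch, assuming that convention holds here (the exact content this repository expects in `text` is not specified), `spk2utt` could be derived from `utt2spk` as follows. The function names and sample IDs are hypothetical.

```python
from collections import defaultdict


def parse_utt2spk(lines):
    """Parse Kaldi-style 'utt_id spk_id' lines into a dict."""
    utt2spk = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        utt_id, spk_id = line.split(maxsplit=1)
        utt2spk[utt_id] = spk_id
    return utt2spk


def build_spk2utt(utt2spk):
    """Invert the mapping: speaker ID -> sorted list of utterance IDs."""
    spk2utt = defaultdict(list)
    for utt_id, spk_id in utt2spk.items():
        spk2utt[spk_id].append(utt_id)
    return {spk: sorted(utts) for spk, utts in sorted(spk2utt.items())}


def format_spk2utt(spk2utt):
    """Render 'spk_id utt1 utt2 ...' lines, one speaker per line."""
    return [f"{spk} {' '.join(utts)}" for spk, utts in spk2utt.items()]


# Hypothetical sample entries, for illustration only.
example = ["spk1_song1 spk1", "spk1_song2 spk1", "spk2_song1 spk2"]
spk2utt_lines = format_spk2utt(build_spk2utt(parse_utt2spk(example)))
```

Writing `spk2utt_lines` to a file named `spk2utt`, one line each, would give a file consistent with the `utt2spk` input.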

./run.sh --start-stage 1 --stop-stage 1 # extract melspectrogram, f0, energy, and their statistics
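
Statistics of this kind are commonly used to normalize features before training. The following is a minimal sketch of that idea, assuming the stored statistics are a per-feature mean and standard deviation; what `run.sh` actually computes is not described here, and the function names and sample values are hypothetical.

```python
import math


def mean_std(values):
    """Return the mean and population standard deviation of a sequence."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(var)


def normalize(values, mean, std, eps=1e-8):
    """Shift and scale values to roughly zero mean and unit variance."""
    return [(v - mean) / (std + eps) for v in values]


# Hypothetical frame-level energy values pooled over the training set.
energy = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu, sigma = mean_std(energy)
normalized = normalize(energy, mu, sigma)
```

The same two functions could be applied to f0 or to each mel bin separately; the small `eps` only guards against division by zero for constant features.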

Training

./run.sh --start-stage 2 --stop-stage 2

Real and generated melspectrogram (145600 training steps)

Real (left), XiaoiceSing (middle), XiaoiceSing2 (right)

L2 loss curve for melspectrogram

L2 loss before post-processing (left), L2 loss after post-processing (right)
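
For reference, an L2 (mean squared error) loss between a generated and a real melspectrogram can be sketched as below. This is an illustration only: it assumes both curves measure the same loss on the generator output, once before and once after a post-processing stage, and `mel_l2_loss` is a hypothetical name, not a function from this repository.

```python
def mel_l2_loss(predicted, target):
    """Mean squared error over all time-frequency bins.

    Both arguments are lists of frames; each frame is a list of mel bins.
    """
    if len(predicted) != len(target):
        raise ValueError("frame counts differ")
    total, count = 0.0, 0
    for pred_frame, tgt_frame in zip(predicted, target):
        if len(pred_frame) != len(tgt_frame):
            raise ValueError("mel bin counts differ")
        for p, t in zip(pred_frame, tgt_frame):
            total += (p - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("empty spectrogram")
    return total / count


# Hypothetical 2-frame, 3-bin spectrograms, for illustration only.
real = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]]
before_post = [[0.5, 1.0, 2.0], [1.0, 2.0, 1.0]]
after_post = [[0.0, 1.0, 2.0], [1.0, 1.5, 1.0]]

loss_before = mel_l2_loss(before_post, real)
loss_after = mel_l2_loss(after_post, real)
```

In this toy example the post-processed output is closer to the target, so `loss_after` is smaller than `loss_before`.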

Inference

./run.sh --start-stage 3 --stop-stage 3

zengchang233/xiaoicesing2