Result still contains a little noise, but the loss does not decrease #14

Hi, I used your method to train on my own dataset for 1000k iterations. The result sounds stable and has only a little background noise, but the loss stays around 2.6 and the noise did not disappear after another 1000k steps. I have tried reducing the batch size to 2 and the learning rate to 5e-5, but it doesn't help. How can I deal with this?
samples.zip

Comments
Hi @Approximetal, quick question: are those samples from out-of-domain speakers, or were those speakers in the training data?
They are in the training dataset.
Hmmm, how large is the dataset (in hours)? Also, how much data is there for each speaker? I'm not sure how much impact dataset size has, but it may be a good first thing to investigate. Otherwise, a good option might be to try finetuning the pretrained model on your dataset?
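For reference, a minimal sketch of what finetuning from a pretrained checkpoint might look like in PyTorch; the checkpoint path, the `Vocoder` class, and the learning rate are assumptions for illustration, not this repo's actual API.

```python
import torch

from model import Vocoder  # hypothetical module and class, for illustration only

# Load the pretrained weights instead of starting from random initialization.
model = Vocoder()
checkpoint = torch.load("pretrained.pt", map_location="cpu")
model.load_state_dict(checkpoint["model"])

# Finetune with a lower learning rate than from-scratch training so the
# pretrained weights are adapted gently rather than overwritten.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
```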
I trained on four datasets together: VCTK-Corpus, VCC2020, a Chinese dataset of about 24 hours, and a multilingual audiobook dataset of about 150 hours.
Sorry about the delay @Approximetal. That's a lot of data, so I don't think finetuning is necessary. Since you have a big dataset, my guess is that the large variation in speakers, recording conditions, and background noise causes the output distribution over next audio samples to be flatter. Sampling from that distribution could then introduce some noise. I'm not sure how to get rid of the noise completely. One option is to try a larger model; if you have time to experiment, you could try increasing the size of some of the layers. If you try any of these ideas or get better results some other way, please let me know.
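To illustrate the point about a flatter distribution (a toy example, not code from this repo): sampling the next quantized audio value from a flat categorical distribution jumps between more levels than sampling from a sharp one, which is audible as noise.

```python
import torch

# Pretend logits over 256 mu-law classes for the next audio sample.
logits = torch.randn(256)

# Dividing logits by a temperature < 1 sharpens the distribution;
# dividing by a temperature > 1 flattens it.
sharp = torch.distributions.Categorical(logits=logits / 0.5).sample((1000,))
flat = torch.distributions.Categorical(logits=logits / 2.0).sample((1000,))

# The flat distribution spreads probability over more classes, so the
# sampled levels vary more from step to step -- audible as hiss.
print(sharp.float().std().item(), flat.float().std().item())
```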
Thank you for your advice, I will try it later and update you with my results. Actually, I already tried increasing the number of layers and the width of the network, but the loss didn't decrease; I haven't found the reason.
Thanks for the update @Approximetal. My guess is that adding more layers might result in vanishing gradients. Just changing the width of the layers might work, though.
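As a concrete illustration of widening rather than deepening (the sizes here are made up, not the repo's defaults):

```python
import torch.nn as nn

# Stacking more recurrent layers deepens the computation graph and can
# make gradient flow worse:
deeper = nn.GRU(input_size=80, hidden_size=512, num_layers=3)

# Widening keeps the depth fixed while still adding capacity:
wider = nn.GRU(input_size=80, hidden_size=1024, num_layers=1)
```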
Sorry, I made a mistake. I meant that when I comment out the padding in the inference/generation step, the noise disappears. BTW, do I only need to adjust …? I tried two parameters:
…
and
…
But the loss didn't decrease at all… @bshall
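For context, a hedged sketch of the kind of generation-time padding being discussed; the tensor shapes and pad width are illustrative, not the repo's exact code.

```python
import torch
import torch.nn.functional as F

# Conditioning features (e.g. a mel spectrogram) are often zero-padded at
# the edges before generation so the upsampling convolutions see full context.
mel = torch.randn(1, 80, 100)  # (batch, mel_channels, frames), illustrative
pad = 2
padded = F.pad(mel, (pad, pad))  # zeros at both ends of the time axis

# Removing this padding, as described above, generates from the raw frames;
# the zero frames can otherwise condition the first and last samples on
# silence, which may surface as noise at the boundaries.
```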
Hi @Approximetal, that's weird. I'll look into those parameters and get back to you shortly. The changes you made are correct, so I'm not sure what the problem is.