lf0 question about convert phase #34
Comments
Hi, normalizing f0 aims to remove the speaker characteristics. During the preprocessing phase, f0 is not normalized, but during training and inference it is normalized; see Line 53 and Line 57 in 851b4f5.
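For readers without the repository open, here is a minimal sketch of this kind of per-utterance normalization. It is an illustration, not the code at the referenced lines: the function name `normalize_lf0`, the `eps` parameter, and the convention that unvoiced frames are marked by `f0 == 0` are all assumptions.

```python
import numpy as np

def normalize_lf0(f0, eps=1e-8):
    """Hypothetical helper: per-utterance mean/variance normalization of log-f0.

    Assumes unvoiced frames are marked with f0 == 0 and leaves them at 0.
    """
    f0 = np.asarray(f0, dtype=np.float64)
    lf0 = f0.copy()
    voiced = f0 > 0
    if not voiced.any():
        # Nothing to normalize in a fully unvoiced utterance.
        return lf0
    # Take the log only on voiced frames, then standardize them.
    lf0[voiced] = np.log(f0[voiced])
    mean = lf0[voiced].mean()
    std = lf0[voiced].std()
    lf0[voiced] = (lf0[voiced] - mean) / (std + eps)
    return lf0
```

Because the statistics are computed per utterance, scaling the whole f0 contour by a constant (a rough stand-in for a higher or lower speaker register) leaves the normalized contour unchanged, which is the sense in which speaker characteristics are removed.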
The perplexity should increase during training, since higher perplexity indicates that the vectors in the VQ codebook are distinguishable and can be used to represent different acoustic units. I also saw that your recon_loss is high. Based on my experience, recon_loss should be less than 0.5 before you obtain good converted samples.
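As a rough illustration of what the perplexity measures, the sketch below computes it from a batch of codebook assignments as the exponential of the entropy of the code usage distribution. The function name `codebook_perplexity` and its arguments are assumptions for illustration; the repository's own computation may differ in detail.

```python
import numpy as np

def codebook_perplexity(indices, codebook_size, eps=1e-10):
    """Hypothetical helper: perplexity of VQ code usage.

    `indices` holds the integer codebook index chosen for each frame.
    """
    counts = np.bincount(np.asarray(indices).ravel(), minlength=codebook_size)
    probs = counts / counts.sum()
    # Entropy of the empirical usage distribution; eps guards log(0).
    entropy = -np.sum(probs * np.log(probs + eps))
    return float(np.exp(entropy))
```

Under this definition the value ranges from about 1, when every frame maps to the same code, up to `codebook_size`, when all codes are used equally often, so a rising perplexity means more of the codebook is in use.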
Hi,
I wonder why you normalize the f0 series before feeding it to the f0encoder in convert.py.
This kind of f0 normalization isn't applied in the preprocessing phase, though.