Reflow failure #17
Comments
Hi, most of the hyperparameters are specified in Can you specify which dataset you are using when the flow fails to rectify?
I'm also working on the LJSpeech dataset, but I'm trying to implement the algorithm on top of Grad-TTS's framework and data-preprocessing code, and I use the original sampling rate of 22.05 kHz. Unexpectedly, the model collapsed after about 50 epochs of rectification. Did you keep generating multiple noise-mel pairs for each utterance during rectification?
This sounds weird to me, as I have never experienced such problems before.
No, I just generated a new dataset of the same size, with one generated utterance per sentence. Those generated samples are kept fixed (as if they were an off-the-shelf dataset). No re-generation was performed.
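A minimal sketch of the fixed-dataset procedure described in the reply above, not the repository's actual code. Everything named here is hypothetical: `velocity_fn` stands in for the trained flow model, `toy_velocity` is a placeholder so the sketch runs, text conditioning is omitted, and the convention that t=0 is noise and t=1 is the mel-spectrogram is an assumption. The point it illustrates is that each sentence gets exactly one (noise, generated mel) pair, produced once and then reused unchanged during reflow training.

```python
import numpy as np

def euler_sample(velocity_fn, noise, n_steps=10):
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 with Euler steps."""
    x = noise.copy()
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t)
    return x

def build_reflow_pairs(velocity_fn, shapes, seed=0, n_steps=10):
    """Generate one (noise, generated mel) pair per sentence, once.

    The returned list is meant to be stored and treated as a fixed dataset;
    nothing is re-generated during reflow training.
    """
    rng = np.random.default_rng(seed)
    pairs = []
    for shape in shapes:
        noise = rng.standard_normal(shape)
        mel = euler_sample(velocity_fn, noise, n_steps)
        pairs.append((noise, mel))
    return pairs

def reflow_loss(velocity_fn, noise, mel, t):
    """Rectified-flow regression loss on a stored pair at time t.

    The model is asked to predict the straight-line direction (mel - noise)
    at the interpolated point x_t = (1 - t) * noise + t * mel.
    """
    x_t = (1.0 - t) * noise + t * mel
    target = mel - noise
    return float(np.mean((velocity_fn(x_t, t) - target) ** 2))

# Hypothetical stand-in for a trained model, only so the sketch runs.
def toy_velocity(x, t):
    return -x

if __name__ == "__main__":
    # Three "sentences" with 80 mel bins and different lengths (made-up shapes).
    shapes = [(80, 5), (80, 7), (80, 3)]
    pairs = build_reflow_pairs(toy_velocity, shapes)
    noise, mel = pairs[0]
    print(len(pairs), reflow_loss(toy_velocity, noise, mel, t=0.5))
```

In a real setup the pairs would be written to disk once, and each training step would sample `t` uniformly in [0, 1] for a stored pair; re-sampling noise per epoch would be a different scheme from the one described in the reply.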
Several weeks ago I figured out that there was a bug in the inference pipeline of my code. Many thanks for your patient answers!
I've been trying to reproduce your work, especially the rectified flow part. However, the reflow procedure always results in poorer synthesis quality (even with a small number of sampling steps). I'm wondering if you could provide some of the hyperparameters used in the reflow procedure, such as the number of training epochs and the EMA decay rate?