Hello,
It takes 25 seconds to generate three seconds of audio (sample rate 22050, about 15 words). Do you have any good ideas for performance optimization? We can discuss it. Thank you.
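For reference, the numbers above can be turned into a real-time factor, which makes them easier to compare with other reports. This is only a small sketch of the arithmetic; the function names are made up for illustration and are not part of the project.

```python
def real_time_factor(audio_seconds, wall_seconds):
    """Seconds of audio produced per second of wall-clock time."""
    return audio_seconds / wall_seconds


def samples_per_second(audio_seconds, sample_rate, wall_seconds):
    """Raw sampling throughput of the vocoder."""
    return audio_seconds * sample_rate / wall_seconds


# Figures reported in this issue: 3 s of audio at 22050 Hz in 25 s.
rtf = real_time_factor(3.0, 25.0)
throughput = samples_per_second(3.0, 22050, 25.0)

print(f"real-time factor: {rtf:.3f}x")
print(f"throughput: {throughput:.0f} samples/s")
```

A factor below 1.0 means generation is slower than playback, so real-time synthesis at this sample rate would need roughly 22050 samples per second.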
Yeah I get about 0.225x real-time with the 16kHz model. There are a number of tricks you can try to get improved speeds. You could probably apply most of the optimizations from the WaveRNN paper. Specifically, you'd need to implement:
a single persistent GPU operation for sampling.
structured sparsity.
subscale sampling.
Unfortunately, I don't have much time to work on these optimizations but I'd be happy to accept and review any pull requests if you're interested in working on it.
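To make the structured sparsity item a bit more concrete, here is a rough sketch of block-wise magnitude pruning, which is the general idea behind it: weights are zeroed in whole blocks rather than one at a time, so the surviving weights stay in a layout that a sparse kernel can exploit. Everything here is an assumption for illustration: the function names, the 2x2 block size, and the 50% sparsity target are hypothetical, and the code uses plain Python lists rather than this repository's tensors.

```python
def block_sparsity_mask(weights, block_rows=4, block_cols=4, sparsity=0.9):
    """Return a 0/1 mask that zeroes the lowest-magnitude blocks.

    weights: list of equal-length rows (a dense 2-D matrix).
    sparsity: fraction of blocks to prune (hypothetical default).
    """
    rows, cols = len(weights), len(weights[0])
    if rows % block_rows or cols % block_cols:
        raise ValueError("matrix shape must be a multiple of the block shape")

    # Score every block by the mean absolute value of its weights.
    scores = {}
    for bi in range(rows // block_rows):
        for bj in range(cols // block_cols):
            total = sum(
                abs(weights[r][c])
                for r in range(bi * block_rows, (bi + 1) * block_rows)
                for c in range(bj * block_cols, (bj + 1) * block_cols)
            )
            scores[(bi, bj)] = total / (block_rows * block_cols)

    # Prune the blocks with the smallest scores.
    n_prune = int(round(sparsity * len(scores)))
    pruned = set(sorted(scores, key=scores.get)[:n_prune])

    return [
        [0 if (r // block_rows, c // block_cols) in pruned else 1
         for c in range(cols)]
        for r in range(rows)
    ]


def apply_mask(weights, mask):
    """Element-wise product of a weight matrix and its mask."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]


# Toy 4x4 matrix with two strong 2x2 blocks on the diagonal.
toy = [
    [0.90, 0.80, 0.01, 0.02],
    [0.70, 0.60, 0.03, 0.01],
    [0.02, 0.01, 0.50, 0.40],
    [0.01, 0.03, 0.60, 0.70],
]
toy_mask = block_sparsity_mask(toy, block_rows=2, block_cols=2, sparsity=0.5)
toy_pruned = apply_mask(toy, toy_mask)
```

Note that masking by itself does not make inference faster. The speedup only appears once the pruned matrix is run through a kernel that skips the zeroed blocks, and the model usually needs to be trained or fine-tuned with the mask in place to keep audio quality.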