There is something confusing in the paper, although overall it is very well written.
Section 2.1 (Generator architecture) states that additional noise is not fed to the generator, since it does not affect the perceptual quality of the result.
Section 2.3 (Training objective), eqs. (1) and (2), writes G(s, z), meaning that the generator processes both the spectrogram and a Gaussian noise vector.
If you do inject noise, where does that happen?
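To make the question concrete, here is a toy sketch of the two readings: G(s) as described in 2.1 versus G(s, z) as written in eqs. (1) and (2). This is not MelGAN's actual code; the function names, weights, and sizes are made up purely for illustration, and a single linear map stands in for the whole generator.

```python
import random

def generator_no_noise(s, weights):
    """G(s): toy deterministic map from one mel frame to one output sample."""
    return sum(w * x for w, x in zip(weights, s))

def generator_with_noise(s, z, weights_s, weights_z):
    """G(s, z): the same toy map plus a contribution from a noise vector z."""
    noise_term = sum(w * n for w, n in zip(weights_z, z))
    return generator_no_noise(s, weights_s) + noise_term

rng = random.Random(0)          # fixed seed so the sketch is reproducible

s = [0.1, 0.5, -0.3]            # hypothetical 3-bin mel frame
w_s = [1.0, -2.0, 0.5]          # hypothetical weights on the spectrogram
w_z = [0.3, 0.3]                # hypothetical weights on the noise

# Reading 1 (Section 2.1): same s always gives the same output.
out_a = generator_no_noise(s, w_s)
out_b = generator_no_noise(s, w_s)

# Reading 2 (eqs. 1 and 2): same s, different z, different outputs.
z1 = [rng.gauss(0.0, 1.0) for _ in w_z]
z2 = [rng.gauss(0.0, 1.0) for _ in w_z]
out_c = generator_with_noise(s, z1, w_s, w_z)
out_d = generator_with_noise(s, z2, w_s, w_z)

print("G(s) twice:", out_a, out_b)
print("G(s, z1), G(s, z2):", out_c, out_d)
```

Under the first reading, the output is a deterministic function of the spectrogram. Under the second, it also depends on the sampled z, which is why I am asking where z would enter the network.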
This brings me to another question. I read GAN-TTS: they inject noise when conditioning the hidden activations on the speaker embedding, and they use a conditional discriminator to ensure that the audio is both realistic and accurately conditioned.
If MelGAN has neither noise nor a conditional discriminator, how do you assess that the generator is learning and generalizing for conditional generation?