What do z_dim and c_dim stand for? #36
Hi, all three of these variables belong to the content encoder. z_dim is the dimension of each acoustic unit (z) in the VQ codebook. c_dim is the dimension of the continuous vectors produced by the LSTM (the g-net in the paper), which takes z as input. n_embeddings is the number of acoustic units in the VQ codebook.
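To make the shapes concrete, here is a minimal NumPy sketch of how the three values fit together. It is only an illustration under stated assumptions, not the repository's code: the codebook and inputs are random, the nearest-neighbour lookup is plain vector quantization, and a random linear map stands in for the LSTM purely to show the output dimension. The names `quantize`, `codebook`, `z_e` and `W` are hypothetical.

```python
import numpy as np

# Values quoted in this issue from config/model/default.
N_EMBEDDINGS = 512  # number of acoustic units in the VQ codebook
Z_DIM = 64          # dimension of each acoustic unit z
C_DIM = 256         # dimension of the continuous vectors c after the LSTM


def quantize(z_e, codebook):
    """Replace each frame vector with its nearest codebook entry."""
    # Squared Euclidean distance between every frame and every entry:
    # result has shape (n_frames, N_EMBEDDINGS).
    dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)
    return codebook[indices], indices


rng = np.random.default_rng(0)
codebook = rng.normal(size=(N_EMBEDDINGS, Z_DIM))  # hypothetical random codebook
z_e = rng.normal(size=(10, Z_DIM))                 # 10 hypothetical encoder frames

z_q, indices = quantize(z_e, codebook)

# Assumption: a random linear map stands in for the LSTM (g-net),
# only to show that z (Z_DIM) is turned into c (C_DIM) per frame.
W = rng.normal(size=(Z_DIM, C_DIM))
c = z_q @ W

print(z_q.shape, indices.shape, c.shape)
```

So the codebook is an `n_embeddings x z_dim` table, each frame is assigned one of its rows, and the recurrent network then maps those `z_dim`-sized vectors to `c_dim`-sized ones.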
Thank you!
In model_encoder.py, in class Encoder(nn.Module), def forward(self, mels): what does 128 mean? Which variable does it represent?
128 is the number of mel-spectrogram frames in each training segment; it corresponds to 1.28 s of waveform.
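As a quick check of that arithmetic, 128 frames covering 1.28 s implies 10 ms of audio per frame. The sketch below just restates this; the 10 ms hop is inferred from the two numbers in the answer, not read from the repository's config, and `frames_to_seconds` is a hypothetical helper.

```python
def frames_to_seconds(n_frames, hop_seconds):
    """Duration of audio covered by n_frames mel frames at a given hop."""
    return n_frames * hop_seconds


N_FRAMES = 128       # frames per training segment, from the answer above
HOP_SECONDS = 0.01   # assumption: inferred from 1.28 s / 128 frames

duration = frames_to_seconds(N_FRAMES, HOP_SECONDS)
print(f"{duration:.2f} s")
```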
Dear PHD:
Could you tell me what z_dim: 64 and c_dim: 256 in config/model/default stand for? And what does n_embeddings: 512 in config/model/default stand for? Thank you very much.