
dropout behavior? #8

Open
Yuliang-Zou opened this issue Mar 19, 2017 · 3 comments

Yuliang-Zou commented Mar 19, 2017

Hi, I'm replicating the code in PyTorch.

However, I'm not sure about the dropout behavior here. It seems that you apply dropout after every layer of the discriminator (i.e. conv -> dropout -> bn -> dropout -> leaky relu -> dropout, etc.). Is that correct?

Also, do you apply any preprocessing to the input data? It seems that you rescale it to [0, 1]?

Thanks!

@edgarriba

@Yuliang-Zou I also have an implementation in PyTorch. However, I still haven't been able to get it training: the reconstructed images are pure noise. Please take a look; maybe you'll spot something.

https://github.com/edgarriba/ali-pytorch

ping @vdumoulin

@vdumoulin
Collaborator

@Yuliang-Zou @edgarriba Sorry about the delay! The way dropout is added to the network is indeed a bit hard to parse.

This block of code first retrieves the symbolic variables for the inputs of the layers identified in the list (lines 126-129) and then applies dropout to them via graph replacement (line 131).

The layers in the list correspond to

  • the first convolutional layer of the x discriminator subnetwork (ali.discriminator.x_discriminator.layers[0]),
  • all subsequent convolutional layers of the x discriminator subnetwork (ali.discriminator.x_discriminator.layers[2::3]),
  • all convolutional layers of the z discriminator subnetwork (ali.discriminator.z_discriminator.layers[::2]), and
  • all convolutional layers of the joint discriminator (ali.discriminator.joint_discriminator.layers[::2]).

To express it more compactly, dropout is applied to the input of all convolutions in the discriminator.
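In PyTorch terms, that placement can be sketched roughly like this (a minimal sketch: the channel counts, kernel sizes, and leaky-ReLU slope are illustrative, not the exact ALI architecture):

```python
import torch
import torch.nn as nn

# Sketch of the dropout placement described above: dropout is applied to
# the *input* of every convolution in the discriminator, i.e.
# dropout -> conv -> (bn) -> leaky relu, repeated.
class XDiscriminator(nn.Module):
    def __init__(self, drop_prob=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Dropout2d(drop_prob),            # dropout on the raw input
            nn.Conv2d(3, 64, kernel_size=4, stride=2),
            nn.LeakyReLU(0.02),
            nn.Dropout2d(drop_prob),            # dropout before the next conv
            nn.Conv2d(64, 128, kernel_size=4, stride=2),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.02),
        )

    def forward(self, x):
        return self.net(x)

disc = XDiscriminator()
out = disc(torch.randn(8, 3, 32, 32))  # shape: (8, 128, 6, 6)
```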

Another thing that may be confusing is that the 0.2 value in the call to apply_dropout refers to the dropout probability, not the keep probability (a quick look tells me this is consistent with the PyTorch interface, but it's still worth keeping in mind).
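A quick way to check that convention in PyTorch itself (this snippet only demonstrates nn.Dropout's behavior; it isn't from this repo):

```python
import torch
import torch.nn as nn

# In PyTorch, the argument p to nn.Dropout is the probability of *zeroing*
# a unit, not the probability of keeping it.
drop = nn.Dropout(p=0.2)
drop.train()  # dropout is only active in training mode

x = torch.ones(100000)
y = drop(x)

# About 20% of the units are zeroed; the survivors are rescaled by
# 1 / (1 - p) = 1.25 so the expected activation is unchanged.
frac_zeroed = (y == 0).float().mean().item()
```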

I hope this clears things up! Please don't hesitate to reply with further questions if you're still having trouble.

@edgarriba

@vdumoulin Yep, PyTorch uses the same convention.
