TensorFlow Implementation of SPSA seems to calculate gradient incorrectly #1087
Comments
Hey, thanks! Yes, the code should read as you've proposed; otherwise, the updates will move in the right direction, but by the wrong amount. Do you want to go ahead with making the change? The provided learning rate defaults should also be adjusted, so that default behavior is the same before and after the change. For backwards compatibility, I would add a new parameter for this. I'm also happy to fix it, though it will probably be a week until I can get to it.
Sure, I can go ahead and make the change. That sounds good about preserving backwards compatibility. Just to make sure I got the math right: since the corrected gradient estimate is larger by a factor of 1/delta, the new default learning rate should be the old default multiplied by delta, so that the default updates stay the same.
Great, thanks! Yes, that checks out to me.
@juesato I was working on fixing this issue today, but I realized that since SPSA is using Adam as an optimizer, it should be agnostic to the magnitude of the gradient. So we can't fix it by scaling the learning rate, but it shouldn't matter anyway. Does that sound right to you?
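For intuition on why Adam absorbs a constant misscaling: its step is lr * m_hat / (sqrt(v_hat) + eps), and rescaling every gradient by a positive constant c rescales both m_hat and sqrt(v_hat) by c, so the step is unchanged up to eps. A minimal standalone sketch (my own illustration, not code from the repository):

```python
import numpy as np

def adam_updates(grads, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Return the sequence of Adam steps for a 1-D stream of gradients."""
    m = v = 0.0
    steps = []
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1 - beta1) * g          # first-moment EMA
        v = beta2 * v + (1 - beta2) * g ** 2     # second-moment EMA
        m_hat = m / (1 - beta1 ** t)             # bias correction
        v_hat = v / (1 - beta2 ** t)
        steps.append(lr * m_hat / (np.sqrt(v_hat) + eps))
    return np.array(steps)

grads = np.array([0.5, -0.2, 0.3])
print(adam_updates(grads))         # some sequence of steps
print(adam_updates(100 * grads))   # nearly identical: the 100x cancels
```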
In the SPSAAdam class in cleverhans/attacks/spsa.py, the gradient estimate is computed from loss_vals and the perturbations delta_x (see the sketch below). I believe the last line should instead divide by an additional factor of delta, or equivalently use tf.sign(delta_x) in place of delta_x, to match the description in the paper. Without this change, the gradient estimate picks up an extra factor of delta, which defaults to 0.01, so the gradient is underestimated by a factor of 100. I confirmed this while working on the PyTorch implementation: after changing it to use sign(delta_x), it estimated the gradients correctly.
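For concreteness, here is a sketch of the line under discussion and the two proposed replacements, reconstructed from this thread rather than copied from the repository. Here delta_x stacks the +/- delta Rademacher perturbations, so tf.sign(delta_x) recovers the +/- 1 sampling directions:

```python
# Reconstructed sketch, not verbatim cleverhans code.
# Current form: since each entry of delta_x is delta * v with v in {-1, +1},
# this equals mean(loss_vals * v), i.e. the true SPSA estimate scaled by delta:
avg_grad = reduce_mean(loss_vals * delta_x, axis=0) / delta

# Proposed fix: divide by an additional factor of delta ...
avg_grad = reduce_mean(loss_vals * delta_x, axis=0) / delta ** 2

# ... or equivalently multiply by the sign of the perturbation:
avg_grad = reduce_mean(loss_vals * tf.sign(delta_x), axis=0) / delta
```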
This issue is also present in cleverhans/future/tf2/attacks/spsa.py. It looks like it was originally implemented by @juesato.
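As a sanity check on the factor-of-delta claim, here is a self-contained NumPy simulation (my own verification sketch, independent of the cleverhans code) that runs both forms of the estimator on a linear function whose true gradient is known:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_samples, delta = 5, 20000, 0.01
w = rng.normal(size=dim)     # f(x) = w . x, so the true gradient is w
x = rng.normal(size=dim)

v = rng.choice([-1.0, 1.0], size=(n_samples, dim))   # Rademacher directions
delta_x = np.concatenate([delta * v, -delta * v])    # +/- perturbations
loss_vals = ((x + delta_x) @ w)[:, None]             # f at each sample point

buggy = (loss_vals * delta_x).mean(axis=0) / delta           # current form
fixed = (loss_vals * np.sign(delta_x)).mean(axis=0) / delta  # proposed form

print(np.round(fixed / w, 2))   # roughly 1.00 in every coordinate
print(np.round(buggy / w, 2))   # roughly delta = 0.01 in every coordinate
```

Up to Monte Carlo noise, the proposed form recovers the true gradient, while the current form is smaller by exactly the factor delta.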