
ST layer scales down input and produces wrong output during random iterations #14

swarnakr opened this issue Jul 31, 2017 · 3 comments

@swarnakr

Hi,
(1) I found that the ST layer always scales down the input, so the output looks like a transformed image placed on a black canvas.
(2) When the batch size is large (>300), the transformed output contains negative values on alternate iterations, which I find extremely bizarre.

Any help with these issues would be welcome! I'm working against a deadline, so a quick response would be much appreciated.

@daerduoCarey
Owner

Hi, @swarnakr,

For (1), you could add an extra loss term that encourages the scaling factor in the transformation matrix to stay large. The ST layer can indeed generate black padding around the images; my experiments show this does not affect the final classification results on the MNIST and CUB datasets.
For (2), sorry, I have no idea. A batch size larger than 64 is unusual for processing images, I think, and I have never encountered this problem. There should be no negative values if your inputs are all non-negative, since the ST layer only performs interpolation.
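To make the idea in (1) concrete, here is a rough sketch (not from this repository, just an illustration in PyTorch) of a penalty on the predicted affine parameters `theta` (assumed to be an N x 2 x 3 tensor) that discourages the localization network from shrinking the input:

```python
import torch

def scale_penalty(theta, min_scale=1.0, weight=1e-2):
    # theta[:, 0, 0] and theta[:, 1, 1] act as the x/y scale factors
    # of the affine transformation predicted by the localization network.
    sx = theta[:, 0, 0]
    sy = theta[:, 1, 1]
    # Penalize only shrinking (scale below min_scale); enlarging is not penalized.
    penalty = torch.clamp(min_scale - sx, min=0.0) ** 2 \
            + torch.clamp(min_scale - sy, min=0.0) ** 2
    return weight * penalty.mean()

# total_loss = classification_loss + scale_penalty(theta)
```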

Bests,
Kaichun

@swarnakr
Author

Thanks for your response.

Following up on (2), I find that almost all of the affine parameters predicted by the localisation network result in very bizarre transformations (such as negative x, y coordinates).

How do you make sure that the affine parameters are well-behaved? I'm not sure there is a simple loss function that can ensure this.

Also, the original Spatial Transformer Networks paper (Jaderberg et al.) doesn't mention using any loss on the thetas, so do you have any idea what the difference in their implementation might be?

Many thanks,
Swarna

@daerduoCarey
Owner

Hi, @swarnakr

Thank you for your interest in my code.
I'm actually not sure how the original authors produced their results, but I think the following guidelines should help:

  1. Make the learning rate for the localization network smaller than the regular one. Using a rate 1e-2 to 1e-3 times smaller will help the predictions change slowly.
  2. You may add a penalty loss on the magnitude of the transformation to keep it small and smooth; a rough sketch of both ideas follows below.
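
As a hypothetical illustration of both points (PyTorch code, not from this repository, assuming `model.localization` is the localization sub-network and `theta` is the predicted N x 2 x 3 affine matrix):

```python
import torch

# 1. Give the localization network a much smaller learning rate than the rest.
loc_params = list(model.localization.parameters())
loc_ids = {id(p) for p in loc_params}
other_params = [p for p in model.parameters() if id(p) not in loc_ids]
optimizer = torch.optim.SGD(
    [
        {"params": other_params, "lr": 1e-2},
        {"params": loc_params, "lr": 1e-4},  # ~1e-2 times the regular rate
    ],
    momentum=0.9,
)

# 2. Penalize the transformation magnitude by pulling theta toward the identity.
identity = torch.tensor([[1.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0]])

def theta_penalty(theta, weight=1e-2):
    return weight * ((theta - identity.to(theta.device)) ** 2).mean()

# total_loss = task_loss + theta_penalty(theta)
```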

Thanks.
Kaichun
