Attention head implementation is different from the paper #13
Comments
The model code comes from the official codebase (https://github.com/dong03/MVSS-Net), so I did not notice this before. But I checked the code, and the first layer and the residual connection in DAHead are indeed different from what is described in the paper.
I noticed that too in the original repo, but since the original developers don't respond, I chose to share this information with you instead.
I have gone through the code for the attention head, and it seems to me that it is wildly different from what is described in the paper. It starts with 3x3x1024 convolutions that account for over 50% of the model's parameters (a quick parameter check is below). The whole thing is bizarre, and it even includes 1x1x1024 convolutions at the end of both sub-heads. The residual connection from the branch outputs is also missing.
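Just as a rough sanity check (the 1024-in/1024-out shape here is my reading of the repo's code, not an official spec), a single 3x3 convolution at that width already holds roughly 9.4M parameters, so a handful of them easily dominates the head:

```python
import torch.nn as nn

# One 3x3 conv with 1024 input and 1024 output channels (illustrative shape).
conv = nn.Conv2d(1024, 1024, kernel_size=3, padding=1)
print(sum(p.numel() for p in conv.parameters()))  # 9438208
```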
An illustration that shows the difference:
Maybe this is the reason for the non-reproducible results? I don't think that is the case, but I would be curious to find out.
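For reference, here is a minimal DANet-style sketch of how I read the paper's description of the dual-attention head: the two branch outputs are summed (the residual-style fusion that seems to be missing) before a single small projection, instead of each sub-head ending in its own 1x1x1024 convolution. The channel widths and module layout are my assumptions for illustration, not the actual MVSS-Net code:

```python
import torch
import torch.nn as nn


class PositionAttention(nn.Module):
    """Spatial self-attention: every position attends to every other position."""
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)        # (B, HW, C//8)
        k = self.key(x).flatten(2)                           # (B, C//8, HW)
        attn = torch.softmax(q @ k, dim=-1)                  # (B, HW, HW)
        v = self.value(x).flatten(2)                         # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                          # learned residual inside the module


class ChannelAttention(nn.Module):
    """Channel self-attention: every channel attends to every other channel."""
    def __init__(self):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        q = x.flatten(2)                                     # (B, C, HW)
        attn = torch.softmax(q @ q.transpose(1, 2), dim=-1)  # (B, C, C)
        out = (attn @ q).view(b, c, h, w)
        return self.gamma * out + x


class DualAttentionHead(nn.Module):
    """Sketch of a DANet-style head: reduce -> attend -> sum branches -> project."""
    def __init__(self, in_channels=2048, inter_channels=512, out_channels=1):
        super().__init__()
        # Lightweight 3x3 reduction convs feeding each sub-head
        # (channel widths are illustrative, not taken from the repo).
        self.reduce_p = nn.Sequential(
            nn.Conv2d(in_channels, inter_channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(inter_channels), nn.ReLU(inplace=True))
        self.reduce_c = nn.Sequential(
            nn.Conv2d(in_channels, inter_channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(inter_channels), nn.ReLU(inplace=True))
        self.pam = PositionAttention(inter_channels)
        self.cam = ChannelAttention()
        self.project = nn.Conv2d(inter_channels, out_channels, kernel_size=1)

    def forward(self, x):
        p = self.pam(self.reduce_p(x))
        c = self.cam(self.reduce_c(x))
        # The two branch outputs are fused by summation before one small
        # projection, rather than each branch ending in its own 1x1x1024 conv.
        return self.project(p + c)


if __name__ == "__main__":
    head = DualAttentionHead()
    print(sum(p.numel() for p in head.parameters()))   # total parameter count of this sketch
    print(head(torch.randn(1, 2048, 16, 16)).shape)    # torch.Size([1, 1, 16, 16])
```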