
issue about MultiheadAttention #1

Open
starhiking opened this issue Mar 24, 2022 · 7 comments
Comments

@starhiking

starhiking commented Mar 24, 2022

Hi, great work on face alignment!
However, I have a question about the params and FLOPs reported in the paper.

I have tried running your code to count the params and FLOPs for the 6-layer and 12-layer models.
I guess your results come from the thop tool, but it has a shortcoming with MultiheadAttention, which accounts for a major part of the Transformer, so the results in the paper may be wrong.
Could you check this issue and update the real FLOPs if the error indeed exists?
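For reference, a minimal sketch (not this repository's actual model) of the shortcoming: thop has no built-in FLOP counter for nn.MultiheadAttention, so most of the attention computation goes unreported.

```python
import torch
import torch.nn as nn
from thop import profile

# Illustrative only: a single nn.MultiheadAttention layer profiled with thop.
# thop registers counters per module type and has none for nn.MultiheadAttention,
# so the attention's matrix multiplications are largely missing from the total.
class TinyAttention(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads)

    def forward(self, x):
        out, _ = self.attn(x, x, x)
        return out

tokens = torch.randn(98, 1, 256)  # (seq_len, batch, dim), e.g. 98 landmark tokens
macs, params = profile(TinyAttention(), inputs=(tokens,))
print(macs, params)  # reported MACs fall far below the true attention cost
```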

@Jiahao-UTS
Owner

Jiahao-UTS commented Mar 24, 2022

Thanks for your reminder. Because our tokens are sparse (only 29 to 98), the MultiheadAttention actually accounts for a very small part of our model. We use another toolkit (fvcore) to count the params and FLOPs; the FLOPs are 6.123G, 5.173G, and 3.988G for 98 landmarks, 68 landmarks, and 29 landmarks respectively. I will update the results.
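For reference, a minimal sketch of counting FLOPs and params with fvcore; the model and input resolution below are placeholders for illustration, not this repository's configuration.

```python
import torch
import torch.nn as nn
from fvcore.nn import FlopCountAnalysis, parameter_count

# Placeholder model and input: substitute the actual network and resolution.
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(64, 64, 3, stride=2, padding=1),
)
image = torch.randn(1, 3, 256, 256)

flops = FlopCountAnalysis(model, image)
print(flops.total())               # total FLOPs (fvcore counts one multiply-add as one FLOP)
print(parameter_count(model)[""])  # total number of parameters
```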

Moreover, we found that the main issue affecting the inference speed is that the interpolation code is not efficient. I modified the interpolation code yesterday, and the inference speed improved by about 1.5×. I will update the code after testing. Thank you for finding this issue.
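As an illustration only (the actual interpolation code is not shown in this thread), the kind of change that typically yields such a speedup is replacing a per-landmark Python loop with a single batched F.grid_sample call:

```python
import torch
import torch.nn.functional as F

# Hypothetical example: bilinearly sample one feature vector per landmark
# from a feature map in one batched call instead of looping over points.
feat = torch.randn(1, 256, 64, 64)       # (B, C, H, W) feature map
coords = torch.rand(1, 98, 2) * 2 - 1    # 98 landmark coordinates in [-1, 1]

grid = coords.view(1, 98, 1, 2)          # (B, N, 1, 2) sampling grid
sampled = F.grid_sample(feat, grid, mode='bilinear', align_corners=True)
sampled = sampled.squeeze(-1).permute(0, 2, 1)  # (B, N, C): one feature per landmark
print(sampled.shape)                     # torch.Size([1, 98, 256])
```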

@starhiking
Author

This nice work sets a new record in face alignment, and I want to cite it in my paper.

I have calculated the FLOPs and params on WFLW as 6.110G FLOPs / 13.134M params for the 6-layer model, and 8.138G FLOPs / 19.445M params for the 12-layer model. Could you check their correctness or give detailed information for the 6-layer and 12-layer models?

@Jiahao-UTS
Owner

I think your results are correct. Could you please give me your email? I will send the details through email, and we can discuss more via WeChat.

@starhiking
Author

Hi, I just sent an email to the address provided in the paper, but there has been no response yet. I am not sure whether you received it.

@starhiking
Author

Hi, I want to cite your results for the 12-layer model on the WFLW subsets.
Do you have any tested data?

@Jiahao-UTS
Owner

Hi, I want to cite your results for the 12-layer model on the WFLW subsets. Do you have any tested data?

        testset    largepose    expression    illumination    makeup    occlusion    blur
NME     4.128      6.988        4.368         4.023           4.032     5.005        4.790
FR      2.72       11.96        1.59          2.15            1.94      5.70         3.88
AUC     0.596      0.349        0.573         0.603           0.608     0.520        0.537

@starhiking
Author


Thanks!
