How do you calculate flops? #72
Comments
The paper says 29.05 Gflops for XL/4, and this other GitHub issue says 28.8 Gflops.
Try fvcore; it's a nice tool to calculate FLOPs.
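For anyone who wants to try this, here is a minimal sketch of counting with fvcore's `FlopCountAnalysis`. The helper name `count_flops` and the toy layer in the usage comment are my own illustration, not something from the DiT repository. Note that fvcore documents that it counts one fused multiply-add as one flop, so its totals may come out at roughly half of a count that treats the multiply and the add separately.

```python
def count_flops(model, inputs):
    """Return the total operation count fvcore reports for one forward pass.

    `model` is a torch.nn.Module; `inputs` is a tensor or a tuple of tensors
    that is passed to the model's forward().
    """
    # Imported lazily so this helper can be defined even where
    # fvcore / PyTorch are not installed.
    from fvcore.nn import FlopCountAnalysis

    model.eval()  # put dropout / norm layers into inference mode before tracing
    return FlopCountAnalysis(model, inputs).total()


# Hypothetical usage on a single linear layer (not the DiT model itself):
#   import torch
#   layer = torch.nn.Linear(1152, 4608)
#   x = torch.randn(1024, 1152)
#   print(count_flops(layer, x) / 1e9, "G (fvcore convention)")
```

To profile the real model you would pass whatever tuple of inputs its `forward()` expects; the exact signature is not given in this thread, so check the model code.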
You can also refer to the closed issue: #14
For DiTXL/2, Latent Size = 32 x 32 x 3

For now, let's focus only on the pointwise feedforward network in a DiT block. Ignoring the FLOPs of the GELU activation function, the FLOPs for one pointwise feedforward network are

MM([1024 x 1152] x [1152 x 4608]) + MM([1024 x 4608] x [4608 x 1152]) = 21737373696 ~ 21.7374 B or GFLOPs

As we have 28 DiT blocks, the total FLOPs of the latent transformer are

21737373696 * 28 = 608646463488 ~ 608.6465 B or GFLOPs

FLOPs DiTXL/2 - 256x256 - 2 (just the pointwise feedforward networks in the DiT blocks) = 608.6465 B or GFLOPs
Reported in paper: FLOPs DiTXL/2 - 256x256 - 2 = 118.64 B or GFLOPs
From #14: FLOPs DiTXL/2 - 256x256 - 2 = 118.64 x 2 = 237.28 B or GFLOPs

I think the FLOPs calculations in the paper are way off. Please let me know if I have made any mistakes in my calculation.
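The arithmetic in the comment above can be reproduced with a short script. This is only a sketch: the function names are hypothetical, and the counting convention (k multiplies plus k - 1 adds per output element) is inferred from the fact that it yields exactly the 21737373696 figure quoted. The 256-token variant is my own assumption, not something stated in the comment: it supposes that a patch size of 2 over a 32 x 32 latent gives (32 / 2)^2 = 256 tokens rather than 1024.

```python
def matmul_flops(m, k, n):
    """FLOPs for [m x k] @ [k x n]: k multiplies and k - 1 adds per output element."""
    return m * n * (2 * k - 1)


def ffn_flops(tokens, hidden, mlp_ratio=4):
    """FLOPs of the two linear layers in a pointwise feedforward network (GELU ignored)."""
    inner = hidden * mlp_ratio  # 1152 * 4 = 4608, matching the shapes in the comment
    return matmul_flops(tokens, hidden, inner) + matmul_flops(tokens, inner, hidden)


HIDDEN = 1152  # width used in the comment
DEPTH = 28     # number of DiT blocks used in the comment

# Token count as used in the comment (32 * 32 = 1024).
per_block_1024 = ffn_flops(1024, HIDDEN)
total_1024 = per_block_1024 * DEPTH

# Assumption: patch size 2 on a 32 x 32 latent gives (32 // 2) ** 2 = 256 tokens.
per_block_256 = ffn_flops(256, HIDDEN)
total_256 = per_block_256 * DEPTH

print(f"1024 tokens: {per_block_1024} per block, {total_1024} total")
print(f" 256 tokens: {per_block_256} per block, {total_256} total")
```

Under the 256-token assumption the feedforward total is a quarter of 608.6465 B, and halving it again for a count that treats a multiply-add as one operation would put it below the reported 118.64. That may account for much of the gap, but it would need to be checked against the model code and the counting tool actually used.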
I want to know how you calculate the FLOPs. I can't reproduce the FLOPs reported in your paper using thop.