-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
backend: (x86) Extend x86 AVX instructions to single-precision #3887
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #3887 +/- ##
==========================================
+ Coverage 91.25% 91.26% +0.01%
==========================================
Files 466 466
Lines 57945 57981 +36
Branches 5575 5575
==========================================
+ Hits 52879 52919 +40
+ Misses 3635 3626 -9
- Partials 1431 1436 +5 ☔ View full report in Codecov by Sentry. |
xdsl/dialects/x86/ops.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you please open a separate PR for these changes, with a change to the x86 roundtrip and assembly emission tests for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after merging the other PRs I suggested in the other comments, could you please move this to tests/filecheck/projects/libxsmm
? My feeling is that we don't need the rountrip test for the matmul since each individual op should already be handled by the dialect rountrip filecheck test.
%c3_tmp0 = x86.rm.vmovups %ymm7, %rcx, 96 : (!x86.avx2reg<ymm7>, !x86.reg<rcx>) -> !x86.avx2reg<ymm7> | ||
// Load column 0 of B | ||
%b_col_0 = x86.rm.vbroadcastss %ymm8, %rsi, 0 : (!x86.avx2reg<ymm8>, !x86.reg<rsi>) -> !x86.avx2reg<ymm8> | ||
// Go brrr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does this mean? ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha it describes the magic of a well-formed fmas sequence which is analoguous to the one of a money printer (should I say a performance printer ?). Sorry I used it when writing the assembly. Will remove when submitting it again.
We use the convention of letting the original commenter resolve the comments on PRs, to make it easier to track conversations, to make sure that the point has been addressed. |
maybe something to add in the PR template? |
Oh ok |
86668e9
to
dec2af3
Compare
No description provided.