Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backend: (x86) Extend x86 AVX instructions to single-precision #3887

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

qaco
Copy link
Collaborator

@qaco qaco commented Feb 11, 2025

No description provided.

@qaco qaco added dialects Changes on the dialects backend Compiler backend in xDSL labels Feb 11, 2025
@qaco qaco self-assigned this Feb 11, 2025
Copy link

codecov bot commented Feb 11, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.26%. Comparing base (941bf19) to head (9fd350a).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3887      +/-   ##
==========================================
+ Coverage   91.25%   91.26%   +0.01%     
==========================================
  Files         466      466              
  Lines       57945    57981      +36     
  Branches     5575     5575              
==========================================
+ Hits        52879    52919      +40     
+ Misses       3635     3626       -9     
- Partials     1431     1436       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@qaco qaco marked this pull request as draft February 11, 2025 16:32
@qaco qaco marked this pull request as ready for review February 11, 2025 16:33
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you please open a separate PR for these changes, with a change to the x86 roundtrip and assembly emission tests for this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after merging the other PRs I suggested in the other comments, could you please move this to tests/filecheck/projects/libxsmm? My feeling is that we don't need the rountrip test for the matmul since each individual op should already be handled by the dialect rountrip filecheck test.

%c3_tmp0 = x86.rm.vmovups %ymm7, %rcx, 96 : (!x86.avx2reg<ymm7>, !x86.reg<rcx>) -> !x86.avx2reg<ymm7>
// Load column 0 of B
%b_col_0 = x86.rm.vbroadcastss %ymm8, %rsi, 0 : (!x86.avx2reg<ymm8>, !x86.reg<rsi>) -> !x86.avx2reg<ymm8>
// Go brrr
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this mean? ;)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha it describes the magic of a well-formed fmas sequence which is analoguous to the one of a money printer (should I say a performance printer ?). Sorry I used it when writing the assembly. Will remove when submitting it again.

@superlopuh
Copy link
Member

We use the convention of letting the original commenter resolve the comments on PRs, to make it easier to track conversations, to make sure that the point has been addressed.

@compor
Copy link
Collaborator

compor commented Feb 12, 2025

We use the convention of letting the original commenter resolve the comments on PRs, to make it easier to track conversations, to make sure that the point has been addressed.

maybe something to add in the PR template?

@qaco
Copy link
Collaborator Author

qaco commented Feb 12, 2025

We use the convention of letting the original commenter resolve the comments on PRs, to make it easier to track conversations, to make sure that the point has been addressed.

maybe something to add in the PR template?

Oh ok

@qaco qaco force-pushed the hugo/instructions_for_matmul branch from 86668e9 to dec2af3 Compare February 12, 2025 21:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend Compiler backend in xDSL dialects Changes on the dialects
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants