[GPU] Add pattern to fuse tensor.extract_slice into forall producer #19296

Merged
merged 2 commits into from
Jan 28, 2025

@Max191 Max191 commented Nov 25, 2024

This PR adds a pattern to fuse a consumer tensor.extract_slice into a producer scf.forall op. The pattern is added to FuseAndHoistParallelLoops, where it helps fuse tensor.unpack ops with extract_slice semantics into producer loops. This is needed when targeting MFMA intrinsics for unaligned shapes, and also when generating code for unset encoding ops on GPU. This is a follow-up to #19295, which has the complementary pattern for collapse_shape.
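
As a rough illustration of the idea (not the IREE implementation), the fusion can be modeled in plain Python in one dimension. The sketch assumes that the consumer slice starts at offset 0 with unit stride and that the producer extent is a multiple of the tile size. The names `unfused`, `fused`, `n`, `tile`, `size`, and `f` are hypothetical and do not come from the PR.

```python
def unfused(n, tile, size, f):
    """Producer loop fills a length-n result; a separate consumer slices it."""
    assert n % tile == 0 and 0 <= size <= n
    full = [None] * n
    for start in range(0, n, tile):  # each iteration models one forall thread
        # Models a parallel insert of one full tile into the result.
        full[start:start + tile] = [f(i) for i in range(start, start + tile)]
    return full[:size]  # models the consumer extract_slice [0] [size] [1]


def fused(n, tile, size, f):
    """Same result, but the slice is folded into the producer loop."""
    assert n % tile == 0 and 0 <= size <= n
    out = [None] * size  # the destination shrinks to the sliced extent
    for start in range(0, n, tile):
        # Clamp each tile so it never writes past the sliced extent.
        clamped = max(0, min(tile, size - start))
        if clamped == 0:
            continue  # this tile lies entirely outside the slice
        out[start:start + clamped] = [f(i) for i in range(start, start + clamped)]
    return out


# The two forms agree on an unaligned slice size (6 is not a multiple of 4).
assert fused(8, 4, 6, lambda i: i * i) == unfused(8, 4, 6, lambda i: i * i)
```

In this model the fused form never materializes the full-size intermediate, which is the motivation for pulling the slice into the loop.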

The PR also adds a transform op to keep the long lit tests separate from the FuseAndHoistParallelLoops tests.

@Max191 Max191 force-pushed the extract-slice-forall-fusion branch 2 times, most recently from 2f5056e to 8f9be22 Compare November 26, 2024 18:30
@Max191 Max191 marked this pull request as ready for review November 26, 2024 18:31

Max191 commented Nov 26, 2024

This is based on #19295. Please only review the last commit.

Edit: rebased now.

@Max191 Max191 force-pushed the extract-slice-forall-fusion branch from 8f9be22 to f46540f Compare November 26, 2024 19:02
@Max191 Max191 force-pushed the extract-slice-forall-fusion branch from f46540f to 72b7b62 Compare November 26, 2024 19:55
@Max191 Max191 force-pushed the extract-slice-forall-fusion branch 2 times, most recently from 37f21ba to 1885572 Compare December 4, 2024 21:46
@Max191 Max191 force-pushed the extract-slice-forall-fusion branch from 1885572 to 105d13a Compare January 24, 2025 20:34

Max191 commented Jan 24, 2025

@qedawkins gentle ping. I rebased this PR now that #19295 has been merged, and it is ready for review.

@Max191 Max191 force-pushed the extract-slice-forall-fusion branch from 105d13a to 20d988c Compare January 27, 2025 15:52

@qedawkins qedawkins left a comment

Nice work! Also, don't feel obligated to add transform ops; they do add extra code to maintain.

Max Dawkins and others added 2 commits January 28, 2025 08:46
@Max191 Max191 force-pushed the extract-slice-forall-fusion branch from 20d988c to 66c2381 Compare January 28, 2025 16:00
@Max191 Max191 merged commit 6a5c12e into iree-org:main Jan 28, 2025
41 checks passed
ita9naiwa pushed a commit to ita9naiwa/iree that referenced this pull request Feb 4, 2025
…ree-org#19296)
