Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIE2P] Instruction select G_SELECT. #291

Merged
merged 2 commits into from
Jan 31, 2025
Merged

Conversation

SagarMaheshwari99
Copy link
Collaborator

No description provided.

Copy link
Collaborator

@konstantinschwarz konstantinschwarz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few suggestions to refactor some code, overall this look ok

def : Pat<(vec512Ty (select (i32 eRS8:$rs1), VEC512:$rs2, VEC512:$rs3)),
(vec512Ty (VSEL_32 VEC512:$rs2, VEC512:$rs3, (ADD_add_r_ri eR:$rs1, (i32 -1))))>;
}
foreach vec1024Ty = [v128i8, v64i16, v32i32] in {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we adapt the legalization rules for G_SELECT and clamp to 512-bit vector types?
Then we only need to support the 512-bit patterns

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its a nice idea.

May be I have one more suggestion, since we only allow 512-bit vector types. Can we get rid of explicitly converting accumulator to vector by bitcasts. And constrain G_SELECT to vector register banks in the regbank selection.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we are just delaying this action from legalizer to regbankselect?
I think nothing should pass legalizer, if we don't intend it to be legal.

@SagarMaheshwari99 SagarMaheshwari99 force-pushed the sagarm.gselect branch 2 times, most recently from aaa8426 to 63d335f Compare January 24, 2025 12:38
Copy link
Collaborator

@niwinanto niwinanto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks good. Few nits and a suggestion.

llvm/lib/Target/AIE/AIE2LegalizerInfo.cpp Outdated Show resolved Hide resolved
def : Pat<(vec512Ty (select (i32 eRS8:$rs1), VEC512:$rs2, VEC512:$rs3)),
(vec512Ty (VSEL_32 VEC512:$rs2, VEC512:$rs3, (ADD_add_r_ri eR:$rs1, (i32 -1))))>;
}
foreach vec1024Ty = [v128i8, v64i16, v32i32] in {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its a nice idea.

May be I have one more suggestion, since we only allow 512-bit vector types. Can we get rid of explicitly converting accumulator to vector by bitcasts. And constrain G_SELECT to vector register banks in the regbank selection.

llvm/lib/Target/AIE/aie2p/AIE2PLegalizerInfo.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/AIE/AIE2LegalizerInfo.cpp Outdated Show resolved Hide resolved

const Register NewDstReg = MRI.createGenericVirtualRegister(NewVecTy);
MIRBuilder.buildInstr(MI.getOpcode(), {NewDstReg},
{SrcReg0, NewSrcReg1, NewSrcReg2}, MI.getFlags());
Copy link
Collaborator

@martien-de-jong martien-de-jong Jan 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check: SrcReg0 is false or true, which correctly defines a vector condition for element 0, the only one we're interested in. (Perhaps give an outline of this strategy in a comment at the start of the function)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

G_SELECT only gets a 0/1 value as condition, and selects either SrcReg1 if condition is true, i.e. 1, or SrcReg2 if condition is false, i.e. 0.

AIE's instruction works slightly differently:

Each bit in sel selects either from xsrc0 (value zero) or xsrc1 (value one).

Hence we need to translate the 0/1 bit to a bitmask in instruction selection.
We do this using cond - 1. Which will result in a zero mask if cond is true, or an all 1 mask (-1) if cond is false.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for my confusion. We just legalize to the vector size by padding, we don't map to VSEL here.
But note that you could leave the condition as it is for vector sizes <= 32, because we have 8, 16 and 32 bit vsel.

@SagarMaheshwari99 SagarMaheshwari99 force-pushed the sagarm.gselect branch 2 times, most recently from f1b8457 to 68c7849 Compare January 29, 2025 16:23

const Register NewDstReg = MRI.createGenericVirtualRegister(NewVecTy);
MIRBuilder.buildInstr(MI.getOpcode(), {NewDstReg},
{SrcReg0, NewSrcReg1, NewSrcReg2}, MI.getFlags());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

G_SELECT only gets a 0/1 value as condition, and selects either SrcReg1 if condition is true, i.e. 1, or SrcReg2 if condition is false, i.e. 0.

AIE's instruction works slightly differently:

Each bit in sel selects either from xsrc0 (value zero) or xsrc1 (value one).

Hence we need to translate the 0/1 bit to a bitmask in instruction selection.
We do this using cond - 1. Which will result in a zero mask if cond is true, or an all 1 mask (-1) if cond is false.

# RUN: llc -mtriple aie2p -run-pass=regbankselect -regbankselect-greedy %s -verify-machineinstrs -o - | FileCheck --check-prefix=FAST %s
# (c) Copyright 2024-2025 Advanced Micro Devices, Inc. or its affiliates

# RUN: llc -mtriple aie2p -run-pass=legalizer,regbankselect -regbankselect-fast %s -verify-machineinstrs -o - | FileCheck --check-prefix=GREEDY %s
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to not run the legalizer in a regbankselect test.
We can simply remove the illegal cases from this test, as long as we cover them in the legalizer test

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok removed.

return QueryTy.isVector() && QueryTy.getSizeInBits() < Size;
};
}

LegalityPredicate
negatePredicate(const std::function<bool(const LegalityQuery &)> &Func) {
return [=](const LegalityQuery &Query) { return !Func(Query); };
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I wouldn't dare writing this. That's a reference to a function handle abstraction captured by value into a lambda somehow bound to a LegailityPredicate return value.

@@ -236,10 +243,24 @@ AIE2LegalizerInfo::AIE2LegalizerInfo(const AIE2Subtarget &ST) : AIEHelper(ST) {

getActionDefinitionsBuilder(G_SELECT)
.legalFor({{S32, S32}, {P0, S32}})
.clampScalar(1, S32, S32)
// AIE ISA supports only 512-bit vector select
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is AIE2/AIE2P I guess.

.bitcastIf(
[=](const LegalityQuery &Query) {
const LLT &ResTy = Query.Types[0];
return ResTy.isScalar() && ResTy.getSizeInBits() >= 256;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When do we see scalars >= 256?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For AIE2, we probably don't.

const LLT DstTy = MRI.getType(DstReg);

if (DstTy.isVector() && DstTy.getSizeInBits() < 512)
return legalizeG_SELECTLessThan512Bit(Helper, MI);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should parameterize this with the size limit and not have 512 in the name.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept it coz we only had 512bit "max" bit size possible, but I changed it now.


if (DstTy.isVector() && DstTy.getSizeInBits() < 512)
return legalizeG_SELECTWithSizeLimit(Helper, MI, 512);

assert(DstTy.isVector() && DstTy.getSizeInBits() == 2048 &&
Copy link
Collaborator

@andcarminati andcarminati Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a feeling that you can drop all this 2048 handling if:

As result, .clampMaxNumElements(0, S32, 16) that is already there will do the job for us.

This will change your test in the following way:

     ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(<64 x s32>) = COPY $dm1
     ; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(<64 x s32>) = COPY $dm2
     ; CHECK-NEXT: [[UV:%[0-9]+]]:_(<32 x s32>), [[UV1:%[0-9]+]]:_(<32 x s32>) = G_UNMERGE_VALUES [[COPY1]](<64 x s32>)
-    ; CHECK-NEXT: [[UV2:%[0-9]+]]:_(<32 x s32>), [[UV3:%[0-9]+]]:_(<32 x s32>) = G_UNMERGE_VALUES [[COPY2]](<64 x s32>)
-    ; CHECK-NEXT: [[UV4:%[0-9]+]]:_(<16 x s32>), [[UV5:%[0-9]+]]:_(<16 x s32>) = G_UNMERGE_VALUES [[UV]](<32 x s32>)
-    ; CHECK-NEXT: [[UV6:%[0-9]+]]:_(<16 x s32>), [[UV7:%[0-9]+]]:_(<16 x s32>) = G_UNMERGE_VALUES [[UV2]](<32 x s32>)
-    ; CHECK-NEXT: [[SELECT:%[0-9]+]]:_(<16 x s32>) = G_SELECT [[ASSERT_ZEXT]](s32), [[UV4]], [[UV6]]
-    ; CHECK-NEXT: [[SELECT1:%[0-9]+]]:_(<16 x s32>) = G_SELECT [[ASSERT_ZEXT]](s32), [[UV5]], [[UV7]]
+    ; CHECK-NEXT: [[UV2:%[0-9]+]]:_(<16 x s32>), [[UV3:%[0-9]+]]:_(<16 x s32>) = G_UNMERGE_VALUES [[UV]](<32 x s32>)
+    ; CHECK-NEXT: [[UV4:%[0-9]+]]:_(<16 x s32>), [[UV5:%[0-9]+]]:_(<16 x s32>) = G_UNMERGE_VALUES [[UV1]](<32 x s32>)
+    ; CHECK-NEXT: [[UV6:%[0-9]+]]:_(<32 x s32>), [[UV7:%[0-9]+]]:_(<32 x s32>) = G_UNMERGE_VALUES [[COPY2]](<64 x s32>)
+    ; CHECK-NEXT: [[UV8:%[0-9]+]]:_(<16 x s32>), [[UV9:%[0-9]+]]:_(<16 x s32>) = G_UNMERGE_VALUES [[UV6]](<32 x s32>)
+    ; CHECK-NEXT: [[UV10:%[0-9]+]]:_(<16 x s32>), [[UV11:%[0-9]+]]:_(<16 x s32>) = G_UNMERGE_VALUES [[UV7]](<32 x s32>)
+    ; CHECK-NEXT: [[SELECT:%[0-9]+]]:_(<16 x s32>) = G_SELECT [[ASSERT_ZEXT]](s32), [[UV2]], [[UV8]]
+    ; CHECK-NEXT: [[SELECT1:%[0-9]+]]:_(<16 x s32>) = G_SELECT [[ASSERT_ZEXT]](s32), [[UV3]], [[UV9]]
+    ; CHECK-NEXT: [[SELECT2:%[0-9]+]]:_(<16 x s32>) = G_SELECT [[ASSERT_ZEXT]](s32), [[UV4]], [[UV10]]
+    ; CHECK-NEXT: [[SELECT3:%[0-9]+]]:_(<16 x s32>) = G_SELECT [[ASSERT_ZEXT]](s32), [[UV5]], [[UV11]]
     ; CHECK-NEXT: [[CONCAT_VECTORS:%[0-9]+]]:_(<32 x s32>) = G_CONCAT_VECTORS [[SELECT]](<16 x s32>), [[SELECT1]](<16 x s32>)
-    ; CHECK-NEXT: [[UV8:%[0-9]+]]:_(<16 x s32>), [[UV9:%[0-9]+]]:_(<16 x s32>) = G_UNMERGE_VALUES [[UV1]](<32 x s32>)
-    ; CHECK-NEXT: [[UV10:%[0-9]+]]:_(<16 x s32>), [[UV11:%[0-9]+]]:_(<16 x s32>) = G_UNMERGE_VALUES [[UV3]](<32 x s32>)
-    ; CHECK-NEXT: [[SELECT2:%[0-9]+]]:_(<16 x s32>) = G_SELECT [[ASSERT_ZEXT]](s32), [[UV8]], [[UV10]]
-    ; CHECK-NEXT: [[SELECT3:%[0-9]+]]:_(<16 x s32>) = G_SELECT [[ASSERT_ZEXT]](s32), [[UV9]], [[UV11]]
     ; CHECK-NEXT: [[CONCAT_VECTORS1:%[0-9]+]]:_(<32 x s32>) = G_CONCAT_VECTORS [[SELECT2]](<16 x s32>), [[SELECT3]](<16 x s32>)
     ; CHECK-NEXT: [[CONCAT_VECTORS2:%[0-9]+]]:_(<64 x s32>) = G_CONCAT_VECTORS [[CONCAT_VECTORS]](<32 x s32>), [[CONCAT_VECTORS1]](<32 x s32>)
     ; CHECK-NEXT: $dm0 = COPY [[CONCAT_VECTORS2]](<64 x s32>)

What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made the changes.

Copy link
Collaborator

@konstantinschwarz konstantinschwarz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@SagarMaheshwari99 SagarMaheshwari99 enabled auto-merge (rebase) January 31, 2025 17:43
@SagarMaheshwari99 SagarMaheshwari99 enabled auto-merge (rebase) January 31, 2025 17:43
@SagarMaheshwari99 SagarMaheshwari99 merged commit dc6ec0b into aie-public Jan 31, 2025
8 checks passed
@konstantinschwarz konstantinschwarz deleted the sagarm.gselect branch January 31, 2025 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants