[AIE2P] Enable S20 Narrowing for FIFO Loads and Stores #333

Open
wants to merge 1 commit into base: aie-public

abhinay-anubola
Collaborator

Added the FIFO load and store intrinsics as S20 consumers in S20Narrowing.
Added and updated the corresponding tests.

case Intrinsic::aie2p_fifo_ld_pop_576_1d_bfp16:
case Intrinsic::aie2p_fifo_ld_pop_576_2d_bfp16:
case Intrinsic::aie2p_fifo_ld_pop_576_3d_bfp16:
case Intrinsic::aie2p_fifo_st_flush_2d:
Collaborator

We now have more intrinsics available; please double-check that they are all present.

Collaborator

I'm wondering whether all scalar operands of these instructions are accurately qualified as 'S20'. It at least seems likely that they don't consume more than 20 bits.

Collaborator

See:

%4 = tail call { ptr, <32 x i32>, i32, <64 x i8>, <8 x i8> } @llvm.aie2p.fifo.ld.pop.576.1d.bfp16(ptr %0, <32 x i32> %1, i32 %2, i20 %3)

Collaborator

> I'm wondering whether all scalar operands of these instructions are accurately qualified as 'S20'

The availability register is indeed an s32, and we should leave it as is.

Collaborator

Re-opening: I think @martien-de-jong's comment needs to be addressed.

This function should only return true for actual s20 operands. FIFO ld/st have an s32 availability operand, and isNativeS20ConsumerIntrinsic should return false for that one. Maybe change the prototype to something like isNativeS20ConsumerIntrinsic(unsigned IntrinsicID, unsigned OperandIdx)?
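
To make the suggestion concrete, here is a minimal standalone sketch of an operand-aware predicate. It is not the code in this PR: the `FifoIntrinsic` enum and the name `isNativeS20ConsumerOperand` are hypothetical stand-ins, and the argument positions are taken from the IR call quoted earlier in this thread (ptr, <32 x i32>, i32, i20), so the corresponding operand indices on the machine instruction may differ.

```cpp
// Hypothetical stand-ins for the real intrinsic IDs (illustration only).
enum class FifoIntrinsic : unsigned {
  LdPop576_1d_bfp16,
  Other,
};

// Sketch: answer per argument rather than per intrinsic.
// Assumed argument layout, from the quoted IR call:
//   0: ptr, 1: <32 x i32>, 2: i32 (availability), 3: i20
constexpr bool isNativeS20ConsumerOperand(FifoIntrinsic ID, unsigned ArgIdx) {
  switch (ID) {
  case FifoIntrinsic::LdPop576_1d_bfp16:
    // Only the trailing i20 argument is an s20 consumer; the i32
    // availability argument at index 2 must stay 32 bits wide.
    return ArgIdx == 3;
  case FifoIntrinsic::Other:
    return false;
  }
  return false;
}

// Compile-time checks of the intended behaviour.
static_assert(isNativeS20ConsumerOperand(FifoIntrinsic::LdPop576_1d_bfp16, 3),
              "the i20 argument is an s20 consumer");
static_assert(!isNativeS20ConsumerOperand(FifoIntrinsic::LdPop576_1d_bfp16, 2),
              "the i32 availability argument is not");
```

With a signature like this, a caller cannot mistake the availability operand for an s20 consumer, at the cost of keeping a per-intrinsic operand table in sync with the intrinsic definitions.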

Collaborator

I believe the actual operand extraction happens in getOperandsToNarrow, which would not hand over the i32 operand of @llvm.aie2p.fifo.ld.pop.576.1d.bfp16 as an i20 operand to narrow. Here we only check whether the MI has i20 operands that could profit from the combine.
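
One way to read this two-stage argument is sketched below as standalone code. It is an illustration under assumptions, not the actual getOperandsToNarrow: `ArgInfo` and `collectS20Args` are hypothetical names, and the real pass works on machine operands and their types rather than on a plain vector.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of one call argument (illustration only).
struct ArgInfo {
  bool IsScalar;  // false for pointers and vectors
  unsigned Bits;  // scalar width in bits; ignored when IsScalar is false
};

// Sketch of the second stage: once the intrinsic is accepted as a possible
// S20 consumer, keep only the arguments that are already 20-bit scalars.
// A 32-bit scalar such as the availability argument is filtered out here.
std::vector<std::size_t> collectS20Args(const std::vector<ArgInfo> &Args) {
  std::vector<std::size_t> Result;
  for (std::size_t I = 0; I < Args.size(); ++I)
    if (Args[I].IsScalar && Args[I].Bits == 20)
      Result.push_back(I);
  return Result;
}
```

For the argument list of the quoted call (ptr, <32 x i32>, i32, i20), this filter selects only index 3. Under this reading the ID-level predicate is just a coarse gate, and correctness rests on the per-operand type filter.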

default:
case TargetOpcode::G_INTRINSIC_W_SIDE_EFFECTS: {
const unsigned IntrinsicID = cast<GIntrinsic>(Use).getIntrinsicID();
if (!isNativeS20ConsumerIntrinsic(IntrinsicID)) {
Collaborator

Nice that you factored that out!

abhinay-anubola force-pushed the sanubola.s20narrowing.fifo.ld.st branch 3 times, most recently from f526965 to cf0e8ab on February 6, 2025 10:55
abhinay-anubola force-pushed the sanubola.s20narrowing.fifo.ld.st branch from cf0e8ab to 287837f on February 7, 2025 06:49

; CHECK-NEXT: mov m0, r0
; CHECK-NEXT: mov p3, p2
; CHECK-NEXT: mov dn0, r1
; CHECK-NEXT: mov dj0, r2
; CHECK-NEXT: mov p2, p4
Collaborator

We seem to have more pointer movs now. Do you understand why?

5 participants