
[ValueTracking] Support GEPs in matchSimpleRecurrence. #123518

Open

fhahn wants to merge 3 commits into main from perf/vt-gep-rec
Conversation

fhahn
Contributor

@fhahn fhahn commented Jan 19, 2025

Update matchSimpleRecurrence to also support GEPs. This allows inferring
larger alignments in a number of cases.

I noticed that we fail to infer alignments from calls when dropping
assumptions; inferring alignment from assumptions uses SCEV, so if we drop
an assume for an aligned function return value, we fail to infer the
better alignment in InferAlignment without this patch.

For now, it is limited to cases where the source element type is i8.
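For example, the first case from the added gep-recurrence.ll test: running opt -passes=infer-alignment now raises the store from align 1 to align 128, because the recurrence starts at the align-128 %dst and steps by 128 bytes each iteration.

declare i1 @cond()

define void @test_recur_i8_128(ptr align 128 %dst) {
entry:
  br label %loop

loop:
  ; %iv starts at %dst (align 128) and advances by 128 bytes per iteration,
  ; so every value it takes is 128-byte aligned.
  %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
  store i64 0, ptr %iv, align 1            ; inferred as align 128 with this patch
  %iv.next = getelementptr inbounds i8, ptr %iv, i64 128
  %c = call i1 @cond()
  br i1 %c, label %loop, label %exit

exit:
  ret void
}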

It comes with a bit of a compile-time impact:

stage1-O3: +0.05%
stage1-ReleaseThinLTO: +0.04%
stage1-ReleaseLTO-g: +0.03%
stage1-O0-g: -0.04%
stage2-O3: +0.04%
stage2-O0-g: +0.02%
stage2-clang: +0.03%

https://llvm-compile-time-tracker.com/compare.php?from=a8c60790fd4f70a461113f0721bdb4a114ddf420&to=9a207c52e9c644691573a40ceb5b89a3c09ab609&stat=instructions:u


github-actions bot commented Jan 19, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@fhahn fhahn force-pushed the perf/vt-gep-rec branch 2 times, most recently from f8140fd to a8e0e4a on January 20, 2025 21:45
@fhahn fhahn changed the title from "Remove redundant assumes" to "[ValueTracking] Support GEPs in matchSimpleRecurrence." on Jan 20, 2025
@fhahn fhahn marked this pull request as ready for review January 20, 2025 22:01
@fhahn fhahn requested a review from nikic as a code owner January 20, 2025 22:01
@fhahn fhahn requested review from dtcxzyw and goldsteinn January 20, 2025 22:02
@llvmbot
Member

llvmbot commented Jan 20, 2025

@llvm/pr-subscribers-llvm-analysis

Author: Florian Hahn (fhahn)

Changes

Update matchSimpleRecurrence to also support GEPs. This allows inferring
larger alignments in a number of cases.

I noticed that we fail to infer alignments from calls when dropping
assumptions; inferring alignment from assumptions uses SCEV, so if we drop
an assume for an aligned function return value, we fail to infer the
better alignment in InferAlignment without this patch.

For now, it is limited to cases where the source element type is i8.

It comes with a bit of a compile-time impact:

stage1-O3: +0.05%
stage1-ReleaseThinLTO: +0.04%
stage1-ReleaseLTO-g: +0.03%
stage1-O0-g: -0.04%
stage2-O3: +0.04%
stage2-O0-g: +0.02%
stage2-clang: +0.03%

https://llvm-compile-time-tracker.com/compare.php?from=a8c60790fd4f70a461113f0721bdb4a114ddf420&to=9a207c52e9c644691573a40ceb5b89a3c09ab609&stat=instructions:u


Patch is 25.88 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/123518.diff

3 Files Affected:

  • (modified) llvm/include/llvm/Analysis/ValueTracking.h (+5-1)
  • (modified) llvm/lib/Analysis/ValueTracking.cpp (+47-10)
  • (added) llvm/test/Transforms/InferAlignment/gep-recurrence.ll (+574)
diff --git a/llvm/include/llvm/Analysis/ValueTracking.h b/llvm/include/llvm/Analysis/ValueTracking.h
index b4918c2d1e8a18..8b72e605342f14 100644
--- a/llvm/include/llvm/Analysis/ValueTracking.h
+++ b/llvm/include/llvm/Analysis/ValueTracking.h
@@ -1245,7 +1245,11 @@ bool matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO, Value *&Start,
                            Value *&Step);
 
 /// Analogous to the above, but starting from the binary operator
-bool matchSimpleRecurrence(const BinaryOperator *I, PHINode *&P, Value *&Start,
+bool matchSimpleRecurrence(const Instruction *I, PHINode *&P, Value *&Start,
+                           Value *&Step);
+
+/// Analogous to the above, but also supporting non-binary operators.
+bool matchSimpleRecurrence(const PHINode *P, Instruction *&BO, Value *&Start,
                            Value *&Step);
 
 /// Return true if RHS is known to be implied true by LHS.  Return false if
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 6e2f0ebde9bb6c..d9c2ce4df92e7c 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -1489,7 +1489,7 @@ static void computeKnownBitsFromOperator(const Operator *I,
   }
   case Instruction::PHI: {
     const PHINode *P = cast<PHINode>(I);
-    BinaryOperator *BO = nullptr;
+    Instruction *BO = nullptr;
     Value *R = nullptr, *L = nullptr;
     if (matchSimpleRecurrence(P, BO, R, L)) {
       // Handle the case of a simple two-predecessor recurrence PHI.
@@ -1553,6 +1553,7 @@ static void computeKnownBitsFromOperator(const Operator *I,
       case Instruction::Sub:
       case Instruction::And:
       case Instruction::Or:
+      case Instruction::GetElementPtr:
       case Instruction::Mul: {
         // Change the context instruction to the "edge" that flows into the
         // phi. This is important because that is where the value is actually
@@ -1571,12 +1572,21 @@ static void computeKnownBitsFromOperator(const Operator *I,
 
         // We need to take the minimum number of known bits
         KnownBits Known3(BitWidth);
+        if (BitWidth != getBitWidth(L->getType(), Q.DL)) {
+          assert(isa<GetElementPtrInst>(BO) &&
+                 "Bitwidth should only be different for GEPs.");
+          break;
+        }
         RecQ.CxtI = LInst;
         computeKnownBits(L, DemandedElts, Known3, Depth + 1, RecQ);
 
         Known.Zero.setLowBits(std::min(Known2.countMinTrailingZeros(),
                                        Known3.countMinTrailingZeros()));
 
+        // Don't apply logic below for GEPs.
+        if (isa<GetElementPtrInst>(BO))
+          break;
+
         auto *OverflowOp = dyn_cast<OverflowingBinaryOperator>(BO);
         if (!OverflowOp || !Q.IIQ.hasNoSignedWrap(OverflowOp))
           break;
@@ -1737,6 +1747,7 @@ static void computeKnownBitsFromOperator(const Operator *I,
           Known.resetAll();
       }
     }
+
     if (const IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
       switch (II->getIntrinsicID()) {
       default:
@@ -2270,7 +2281,7 @@ void computeKnownBits(const Value *V, const APInt &DemandedElts,
 /// always a power of two (or zero).
 static bool isPowerOfTwoRecurrence(const PHINode *PN, bool OrZero,
                                    unsigned Depth, SimplifyQuery &Q) {
-  BinaryOperator *BO = nullptr;
+  Instruction *BO = nullptr;
   Value *Start = nullptr, *Step = nullptr;
   if (!matchSimpleRecurrence(PN, BO, Start, Step))
     return false;
@@ -2308,7 +2319,7 @@ static bool isPowerOfTwoRecurrence(const PHINode *PN, bool OrZero,
     // Divisor must be a power of two.
     // If OrZero is false, cannot guarantee induction variable is non-zero after
     // division, same for Shr, unless it is exact division.
-    return (OrZero || Q.IIQ.isExact(BO)) &&
+    return (OrZero || Q.IIQ.isExact(cast<BinaryOperator>(BO))) &&
            isKnownToBeAPowerOfTwo(Step, false, Depth, Q);
   case Instruction::Shl:
     return OrZero || Q.IIQ.hasNoUnsignedWrap(BO) || Q.IIQ.hasNoSignedWrap(BO);
@@ -2317,7 +2328,7 @@ static bool isPowerOfTwoRecurrence(const PHINode *PN, bool OrZero,
       return false;
     [[fallthrough]];
   case Instruction::LShr:
-    return OrZero || Q.IIQ.isExact(BO);
+    return OrZero || Q.IIQ.isExact(cast<BinaryOperator>(BO));
   default:
     return false;
   }
@@ -2727,7 +2738,7 @@ static bool rangeMetadataExcludesValue(const MDNode* Ranges, const APInt& Value)
 /// Try to detect a recurrence that monotonically increases/decreases from a
 /// non-zero starting value. These are common as induction variables.
 static bool isNonZeroRecurrence(const PHINode *PN) {
-  BinaryOperator *BO = nullptr;
+  Instruction *BO = nullptr;
   Value *Start = nullptr, *Step = nullptr;
   const APInt *StartC, *StepC;
   if (!matchSimpleRecurrence(PN, BO, Start, Step) ||
@@ -3560,9 +3571,9 @@ getInvertibleOperands(const Operator *Op1,
     // If PN1 and PN2 are both recurrences, can we prove the entire recurrences
     // are a single invertible function of the start values? Note that repeated
     // application of an invertible function is also invertible
-    BinaryOperator *BO1 = nullptr;
+    Instruction *BO1 = nullptr;
     Value *Start1 = nullptr, *Step1 = nullptr;
-    BinaryOperator *BO2 = nullptr;
+    Instruction *BO2 = nullptr;
     Value *Start2 = nullptr, *Step2 = nullptr;
     if (PN1->getParent() != PN2->getParent() ||
         !matchSimpleRecurrence(PN1, BO1, Start1, Step1) ||
@@ -9199,6 +9210,17 @@ llvm::canConvertToMinOrMaxIntrinsic(ArrayRef<Value *> VL) {
 
 bool llvm::matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO,
                                  Value *&Start, Value *&Step) {
+  Instruction *I;
+  if (matchSimpleRecurrence(P, I, Start, Step)) {
+    BO = dyn_cast<BinaryOperator>(I);
+    if (BO)
+      return true;
+  }
+  return false;
+}
+
+bool llvm::matchSimpleRecurrence(const PHINode *P, Instruction *&BO,
+                                 Value *&Start, Value *&Step) {
   // Handle the case of a simple two-predecessor recurrence PHI.
   // There's a lot more that could theoretically be done here, but
   // this is sufficient to catch some interesting cases.
@@ -9208,7 +9230,7 @@ bool llvm::matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO,
   for (unsigned i = 0; i != 2; ++i) {
     Value *L = P->getIncomingValue(i);
     Value *R = P->getIncomingValue(!i);
-    auto *LU = dyn_cast<BinaryOperator>(L);
+    auto *LU = dyn_cast<Instruction>(L);
     if (!LU)
       continue;
     unsigned Opcode = LU->getOpcode();
@@ -9240,6 +9262,21 @@ bool llvm::matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO,
 
       break; // Match!
     }
+    case Instruction::GetElementPtr: {
+      if (LU->getNumOperands() != 2 ||
+          !cast<GetElementPtrInst>(L)->getSourceElementType()->isIntegerTy(8))
+        continue;
+
+      Value *LL = LU->getOperand(0);
+      Value *LR = LU->getOperand(1);
+      // Find a recurrence.
+      if (LL == P) {
+        // Found a match
+        L = LR;
+        break;
+      }
+      continue;
+    }
     };
 
     // We have matched a recurrence of the form:
@@ -9256,9 +9293,9 @@ bool llvm::matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO,
   return false;
 }
 
-bool llvm::matchSimpleRecurrence(const BinaryOperator *I, PHINode *&P,
+bool llvm::matchSimpleRecurrence(const Instruction *I, PHINode *&P,
                                  Value *&Start, Value *&Step) {
-  BinaryOperator *BO = nullptr;
+  Instruction *BO = nullptr;
   P = dyn_cast<PHINode>(I->getOperand(0));
   if (!P)
     P = dyn_cast<PHINode>(I->getOperand(1));
diff --git a/llvm/test/Transforms/InferAlignment/gep-recurrence.ll b/llvm/test/Transforms/InferAlignment/gep-recurrence.ll
new file mode 100644
index 00000000000000..f51875adcd862f
--- /dev/null
+++ b/llvm/test/Transforms/InferAlignment/gep-recurrence.ll
@@ -0,0 +1,574 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 2
+; RUN: opt < %s -passes=infer-alignment -S | FileCheck %s
+
+target datalayout = "p1:64:64:64:32"
+
+declare i1 @cond()
+
+define void @test_recur_i8_128(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_128
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    store i64 0, ptr [[IV]], align 128
+; CHECK-NEXT:    [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 128
+; CHECK-NEXT:    [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT:    br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+  store i64 0, ptr %iv, align 1
+  %iv.next = getelementptr inbounds i8, ptr %iv, i64 128
+  %c = call i1 @cond()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test_recur_i8_128_no_inbounds(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_128_no_inbounds
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    store i64 0, ptr [[IV]], align 128
+; CHECK-NEXT:    [[IV_NEXT]] = getelementptr i8, ptr [[IV]], i64 128
+; CHECK-NEXT:    [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT:    br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+  store i64 0, ptr %iv, align 1
+  %iv.next = getelementptr i8, ptr %iv, i64 128
+  %c = call i1 @cond()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test_recur_i8_64(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_64
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    store i64 0, ptr [[IV]], align 64
+; CHECK-NEXT:    [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 64
+; CHECK-NEXT:    [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT:    br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+  store i64 0, ptr %iv, align 1
+  %iv.next = getelementptr inbounds i8, ptr %iv, i64 64
+  %c = call i1 @cond()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test_recur_i8_63(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_63
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    store i64 0, ptr [[IV]], align 1
+; CHECK-NEXT:    [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 63
+; CHECK-NEXT:    [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT:    br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+  store i64 0, ptr %iv, align 1
+  %iv.next = getelementptr inbounds i8, ptr %iv, i64 63
+  %c = call i1 @cond()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test_recur_i8_32(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_32
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    store i64 0, ptr [[IV]], align 32
+; CHECK-NEXT:    [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 32
+; CHECK-NEXT:    [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT:    br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+  store i64 0, ptr %iv, align 1
+  %iv.next = getelementptr inbounds i8, ptr %iv, i64 32
+  %c = call i1 @cond()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test_recur_i8_16(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_16
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    store i64 0, ptr [[IV]], align 16
+; CHECK-NEXT:    [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 16
+; CHECK-NEXT:    [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT:    br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+  store i64 0, ptr %iv, align 1
+  %iv.next = getelementptr inbounds i8, ptr %iv, i64 16
+  %c = call i1 @cond()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test_recur_i8_8(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_8
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    store i64 0, ptr [[IV]], align 8
+; CHECK-NEXT:    [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 8
+; CHECK-NEXT:    [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT:    br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+  store i64 0, ptr %iv, align 1
+  %iv.next = getelementptr inbounds i8, ptr %iv, i64 8
+  %c = call i1 @cond()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test_recur_i8_4(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_4
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    store i64 0, ptr [[IV]], align 4
+; CHECK-NEXT:    [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 4
+; CHECK-NEXT:    [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT:    br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+  store i64 0, ptr %iv, align 1
+  %iv.next = getelementptr inbounds i8, ptr %iv, i64 4
+  %c = call i1 @cond()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test_recur_i8_2(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_2
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    store i64 0, ptr [[IV]], align 2
+; CHECK-NEXT:    [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 2
+; CHECK-NEXT:    [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT:    br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+  store i64 0, ptr %iv, align 1
+  %iv.next = getelementptr inbounds i8, ptr %iv, i64 2
+  %c = call i1 @cond()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test_recur_i8_1(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_1
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    store i64 0, ptr [[IV]], align 1
+; CHECK-NEXT:    [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 1
+; CHECK-NEXT:    [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT:    br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+  store i64 0, ptr %iv, align 1
+  %iv.next = getelementptr inbounds i8, ptr %iv, i64 1
+  %c = call i1 @cond()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test_recur_i8_unknown_step(ptr align 128 %dst, i64 %off) {
+; CHECK-LABEL: define void @test_recur_i8_unknown_step
+; CHECK-SAME: (ptr align 128 [[DST:%.*]], i64 [[OFF:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    store i64 0, ptr [[IV]], align 1
+; CHECK-NEXT:    [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 [[OFF]]
+; CHECK-NEXT:    [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT:    br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+  store i64 0, ptr %iv, align 1
+  %iv.next = getelementptr inbounds i8, ptr %iv, i64 %off
+  %c = call i1 @cond()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test_recur_i8_step_known_multiple(ptr align 128 %dst, i64 %off) {
+; CHECK-LABEL: define void @test_recur_i8_step_known_multiple
+; CHECK-SAME: (ptr align 128 [[DST:%.*]], i64 [[OFF:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[UREM:%.*]] = urem i64 [[OFF]], 128
+; CHECK-NEXT:    [[C_UREM:%.*]] = icmp eq i64 [[UREM]], 0
+; CHECK-NEXT:    [[C_POS:%.*]] = icmp sge i64 [[OFF]], 0
+; CHECK-NEXT:    [[AND:%.*]] = and i1 [[C_UREM]], [[C_POS]]
+; CHECK-NEXT:    br i1 [[AND]], label [[LOOP:%.*]], label [[EXIT:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    store i64 0, ptr [[IV]], align 1
+; CHECK-NEXT:    [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 [[OFF]]
+; CHECK-NEXT:    [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT:    br i1 [[C]], label [[LOOP]], label [[EXIT]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  %urem = urem i64 %off, 128
+  %c.urem = icmp eq i64 %urem, 0
+  %c.pos = icmp sge i64 %off, 0
+  %and = and i1 %c.urem, %c.pos
+  br i1 %and, label %loop, label %exit
+
+loop:
+  %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+  store i64 0, ptr %iv, align 1
+  %iv.next = getelementptr inbounds i8, ptr %iv, i64 %off
+  %c = call i1 @cond()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test_recur_i8_i16_128(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_i16_128
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    store i64 0, ptr [[IV]], align 1
+; CHECK-NEXT:    [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i16 128
+; CHECK-NEXT:    [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT:    br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+  store i64 0, ptr %iv, align 1
+  %iv.next = getelementptr inbounds i8, ptr %iv, i16 128
+  %c = call i1 @cond()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test_recur_i8_i8_132(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_i8_132
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHEC...
[truncated]

@llvmbot
Member

llvmbot commented Jan 20, 2025

@llvm/pr-subscribers-llvm-transforms


@@ -1489,7 +1489,7 @@ static void computeKnownBitsFromOperator(const Operator *I,
   }
   case Instruction::PHI: {
     const PHINode *P = cast<PHINode>(I);
-    BinaryOperator *BO = nullptr;
+    Instruction *BO = nullptr;
Contributor

@goldsteinn goldsteinn Jan 20, 2025

I think BO should be renamed, though that's probably not worth it given the large number of unrelated diffs it would create.

@fhahn (Contributor Author)

I could rename it as NFC after the change lands?

@@ -9240,6 +9262,21 @@ bool llvm::matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO,

       break; // Match!
     }
+    case Instruction::GetElementPtr: {
+      if (LU->getNumOperands() != 2 ||
+          !cast<GetElementPtrInst>(L)->getSourceElementType()->isIntegerTy(8))
Contributor

Could you just match LU with m_PtrAdd?
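A rough sketch of the idea (illustration only, not the committed code; the helper name matchPtrAddStep is made up here): m_PtrAdd matches a GEP with an i8 source element type and a single index, so the manual getNumOperands()/getSourceElementType() checks can be folded into the pattern match.

#include "llvm/IR/Instructions.h"
#include "llvm/IR/PatternMatch.h"
using namespace llvm;
using namespace llvm::PatternMatch;

// Sketch: recognize the i8-GEP step of a pointer recurrence on PHI P, i.e.
//   %next = getelementptr i8, ptr %P, <Step>
static bool matchPtrAddStep(Instruction *LU, const PHINode *P, Value *&Step) {
  Value *Base = nullptr, *Offset = nullptr;
  if (!match(LU, m_PtrAdd(m_Value(Base), m_Value(Offset))) || Base != P)
    return false;
  Step = Offset;
  return true;
}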

@fhahn (Contributor Author)

Done thanks

@fhahn
Contributor Author

fhahn commented Jan 21, 2025

llvm-opt-benchmark results: dtcxzyw/llvm-opt-benchmark#1982

Stronger alignment in many cases

fhahn added 3 commits January 21, 2025 11:12
Update matchSimpleRecurrence to also support GEPs. This allows inferring
larger alignments in a number of cases.

I noticed that we fail to infer alignments from calls when dropping
assumptions; inferring alignment from assumptions uses SCEV, so if we drop
an assume for an aligned function return value, we fail to infer the
better alignment in InferAlignment without this patch.

It comes with a bit of a compile-time impact:

stage1-O3: +0.05%
stage1-ReleaseThinLTO: +0.04%
stage1-ReleaseLTO-g: +0.03%
stage1-O0-g: -0.04%
stage2-O3: +0.04%
stage2-O0-g: +0.02%
stage2-clang: +0.03%

https://llvm-compile-time-tracker.com/compare.php?from=a8c60790fd4f70a461113f0721bdb4a114ddf420&to=9a207c52e9c644691573a40ceb5b89a3c09ab609&stat=instructions:u