[AMDGPU] Respect MBB alignment in the getFunctionCodeSize() #127142

rampitec · 2025-02-13T22:47:18Z

No description provided.

rampitec · 2025-02-13T22:47:35Z

[AMDGPU] Respect MBB alignment in the getFunctionCodeSize() #127142 (this PR)
[AMDGPU] Early bail in getFunctionCodeSize for meta inst. NFC. #127129
[AMDGPU] Move into SIProgramInfo and cache getFunctionCodeSize. NFCI. #127111 (2 other dependent PRs: #126981, #127246)
main

This stack of pull requests is managed by Graphite.

llvmbot · 2025-02-13T22:48:42Z

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/127142.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/SIProgramInfo.cpp (+2)
(modified) llvm/test/CodeGen/AMDGPU/code-size-estimate.mir (+89)

diff --git a/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp b/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp
index b995687e71780..9d9b4c83ac388 100644
--- a/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp
@@ -212,6 +212,8 @@ uint64_t SIProgramInfo::getFunctionCodeSize(const MachineFunction &MF) {
   uint64_t CodeSize = 0;
 
   for (const MachineBasicBlock &MBB : MF) {
+    CodeSize = alignTo(CodeSize, MBB.getAlignment());
+
     for (const MachineInstr &MI : MBB) {
       // TODO: CodeSize should account for multiple functions.
 
diff --git a/llvm/test/CodeGen/AMDGPU/code-size-estimate.mir b/llvm/test/CodeGen/AMDGPU/code-size-estimate.mir
index 76eaf350301e4..9ae536af6f0e9 100644
--- a/llvm/test/CodeGen/AMDGPU/code-size-estimate.mir
+++ b/llvm/test/CodeGen/AMDGPU/code-size-estimate.mir
@@ -31,3 +31,92 @@ body:             |
 
   WAVE_BARRIER
 ...
+
+# CHECK: align4:                                 ; @align4
+# CHECK: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
+# CHECK: s_cbranch_scc1 .LBB{{[0-9_]+}}          ; encoding: [A,A,0x85,0xbf]
+# CHECK: s_barrier                               ; encoding: [0x00,0x00,0x8a,0xbf]
+# CHECK: .p2align        2
+# CHECK: s_endpgm                                ; encoding: [0x00,0x00,0x81,0xbf]
+# CHECK: ; codeLenInByte = 16
+
+---
+name:            align4
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    $scc = IMPLICIT_DEF
+    S_CBRANCH_SCC1 %bb.2, implicit $scc
+
+  bb.1:
+    S_BARRIER
+
+  bb.2 (align 4):
+    S_ENDPGM 0
+...
+
+# CHECK: align8:                                 ; @align8
+# CHECK: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
+# CHECK: s_cbranch_scc1 .LBB{{[0-9_]+}}          ; encoding: [A,A,0x85,0xbf]
+# CHECK: s_barrier                               ; encoding: [0x00,0x00,0x8a,0xbf]
+# CHECK: .p2align        3
+# CHECK: s_endpgm                                ; encoding: [0x00,0x00,0x81,0xbf]
+# CHECK: ; codeLenInByte = 20
+---
+name:            align8
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    $scc = IMPLICIT_DEF
+    S_CBRANCH_SCC1 %bb.2, implicit $scc
+
+  bb.1:
+    S_BARRIER
+
+  bb.2 (align 8):
+    S_ENDPGM 0
+...
+
+# CHECK: align16:                                ; @align16
+# CHECK: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
+# CHECK: s_cbranch_scc1 .LBB{{[0-9_]+}}          ; encoding: [A,A,0x85,0xbf]
+# CHECK: s_barrier                               ; encoding: [0x00,0x00,0x8a,0xbf]
+# CHECK: .p2align        4
+# CHECK: s_endpgm                                ; encoding: [0x00,0x00,0x81,0xbf]
+# CHECK: ; codeLenInByte = 20
+---
+name:            align16
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    $scc = IMPLICIT_DEF
+    S_CBRANCH_SCC1 %bb.2, implicit $scc
+
+  bb.1:
+    S_BARRIER
+
+  bb.2 (align 16):
+    S_ENDPGM 0
+...
+
+# CHECK: align32:                                ; @align32
+# CHECK: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
+# CHECK: s_cbranch_scc1 .LBB{{[0-9_]+}}          ; encoding: [A,A,0x85,0xbf]
+# CHECK: s_barrier                               ; encoding: [0x00,0x00,0x8a,0xbf]
+# CHECK: .p2align        5
+# CHECK: s_endpgm                                ; encoding: [0x00,0x00,0x81,0xbf]
+# CHECK: ; codeLenInByte = 36
+---
+name:            align32
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    $scc = IMPLICIT_DEF
+    S_CBRANCH_SCC1 %bb.2, implicit $scc
+
+  bb.1:
+    S_BARRIER
+
+  bb.2 (align 32):
+    S_ENDPGM 0
+...
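
The expected `codeLenInByte` values in the new tests follow directly from rounding the running size up to each block's alignment. A minimal standalone sketch of that arithmetic is below. It does not use LLVM: `alignToPow2`, `Block`, `estimateCodeSize`, and `testFunctionSize` are hypothetical names introduced here for illustration, and the sketch assumes every instruction in the tests encodes to 4 bytes, as the CHECK lines show.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for llvm::alignTo, valid only for power-of-two
// alignments: rounds Size up to the next multiple of Align.
constexpr uint64_t alignToPow2(uint64_t Size, uint64_t Align) {
  return (Size + Align - 1) & ~(Align - 1);
}

// Hypothetical model of a basic block: its alignment in bytes and the
// encoded size of each of its instructions.
struct Block {
  uint64_t Align;
  std::vector<uint64_t> InstSizes;
};

// Same loop shape as the patched function: pad to the block's alignment
// first, then add the size of every instruction in the block.
uint64_t estimateCodeSize(const std::vector<Block> &Blocks) {
  uint64_t CodeSize = 0;
  for (const Block &B : Blocks) {
    CodeSize = alignToPow2(CodeSize, B.Align);
    for (uint64_t Size : B.InstSizes)
      CodeSize += Size;
  }
  return CodeSize;
}

// Shape of the alignN tests as emitted: s_waitcnt and s_cbranch_scc1 first,
// then s_barrier, then s_endpgm in a block aligned to LastAlign bytes.
uint64_t testFunctionSize(uint64_t LastAlign) {
  return estimateCodeSize({{1, {4, 4}}, {1, {4}}, {LastAlign, {4}}});
}
```

With 12 bytes emitted before the aligned block, `testFunctionSize(4)` gives 16, `testFunctionSize(8)` and `testFunctionSize(16)` both give 20, and `testFunctionSize(32)` gives 36, matching the four `codeLenInByte` checks.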

llvm/lib/Target/AMDGPU/SIProgramInfo.cpp

rampitec · 2025-02-17T18:35:07Z

Which one do you prefer, this or #127246? They are mutually exclusive.

arsenm · 2025-02-18T01:25:24Z

> Which one do you prefer, this or #127246? They are mutually exclusive.

They're not really. This one is the incremental step which adds the test; #127246 is the final form.

rampitec · 2025-02-18T01:33:11Z

> > Which one do you prefer, this or #127246? They are mutually exclusive.
>
> They're not really. This one is the incremental step which adds the test; #127246 is the final form.

The test is meaningless if we overestimate. That is, I have carefully set the margins in the test. If we go with 'full align - 1', it does not make any sense to do it.
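
To illustrate the concern, here is a rough sketch comparing exact padding with a conservative bound. It assumes 'full align - 1' means charging the worst-case padding of `Align - 1` bytes for every aligned block; that reading, and the helper names `exactSize` and `worstCaseSize`, are assumptions made for illustration and are not taken from #127246.

```cpp
#include <cassert>
#include <cstdint>

// Exact estimate: round the known start offset up to the (power-of-two)
// alignment, then add the bytes of the aligned block.
constexpr uint64_t exactSize(uint64_t Offset, uint64_t Align,
                             uint64_t BlockBytes) {
  return ((Offset + Align - 1) & ~(Align - 1)) + BlockBytes;
}

// Conservative estimate (assumed meaning of 'full align - 1'): always
// charge the maximum possible padding, regardless of the start offset.
constexpr uint64_t worstCaseSize(uint64_t Offset, uint64_t Align,
                                 uint64_t BlockBytes) {
  return Offset + (Align - 1) + BlockBytes;
}
```

For the test layout (12 bytes before the aligned block, 4 bytes in it), the exact sizes for alignments 4, 8, 16, and 32 are 16, 20, 20, and 36, while the conservative bound under this assumption would be 19, 23, 31, and 47. The offsets chosen in the test only distinguish the exact rounding; an overestimate would not reproduce the checked values.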

rampitec · 2025-02-18T01:57:23Z

And in any case it is moot until the baseline change is accepted.

rampitec mentioned this pull request Feb 13, 2025

[AMDGPU] Early bail in getFunctionCodeSize for meta inst. NFC. #127129

Merged

rampitec requested a review from arsenm February 13, 2025 22:47

rampitec marked this pull request as ready for review February 13, 2025 22:48

llvmbot added the backend:AMDGPU label Feb 13, 2025

rampitec mentioned this pull request Feb 13, 2025

[AMDGPU] Set inst_pref_size to maximum #126981

Open

efriedma-quic reviewed Feb 13, 2025

llvm/lib/Target/AMDGPU/SIProgramInfo.cpp

rampitec force-pushed the users/rampitec/02-13-_amdgpu_respect_mbb_alignment_in_the_getfunctioncodesize_ branch from d01d168 to 63e9a99 Compare February 14, 2025 19:00

arsenm approved these changes Feb 17, 2025

rampitec force-pushed the users/rampitec/02-13-_amdgpu_respect_mbb_alignment_in_the_getfunctioncodesize_ branch from 63e9a99 to b574a4b Compare February 18, 2025 08:45

rampitec force-pushed the users/rampitec/02-13-_amdgpu_early_bail_in_getfunctioncodesize_for_meta_inst._nfc branch from c048954 to faf1cf6 Compare February 18, 2025 08:45

Base automatically changed from users/rampitec/02-13-_amdgpu_early_bail_in_getfunctioncodesize_for_meta_inst._nfc to main February 18, 2025 10:08

[AMDGPU] Respect MBB alignment in the getFunctionCodeSize()

5f35e2c

rampitec force-pushed the users/rampitec/02-13-_amdgpu_respect_mbb_alignment_in_the_getfunctioncodesize_ branch from b574a4b to 5f35e2c Compare February 18, 2025 10:35

rampitec merged commit 8529bd7 into main Feb 18, 2025
8 checks passed

rampitec deleted the users/rampitec/02-13-_amdgpu_respect_mbb_alignment_in_the_getfunctioncodesize_ branch February 18, 2025 21:19

wldfngrs pushed a commit to wldfngrs/llvm-project that referenced this pull request Feb 19, 2025

[AMDGPU] Respect MBB alignment in the getFunctionCodeSize() (llvm#127142)

02d00fd
