[AMDGPU] Respect MBB alignment in the getFunctionCodeSize() #127142
@llvm/pr-subscribers-backend-amdgpu
Author: Stanislav Mekhanoshin (rampitec)
Changes
Full diff: https://github.com/llvm/llvm-project/pull/127142.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp b/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp
index b995687e71780..9d9b4c83ac388 100644
--- a/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp
@@ -212,6 +212,8 @@ uint64_t SIProgramInfo::getFunctionCodeSize(const MachineFunction &MF) {
uint64_t CodeSize = 0;
for (const MachineBasicBlock &MBB : MF) {
+ CodeSize = alignTo(CodeSize, MBB.getAlignment());
+
for (const MachineInstr &MI : MBB) {
// TODO: CodeSize should account for multiple functions.
diff --git a/llvm/test/CodeGen/AMDGPU/code-size-estimate.mir b/llvm/test/CodeGen/AMDGPU/code-size-estimate.mir
index 76eaf350301e4..9ae536af6f0e9 100644
--- a/llvm/test/CodeGen/AMDGPU/code-size-estimate.mir
+++ b/llvm/test/CodeGen/AMDGPU/code-size-estimate.mir
@@ -31,3 +31,92 @@ body: |
WAVE_BARRIER
...
+
+# CHECK: align4: ; @align4
+# CHECK: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
+# CHECK: s_cbranch_scc1 .LBB{{[0-9_]+}} ; encoding: [A,A,0x85,0xbf]
+# CHECK: s_barrier ; encoding: [0x00,0x00,0x8a,0xbf]
+# CHECK: .p2align 2
+# CHECK: s_endpgm ; encoding: [0x00,0x00,0x81,0xbf]
+# CHECK: ; codeLenInByte = 16
+
+---
+name: align4
+tracksRegLiveness: true
+body: |
+ bb.0:
+ $scc = IMPLICIT_DEF
+ S_CBRANCH_SCC1 %bb.2, implicit $scc
+
+ bb.1:
+ S_BARRIER
+
+ bb.2 (align 4):
+ S_ENDPGM 0
+...
+
+# CHECK: align8: ; @align8
+# CHECK: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
+# CHECK: s_cbranch_scc1 .LBB{{[0-9_]+}} ; encoding: [A,A,0x85,0xbf]
+# CHECK: s_barrier ; encoding: [0x00,0x00,0x8a,0xbf]
+# CHECK: .p2align 3
+# CHECK: s_endpgm ; encoding: [0x00,0x00,0x81,0xbf]
+# CHECK: ; codeLenInByte = 20
+---
+name: align8
+tracksRegLiveness: true
+body: |
+ bb.0:
+ $scc = IMPLICIT_DEF
+ S_CBRANCH_SCC1 %bb.2, implicit $scc
+
+ bb.1:
+ S_BARRIER
+
+ bb.2 (align 8):
+ S_ENDPGM 0
+...
+
+# CHECK: align16: ; @align16
+# CHECK: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
+# CHECK: s_cbranch_scc1 .LBB{{[0-9_]+}} ; encoding: [A,A,0x85,0xbf]
+# CHECK: s_barrier ; encoding: [0x00,0x00,0x8a,0xbf]
+# CHECK: .p2align 4
+# CHECK: s_endpgm ; encoding: [0x00,0x00,0x81,0xbf]
+# CHECK: ; codeLenInByte = 20
+---
+name: align16
+tracksRegLiveness: true
+body: |
+ bb.0:
+ $scc = IMPLICIT_DEF
+ S_CBRANCH_SCC1 %bb.2, implicit $scc
+
+ bb.1:
+ S_BARRIER
+
+ bb.2 (align 16):
+ S_ENDPGM 0
+...
+
+# CHECK: align32: ; @align32
+# CHECK: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
+# CHECK: s_cbranch_scc1 .LBB{{[0-9_]+}} ; encoding: [A,A,0x85,0xbf]
+# CHECK: s_barrier ; encoding: [0x00,0x00,0x8a,0xbf]
+# CHECK: .p2align 5
+# CHECK: s_endpgm ; encoding: [0x00,0x00,0x81,0xbf]
+# CHECK: ; codeLenInByte = 36
+---
+name: align32
+tracksRegLiveness: true
+body: |
+ bb.0:
+ $scc = IMPLICIT_DEF
+ S_CBRANCH_SCC1 %bb.2, implicit $scc
+
+ bb.1:
+ S_BARRIER
+
+ bb.2 (align 32):
+ S_ENDPGM 0
+...
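The `codeLenInByte` values expected by the new tests follow directly from the one-line change. Below is a minimal standalone sketch of that arithmetic, not the LLVM implementation. It assumes every instruction in these tests encodes to 4 bytes (as the `encoding:` comments show) and that blocks without an `align` annotation have alignment 1. `BlockSketch`, `alignToSketch`, `estimateCodeSize`, and `makeTestLayout` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a machine basic block: its alignment in bytes
// (a power of two) and the encoded size of each instruction it contains.
struct BlockSketch {
  uint64_t AlignBytes;
  std::vector<uint64_t> InstrSizes;
};

// Round Value up to the next multiple of Align (Align must be a power of two).
inline uint64_t alignToSketch(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return (Value + Align - 1) & ~(Align - 1);
}

// Mirrors the shape of the patched loop: pad to the block's alignment first,
// then add the sizes of the block's instructions.
inline uint64_t estimateCodeSize(const std::vector<BlockSketch> &Blocks) {
  uint64_t CodeSize = 0;
  for (const BlockSketch &B : Blocks) {
    CodeSize = alignToSketch(CodeSize, B.AlignBytes);
    for (uint64_t Size : B.InstrSizes)
      CodeSize += Size;
  }
  return CodeSize;
}

// Assumed layout of the test functions:
//   bb.0: s_waitcnt + s_cbranch_scc1 (4 + 4 bytes)
//   bb.1: s_barrier                  (4 bytes)
//   bb.2: s_endpgm                   (4 bytes), aligned to LastBlockAlign
inline std::vector<BlockSketch> makeTestLayout(uint64_t LastBlockAlign) {
  return {{1, {4, 4}}, {1, {4}}, {LastBlockAlign, {4}}};
}
```

Under these assumptions, `estimateCodeSize(makeTestLayout(A))` yields 16, 20, 20, and 36 for `A` = 4, 8, 16, and 32, matching the `codeLenInByte` lines in the CHECK directives. The first three blocks' instructions occupy 12 bytes, so `bb.2` starts at offset 12, 16, 16, and 32 respectively.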
Which one do you prefer, this or #127246? They are mutually exclusive.
The test is meaningless if we overestimate: I have carefully set the margins in the test. If we go with 'full align - 1', the test no longer makes sense.
And in any case it is moot until the baseline change is accepted.
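To make the 'full align - 1' remark concrete, here is a small sketch contrasting exact padding with a worst-case bound. The source does not show what #127246 does; this assumes 'full align - 1' means adding `Align - 1` bytes for each aligned block regardless of the current offset. `exactPadding`, `worstCasePadding`, `exactTotal`, and `worstCaseTotal` are hypothetical helpers, not LLVM functions.

```cpp
#include <cstdint>

// Exact padding: bytes needed to bring Offset up to a multiple of Align.
// Align must be nonzero.
inline uint64_t exactPadding(uint64_t Offset, uint64_t Align) {
  return (Align - Offset % Align) % Align;
}

// Worst-case padding (assumed meaning of 'full align - 1'): the largest
// padding any offset could require for this alignment.
inline uint64_t worstCasePadding(uint64_t Align) { return Align - 1; }

// Total size of Prefix bytes of code followed by an aligned block of
// BlockSize bytes, using exact padding.
inline uint64_t exactTotal(uint64_t Prefix, uint64_t Align, uint64_t BlockSize) {
  return Prefix + exactPadding(Prefix, Align) + BlockSize;
}

// The same total using the worst-case bound instead.
inline uint64_t worstCaseTotal(uint64_t Prefix, uint64_t Align,
                               uint64_t BlockSize) {
  return Prefix + worstCasePadding(Align) + BlockSize;
}
```

With the 12-byte prefix and 4-byte final block from the tests in this PR, the exact totals are 16, 20, 20, and 36 bytes for alignments 4, 8, 16, and 32, while the worst-case totals would be 19, 23, 31, and 47. A test with margins tuned to the exact figures would not hold against the larger estimates.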