[AMDGPU] Respect MBB alignment in the getFunctionCodeSize() #127142


rampitec (Collaborator, Author) commented Feb 13, 2025

No description provided.

llvmbot (Member) commented Feb 13, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/127142.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIProgramInfo.cpp (+2)
  • (modified) llvm/test/CodeGen/AMDGPU/code-size-estimate.mir (+89)
diff --git a/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp b/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp
index b995687e71780..9d9b4c83ac388 100644
--- a/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIProgramInfo.cpp
@@ -212,6 +212,8 @@ uint64_t SIProgramInfo::getFunctionCodeSize(const MachineFunction &MF) {
   uint64_t CodeSize = 0;
 
   for (const MachineBasicBlock &MBB : MF) {
+    CodeSize = alignTo(CodeSize, MBB.getAlignment());
+
     for (const MachineInstr &MI : MBB) {
       // TODO: CodeSize should account for multiple functions.
 
diff --git a/llvm/test/CodeGen/AMDGPU/code-size-estimate.mir b/llvm/test/CodeGen/AMDGPU/code-size-estimate.mir
index 76eaf350301e4..9ae536af6f0e9 100644
--- a/llvm/test/CodeGen/AMDGPU/code-size-estimate.mir
+++ b/llvm/test/CodeGen/AMDGPU/code-size-estimate.mir
@@ -31,3 +31,92 @@ body:             |
 
   WAVE_BARRIER
 ...
+
+# CHECK: align4:                                 ; @align4
+# CHECK: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
+# CHECK: s_cbranch_scc1 .LBB{{[0-9_]+}}          ; encoding: [A,A,0x85,0xbf]
+# CHECK: s_barrier                               ; encoding: [0x00,0x00,0x8a,0xbf]
+# CHECK: .p2align        2
+# CHECK: s_endpgm                                ; encoding: [0x00,0x00,0x81,0xbf]
+# CHECK: ; codeLenInByte = 16
+
+---
+name:            align4
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    $scc = IMPLICIT_DEF
+    S_CBRANCH_SCC1 %bb.2, implicit $scc
+
+  bb.1:
+    S_BARRIER
+
+  bb.2 (align 4):
+    S_ENDPGM 0
+...
+
+# CHECK: align8:                                 ; @align8
+# CHECK: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
+# CHECK: s_cbranch_scc1 .LBB{{[0-9_]+}}          ; encoding: [A,A,0x85,0xbf]
+# CHECK: s_barrier                               ; encoding: [0x00,0x00,0x8a,0xbf]
+# CHECK: .p2align        3
+# CHECK: s_endpgm                                ; encoding: [0x00,0x00,0x81,0xbf]
+# CHECK: ; codeLenInByte = 20
+---
+name:            align8
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    $scc = IMPLICIT_DEF
+    S_CBRANCH_SCC1 %bb.2, implicit $scc
+
+  bb.1:
+    S_BARRIER
+
+  bb.2 (align 8):
+    S_ENDPGM 0
+...
+
+# CHECK: align16:                                ; @align16
+# CHECK: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
+# CHECK: s_cbranch_scc1 .LBB{{[0-9_]+}}          ; encoding: [A,A,0x85,0xbf]
+# CHECK: s_barrier                               ; encoding: [0x00,0x00,0x8a,0xbf]
+# CHECK: .p2align        4
+# CHECK: s_endpgm                                ; encoding: [0x00,0x00,0x81,0xbf]
+# CHECK: ; codeLenInByte = 20
+---
+name:            align16
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    $scc = IMPLICIT_DEF
+    S_CBRANCH_SCC1 %bb.2, implicit $scc
+
+  bb.1:
+    S_BARRIER
+
+  bb.2 (align 16):
+    S_ENDPGM 0
+...
+
+# CHECK: align32:                                ; @align32
+# CHECK: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ; encoding: [0x00,0x00,0x8c,0xbf]
+# CHECK: s_cbranch_scc1 .LBB{{[0-9_]+}}          ; encoding: [A,A,0x85,0xbf]
+# CHECK: s_barrier                               ; encoding: [0x00,0x00,0x8a,0xbf]
+# CHECK: .p2align        5
+# CHECK: s_endpgm                                ; encoding: [0x00,0x00,0x81,0xbf]
+# CHECK: ; codeLenInByte = 36
+---
+name:            align32
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    $scc = IMPLICIT_DEF
+    S_CBRANCH_SCC1 %bb.2, implicit $scc
+
+  bb.1:
+    S_BARRIER
+
+  bb.2 (align 32):
+    S_ENDPGM 0
+...
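
The expected `codeLenInByte` values in the tests above follow from rounding the running size up to each block's alignment before adding that block's instructions. Below is a minimal standalone sketch of that arithmetic. It assumes the function starts at offset 0 and that each of the four emitted instructions is 4 bytes, as the encodings in the CHECK lines show. The `Block` struct and all helper names are hypothetical stand-ins for illustration, not LLVM's `MachineBasicBlock` or `alignTo`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a machine basic block: its required alignment
// (a power of two, in bytes) and the total size of its instructions.
struct Block {
  uint64_t Alignment;
  uint64_t Bytes;
};

// Round Size up to the next multiple of Alignment (must be a power of two).
uint64_t roundUpTo(uint64_t Size, uint64_t Alignment) {
  assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0);
  return (Size + Alignment - 1) & ~(Alignment - 1);
}

// Mirrors the shape of the patched loop: pad to the block's alignment,
// then add the size of the block's instructions.
uint64_t estimateCodeSize(const std::vector<Block> &Blocks) {
  uint64_t CodeSize = 0;
  for (const Block &B : Blocks) {
    CodeSize = roundUpTo(CodeSize, B.Alignment);
    CodeSize += B.Bytes;
  }
  return CodeSize;
}

// Layout used by the tests: bb.0 holds s_waitcnt + s_cbranch_scc1 (8 bytes),
// bb.1 holds s_barrier (4 bytes), and bb.2 holds s_endpgm (4 bytes) and
// carries the alignment under test.
uint64_t sizeWithFinalAlign(uint64_t Alignment) {
  return estimateCodeSize({{1, 8}, {1, 4}, {Alignment, 4}});
}
```

Under these assumptions, `sizeWithFinalAlign` gives 16, 20, 20, and 36 for alignments 4, 8, 16, and 32, which matches the `codeLenInByte` values in the CHECK lines.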

@rampitec rampitec force-pushed the users/rampitec/02-13-_amdgpu_respect_mbb_alignment_in_the_getfunctioncodesize_ branch from d01d168 to 63e9a99 Compare February 14, 2025 19:00
rampitec (Collaborator, Author) commented

Which one do you prefer, this or #127246? They are mutually exclusive.

arsenm (Contributor) commented Feb 18, 2025

> Which one do you prefer, this or #127246? They are mutually exclusive.

Not really. This one is the incremental step that adds the test; #127246 is the final form.

rampitec (Collaborator, Author) commented Feb 18, 2025

> > Which one do you prefer, this or #127246? They are mutually exclusive.
>
> They're not really. This one is the incremental step which adds the test, #127246 is the final form

The test is meaningless if we overestimate: I have carefully set the margins in the test, and if we go with the full 'align - 1', there is no point in having it.
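
To illustrate the concern, here is a hedged sketch contrasting exact padding with a worst-case bound that adds `Alignment - 1` bytes in front of the aligned block. This is only one reading of 'full align - 1'. It is not taken from #127246, and every name in it is hypothetical. It reuses the test layout: 12 bytes of instructions before the aligned block and 4 bytes in it.

```cpp
#include <cstdint>

// Exact padding needed to bring Offset up to a power-of-two Alignment.
uint64_t exactPadding(uint64_t Offset, uint64_t Alignment) {
  return ((Offset + Alignment - 1) & ~(Alignment - 1)) - Offset;
}

// Worst-case padding when the starting offset is treated as unknown.
uint64_t worstCasePadding(uint64_t Alignment) { return Alignment - 1; }

// Test layout: 12 bytes precede the aligned block, which holds 4 bytes.
uint64_t exactSize(uint64_t Alignment) {
  return 12 + exactPadding(12, Alignment) + 4;
}

uint64_t conservativeSize(uint64_t Alignment) {
  return 12 + worstCasePadding(Alignment) + 4;
}
```

For alignments 4, 8, 16, and 32, `exactSize` gives 16, 20, 20, and 36, while `conservativeSize` gives 19, 23, 31, and 47. With such a bound the estimate no longer tracks the emitted size, so expected values tuned to the exact padding stop being informative.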

rampitec (Collaborator, Author) commented

And in any case, this is moot until the baseline change is accepted.

@rampitec rampitec force-pushed the users/rampitec/02-13-_amdgpu_respect_mbb_alignment_in_the_getfunctioncodesize_ branch from 63e9a99 to b574a4b Compare February 18, 2025 08:45
@rampitec rampitec force-pushed the users/rampitec/02-13-_amdgpu_early_bail_in_getfunctioncodesize_for_meta_inst._nfc branch from c048954 to faf1cf6 Compare February 18, 2025 08:45
Base automatically changed from users/rampitec/02-13-_amdgpu_early_bail_in_getfunctioncodesize_for_meta_inst._nfc to main February 18, 2025 10:08
@rampitec rampitec force-pushed the users/rampitec/02-13-_amdgpu_respect_mbb_alignment_in_the_getfunctioncodesize_ branch from b574a4b to 5f35e2c Compare February 18, 2025 10:35
@rampitec rampitec merged commit 8529bd7 into main Feb 18, 2025
8 checks passed
@rampitec rampitec deleted the users/rampitec/02-13-_amdgpu_respect_mbb_alignment_in_the_getfunctioncodesize_ branch February 18, 2025 21:19
wldfngrs pushed a commit to wldfngrs/llvm-project that referenced this pull request Feb 19, 2025