-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AIE2] Add more memory/ptr combiners #214
Conversation
|
||
unsigned AIE2InstrInfo::getMaxLoadStoreSize() const { return 256; } | ||
|
||
bool AIE2InstrInfo::canCombineWithLoadStore(const MachineInstr &MI) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having some look-ahead in early combiners seems like a good idea indeed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to look in more details, but those look like very useful combiners
Register ValReg = MI.getReg(0); | ||
const LLT ValTy = MRI.getType(ValReg); | ||
const bool IsLoad = isa<GLoad>(MI); | ||
MaxMemSize = TII.getMaxLoadStoreSize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a specific reason we are passing & setting the MaxMemSize
in match & using that in apply ? We could have used TII.getMaxLoadStoreSize();
directly in apply ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be. What I had in mind was a target specific match and a completely target independent apply. It gives the freedom to apply different splitting tragedies for future targets, considering different selection combines opportunities.
const AIEBaseInstrInfo &TII, | ||
unsigned &MaxMemSize) { | ||
|
||
Register ValReg = MI.getReg(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit : const Register
QoR results.
In this point, we are trying to squeeze performance, but as there are no free lunch, some regressions are expected, unfortunately. For |
32080b5
to
160c028
Compare
continue; | ||
|
||
// We can share a residual G_PTR_ADD. | ||
if (Use.getOperand(2).getReg() == PaddSrcOffset) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong, we should compare against OffsetReg
....
160c028
to
e222e1e
Compare
New QoR numbers:
|
const LLT V64S8 = LLT::fixed_vector(64, 8); | ||
|
||
if (Ty == V16S32 || Ty == V32S16 || Ty == V64S8) | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Could this just be return Ty.isVector() && Ty.getSize() == 512;
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason is that it is not a good idea to break accumulators, because we will not combine them. My first trial was in this direction and I saw regressions.
} | ||
|
||
void llvm::applyLoadStoreSplit(GLoadStore &MI, MachineRegisterInfo &MRI, | ||
MachineIRBuilder &B, unsigned &MaxMemSize) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is MaxMemSize
passed by (non-const) reference? Can't this be a copy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can, I changed it!
This looks great! My only potential "concern" is about the non-const references used in combiners |
*Now, we can selectively split memory operations to enhance selection combiner opportunities.
e222e1e
to
13a71e3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, much needed work :)
*Now, we can selectively split memory operations to enhance selection combiner opportunities.
*QoR numbers under preparation.