Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIE2] Fix pathological case where greedy reloads directly from a CR register #211

Merged
merged 2 commits into from
Oct 21, 2024

Conversation

gbossu
Copy link
Collaborator

@gbossu gbossu commented Oct 14, 2024

It turns out greedy does quite a lot of "operation folding" to get more compact spill code.

It will turn

%0:ercr = LDA_spill ...
$crSat = COPY %0

into

$crSat = LDA_spill ...

This is problematic because we cannot directly reload into a $crSat register. I actually expected such folding to be illegal, because after all, $crSat is a reserved register. And usually, reserved registers just mean "don't you dare optimize me". But it's also true that this register is part of eRCR (which it needs to be in for some of our copy-propagation extensions to kick in), so I guess what greedy does is fair game.

I could change const TargetRegisterClass *canFoldCopy(...) to actually never fold into reserved registers, but that causes one X86 test to fail. At this point, I'd rather just constrain eRCR to eR before greedy, because it's anyway what we want: the classes are equivalent in terms of allocatable registers, and eR is easy to spill.

No QoR impact, which is expected, but you never know.

const TargetRegisterInfo &TRI = *MRI.getTargetRegisterInfo();

// Clear previous cache, reserved registers can change for each function
SuitableRCForRA.clear();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can suggest that we don't need to clean this cache always (from the compile time perspective). You can encapsulate this map with a BitVector representing the current set or reserved regs in a standalone class/struct (you can access the reserved registers with BitVector getReservedRegs from TRI).

We can have, in the end something like:

SuitableRCForRA.clearIfDiverges(TRI) -> just clean if we have different sets here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

This is important as greedy feels free to fold a lot of operations into
spill/reload instructions as long as the register class allows it

In our case, we do not want to fold
%0:ercr = LD_reload ...
$crSat = COPY %0

into:
$crSat = LD_reload ...

as this makes generating spill code harder. We would need to bounce
through a new GPR anyway.
@gbossu gbossu force-pushed the gaetan.fix.spill.code branch from 12067ab to 43bce7f Compare October 17, 2024 13:40
Copy link
Collaborator

@andcarminati andcarminati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@gbossu gbossu merged commit b591550 into aie-public Oct 21, 2024
8 checks passed
@gbossu gbossu deleted the gaetan.fix.spill.code branch October 21, 2024 11:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants