Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] AggExprEmitter::VisitArrayInitLoopExpr implementation #1055

Open
smeenai opened this issue Nov 5, 2024 · 4 comments
Open

[RFC] AggExprEmitter::VisitArrayInitLoopExpr implementation #1055

smeenai opened this issue Nov 5, 2024 · 4 comments

Comments

@smeenai
Copy link
Collaborator

smeenai commented Nov 5, 2024

I'm seeking implementation advice for the CIRGen implementation of AggExprEmitter::VisitArrayInitLoopExpr. The ArrayInitLoopExpr docs explain the context; I'm specifically concerned with the copy constructor use case (e.g. https://godbolt.org/z/66anjPxa5 shows the initialization loop emitted by CodeGen).

For starters, CodeGen has logic to convert such initializations to a memcpy where applicable. ClangIR generally avoids this though (e.g. https://godbolt.org/z/W9n11czTc), so I assume that's the case here too. We may want to flesh out ConstructorMemcpyizer to still emit memcpy for non-record fields, but that's unrelated to this particular issue.

The CodeGen implementation of VisitArrayInitLoopExpr emits a loop to initialize each element. I'd started implementing it similarly in CIRGen, and then came across the cir.array.ctor op, which serves a pretty similar purpose. My questions are:

  • Does it make sense to use cir.array.ctor to implement VisitArrayInitLoopExpr? (The rest of this issue can be ignored if the answer is no.)
  • cir.array.ctor currently expects each member to be initialized by calling a constructor. Should we (a) change it to allow arbitrary initialization so that we can also use it for arrays of non-record types, (b) leave the op alone and emit a loop inside VisitArrayInitLoopExpr for non-record types, or (c) match CodeGen and emit memcpy for non-record types, so that VisitArrayInitLoopExpr is never entered for them? My inclination is either (a) or (c); (b) seems like a weird middle ground that's a bunch of extra work for no gain.
  • How should cleanups be handled? The one current user of cir.array.ctor bails out if there's cleanups. I'm thinking the cleanups should be handled during the lowering of the cir.array.ctor op itself instead of at the op creation site, but I haven't thought that through fully.
  • The op doesn't handle multi-dimensional arrays yet (https://godbolt.org/z/3zh3b1fso), which would be something else required for completeness.
@smeenai
Copy link
Collaborator Author

smeenai commented Nov 5, 2024

Ah, one thing I didn't consider is that cir.array.ctor currently just handles the same constructor expression being used for all elements, whereas here I need a loop where each destination element is copied from the corresponding source element. I'm wondering if I should extend cir.array.ctor for that case or introduce a new high-level op (or maybe even just generate a loop op and let lowering handle it).

@bcardosolopes
Copy link
Member

For starters, CodeGen has logic to convert such initializations to a memcpy where applicable. ClangIR generally avoids this though (e.g. https://godbolt.org/z/W9n11czTc), so I assume that's the case here too.

CIR needs a better representation here, it's going to be hard to reconstruct the memcpy from individual initializations.

We may want to flesh out ConstructorMemcpyizer to still emit memcpy for non-record fields, but that's unrelated to this particular issue.

Yes, likely a higher level op for that (as mentioned above).

The CodeGen implementation of VisitArrayInitLoopExpr emits a loop to initialize each element. I'd started implementing it similarly in CIRGen, and then came across the cir.array.ctor op, which serves a pretty similar purpose.

Note that we already have code that does that (generate CIR loops for init), see example in clang/test/CIR/CodeGen/array-init.c (and probably elsewhere too). My take is that we should raise all of these ops because it's going to be insane to reconstruct (and against the whole point of CIR too) - these type of the stuff I'd argue we should just do now, it's pretty clear improvement on reasoning about IR constructs.

  • Does it make sense to use cir.array.ctor to implement VisitArrayInitLoopExpr? (The rest of this issue can be ignored if the answer is no.)

I think so, you might need to tweak it a bit or even create another op, but +1 on the overall approach.

  • cir.array.ctor currently expects each member to be initialized by calling a constructor. Should we (a) change it to allow arbitrary initialization so that we can also use it for arrays of non-record types, (b) leave the op alone and emit a loop inside VisitArrayInitLoopExpr for non-record types, or (c) match CodeGen and emit memcpy for non-record types, so that VisitArrayInitLoopExpr is never entered for them? My inclination is either (a) or (c); (b) seems like a weird middle ground that's a bunch of extra work for no gain.

I like (a) best, but if you decide to experiment with (c), I would suggest emitting a cir.store we can later (loweringprepare) lower to memcpy if possible.

  • How should cleanups be handled? The one current user of cir.array.ctor bails out if there's cleanups. I'm thinking the cleanups should be handled during the lowering of the cir.array.ctor op itself instead of at the op creation site, but I haven't thought that through fully.

We can mark the op with a unitattr for cleanup to indicate extra handling, but probably best to start without cleanups (while adding proper asserts for the cleanup) and build from there so you can see the feasibility. It's possible that for the cleanup scenario we can also start with the version that generates a loop if that's easier to get correctness done first.

Nice catch, that can probably come as incremental work tho?

@bcardosolopes
Copy link
Member

I'm wondering if I should extend cir.array.ctor for that case or introduce a new high-level op (or maybe even just generate a loop op and let lowering handle it).

If you can extend/rename it to cover more cases, sounds good to me. If after trying that it feels odd, probably a good indication for a new op. Maybe cir.array.ctor should just be cir.array.init and ctor flavor is an optional setup.

@bcardosolopes
Copy link
Member

One extra thing to keep in mind: think about the use cases from clang/test/CIR/CodeGen/array-init.c when designing this, we want to use the same mechanism over there.

smeenai added a commit to smeenai/clangir that referenced this issue Nov 26, 2024
If a record type contains an array of non-record types, we can generate
a copy for it inside the copy constructor, as CodeGen does. CodeGen does
so for arrays of record types where applicable as well, but we'll want
to represent the construction of those explicitly, as outlined in
llvm#1055.
bcardosolopes pushed a commit that referenced this issue Nov 26, 2024
If a record type contains an array of non-record types, we can generate
a copy for it inside the copy constructor, as CodeGen does. CodeGen does
so for arrays of record types where applicable as well, but we'll want
to represent the construction of those explicitly, as outlined in
#1055.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants