Avoid reusing temps whose refs might be captured #76009
base: main
// At the same time, these expression locals might be lifted to the containing block
// (to avoid reusing them if they might be captured by a ref struct).
// Then we use this map to keep track of the redeclared locals.
private Dictionary<LocalDefinition, LocalDefinition>? _redeclaredLocals;
It maps from the old LocalDefinition to the new LocalDefinition (which is redeclaring/shadowing the old one). Does that make sense?
I will try to explain it in the comment. Thanks.
@@ -887,7 +887,7 @@ public BoundCall Call(BoundExpression? receiver, MethodSymbol method, ImmutableA
 return new BoundCall(
     Syntax, receiver, initialBindingReceiverIsSubjectToCloning: ThreeState.Unknown, method, args,
     argumentNamesOpt: default(ImmutableArray<String?>), argumentRefKindsOpt: refKinds, isDelegateCall: false, expanded: false, invokedAsExtensionMethod: false,
-    argsToParamsOpt: ImmutableArray<int>.Empty, defaultArguments: default(BitVector), resultKind: LookupResultKind.Viable, type: method.ReturnType)
+    argsToParamsOpt: default, defaultArguments: default(BitVector), resultKind: LookupResultKind.Viable, type: method.ReturnType)
Because the utility Binder.GetCorrespondingParameter that's used in this PR as part of CodeGenerator.MightEscapeTemporaryRefs would fail on an assert - it expects argsToParamsOpt to be either default or matching the number of arguments:
Debug.Assert(argumentOrdinal < argsToParamsOpt.Length);
Which seems like a reasonable invariant which this callsite was violating.
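To make the invariant concrete, here is a minimal sketch of the difference between a default and an empty ImmutableArray<int>. The helper CorrespondingParameterOrdinal is hypothetical and only stands in for the lookup described above; it is not the Roslyn API, and it throws where the real code asserts.

```csharp
using System;
using System.Collections.Immutable;

// Hypothetical, simplified stand-in for the argument-to-parameter lookup.
static int CorrespondingParameterOrdinal(int argumentOrdinal, ImmutableArray<int> argsToParamsOpt)
{
    // A default (uninitialized) array means "arguments map to parameters positionally".
    if (argsToParamsOpt.IsDefault)
    {
        return argumentOrdinal;
    }

    // Otherwise the map is expected to cover every argument.
    if (argumentOrdinal >= argsToParamsOpt.Length)
    {
        throw new InvalidOperationException("argsToParamsOpt does not cover this argument.");
    }

    return argsToParamsOpt[argumentOrdinal];
}

// Positional mapping: no explicit map was supplied.
Console.WriteLine(CorrespondingParameterOrdinal(1, default));

// Explicit mapping: argument 0 binds to parameter 1.
Console.WriteLine(CorrespondingParameterOrdinal(0, ImmutableArray.Create(1, 0)));

// ImmutableArray<int>.Empty is not default and has Length 0,
// so any argument ordinal violates the invariant.
try
{
    CorrespondingParameterOrdinal(0, ImmutableArray<int>.Empty);
}
catch (InvalidOperationException e)
{
    Console.WriteLine(e.Message);
}
```

Under this reading, passing default rather than ImmutableArray<int>.Empty is what says "no reordering", which is why the call site above was changed.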
@@ -673,6 +673,8 @@ private void EmitBlock(BoundBlock block)
 {
     EmitUninstrumentedBlock(block);
 }

+ReleaseBlockTemps();
> we cannot assume that this block necessarily maps to one syntactically present in source
Why cannot we assume that? To me it seems we don't synthesize BoundBlocks.
> Why cannot we assume that? To me it seems we don't synthesize BoundBlocks.
Because we shouldn't be assuming that. There is nothing wrong in introducing a bound block.
> There is nothing wrong in introducing a bound block.
This comment suggests otherwise:
BoundBlock specify SCOPE (visibility) of a variable.
> This comment suggests otherwise: ...
I am failing to guess what part of the comment you find relevant. Could you quote that part?
For example, SyntheticBoundNodeFactory has a bunch of helpers to synthesize blocks and they are used for one reason or another. There is also a way to synthesize a block manually. So, even if you haven't found an example, it doesn't mean there isn't one already, or that one won't be introduced in the future.
Sorry, I was referring to this line: "BoundBlock specify SCOPE (visibility) of a variable" but that doesn't imply it's wrong to synthesize blocks, you're right.
Do you have suggestions how to best solve this?
I'm thinking if we can assume blocks are never lost during lowering, we could perhaps make sure synthesized blocks are marked as such (e.g., via WasCompilerGenerated; or maybe we could mark user-defined blocks instead) and then we could call ReleaseBlockTemps only for non-compiler-generated blocks.
> Do you have suggestions how to best solve this?
I am leaning towards a simpler and more conservative approach: simply dropping "might escape" temps from _expressionTemps as EmitAssignmentValue does, or not adding them to that list, whichever is more natural from the implementation perspective.
@@ -64,6 +64,23 @@ public override bool Equals(object? obj)
// maps local identities to locals.
private Dictionary<ILocalSymbolInternal, LocalDefinition>? _localMap;

// The lowered tree might define the same local symbol
// in multiple sequences that are part of one expression, for example:
> in multiple sequences that are part of one expression, for example:
This doesn't sound right. I think we should fix this, a local symbol should belong to exactly one scope in the bound tree. I assume this is a pre-existing condition, i.e. it is not introduced by this change. If so, I would prefer the fix to go into a separate PR. #Pending
Sounds good, I will look into that.
Just to confirm: I think this "invalid" tree shape is introduced somewhere in lowering simply by reusing the same node twice, e.g., something like _factory.Call(arguments: [node, node]), and that node happens to contain a BoundSequence which then causes the local to be declared twice in the tree. So are you saying we should never do that (reuse the same node) and if we do it somewhere, it's a bug (hopefully a rare one)?
In general, we can reuse bound nodes as long as that doesn't violate some invariants. For example, we reuse some primitive nodes: for constants, for local references (BoundLocal), etc. I consider "a local symbol used in a tree should belong to exactly one scope in the tree" as one of the invariants that is violated in the specific scenario.
for (var arg = 0; arg < arguments.Length; arg++)
{
    var parameter = Binder.GetCorrespondingParameter(
arguments,
method.Parameters,
call.ArgumentRefKindsOpt,
mightEscapeTemporaryRefs: MightEscapeTemporaryRefs(call, used: useKind != UseKind.Unused, receiverAddressKind: null));
@@ -1754,7 +1767,11 @@ private void EmitInstanceCallExpression(BoundCall call, UseKind useKind)
     }
 }

-emitArgumentsAndCallEpilogue(call, callKind, receiverUseKind);
+emitArgumentsAndCallEpilogue(call, callKind, receiverUseKind,
expression.Arguments,
constructor.Parameters,
expression.ArgumentRefKindsOpt,
mightEscapeTemporaryRefs: MightEscapeTemporaryRefs(expression, used));
@@ -1640,13 +1639,11 @@ private void RewriteArgumentsForComCall(

actualArguments[argIndex] = new BoundSequence(
    argument.Syntax,
    locals: ImmutableArray<LocalSymbol>.Empty,
    sideEffects: ImmutableArray.Create<BoundExpression>(boundAssignmentToTemp),
    locals: [boundTemp.LocalSymbol],
Strictly speaking this is wrong. It looks like we are going to pass a reference to a local that goes out of scope. I know, we probably have done things like that in other places and EmitSequenceAddress compensates for situations like that by not releasing the locals. But that is likely to lead to worse IL. I suggest reverting the change. #Pending
> It looks like we are going to pass a reference to a local that goes out of scope.
True, but even without the change, the temp might not live long enough (see test RefTemp_Escapes_ComCall). Moving the temp into the sequence inside the call ensures the emit layer does not free the temp after it detects the call might escape temps. If the temp is declared outside the call, the emit layer does not see that it is part of a "might escape" call and that freeing it should be avoided.
It turns out this change is what causes the problem with one local being declared twice in one lowered subtree. But I imagine this can be fixed in the callers of this method.
> But that is likely to lead to a worse IL
The mentioned test verifies IL and at least in that scenario it does not look worse.
> It turns out this change is what causes the locals being declared twice in one lowered tree. But I imagine this can be fixed in the callers of this method.
That did not seem like a good approach after all: the caller (VisitCompoundAssignmentOperator) expects all temps to be lifted up so the resulting node can be reused. Changing that would likely require some rewriter that would replace redeclared locals in the reused subtree. I'm going to investigate different approaches.
ImmutableArray<ParameterSymbol> parameters,
ImmutableArray<BoundExpression> arguments,
ImmutableArray<int> argsToParamsOpt,
bool expanded)
The last two parameters should not be relevant post lowering. #Pending
        {
            readonlyRefs++;
        }
    }
        expanded: false);
}

private static bool MightEscapeTemporaryRefs(
Done with review pass (commit 1), tests are not looked at.
Fixes #67435.
The idea is to have a "heuristic" to detect whether a method might capture the references passed to it. If such a method call is detected, and a temporary reference is being emitted, we lift the temp to live for the whole block instead of just the expression.
The block lifetime is enough - ref safety analysis already checks refs to rvalues cannot escape blocks.
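The effect of lifting a temp from expression lifetime to block lifetime can be illustrated with a toy slot allocator. This is only a sketch: the names freeSlots, expressionTemps, blockTemps, AllocateTemp and Release are invented for illustration and do not mirror the compiler's actual data structures.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of local slot reuse; not Roslyn code.
var freeSlots = new Stack<int>();        // slots that may be handed out again
var expressionTemps = new List<int>();   // released when the current expression ends
var blockTemps = new List<int>();        // released only when the enclosing block ends
var nextSlot = 0;

int AllocateTemp(bool mightEscape)
{
    int slot = freeSlots.Count > 0 ? freeSlots.Pop() : nextSlot++;
    // A temp whose ref might be captured is tracked at block level,
    // so later expressions in the same block cannot overwrite it.
    (mightEscape ? blockTemps : expressionTemps).Add(slot);
    return slot;
}

void Release(List<int> temps)
{
    foreach (int slot in temps)
    {
        freeSlots.Push(slot);
    }
    temps.Clear();
}

// One block with three expressions, then a temp allocated after the block.
int first = AllocateTemp(mightEscape: true);
Release(expressionTemps);                  // end of expression 1: slot 0 stays reserved
int second = AllocateTemp(mightEscape: false);
Release(expressionTemps);                  // end of expression 2
int third = AllocateTemp(mightEscape: false);
Release(expressionTemps);                  // end of expression 3
Release(blockTemps);                       // end of block: slot 0 becomes reusable
int afterBlock = AllocateTemp(mightEscape: false);

Console.WriteLine($"{first} {second} {third} {afterBlock}"); // prints "0 1 1 0"
```

In this model the ordinary temps keep sharing slot 1, while the possibly captured temp holds slot 0 until the block ends, which is the IL size trade-off the description mentions.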
The heuristic is implemented by CodeGenerator.MightEscapeTemporaryRefs. It runs on the lowered nodes (because it's the emit layer which decides to emit a temporary). It might have false positives (some calls like M(rvalue, out _) might be marked by the heuristic as dangerous but they are not), but it shouldn't have false negatives.
Without a heuristic, we would need to avoid reusing many more temps, which would be a regression (at least in IL size). But perhaps that's negligible and it would be better to avoid this complexity? I'm not sure.
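As a rough illustration of what a conservative check of this kind could look like, the sketch below flags a call when some argument is a reference to a temp and there is some place a captured reference could end up. The function MightEscapeSketch and both of its conditions are assumptions made to match the description; this is not the logic of CodeGenerator.MightEscapeTemporaryRefs.

```csharp
using System;

// Hypothetical model only. Each argument is described by two flags:
//   RefToTemp      - the argument is an rvalue passed by reference, so a temp is emitted for it
//   RefStructByRef - the parameter is a ref or out parameter of a ref struct type,
//                    i.e. a place where a captured reference could be stored
static bool MightEscapeSketch(
    bool returnsRefStruct,
    bool used,
    (bool RefToTemp, bool RefStructByRef)[] args)
{
    bool anyRefToTemp = false;
    // Assumption: a returned ref struct only matters when the result is used.
    bool anyDestination = returnsRefStruct && used;

    foreach (var arg in args)
    {
        anyRefToTemp |= arg.RefToTemp;
        anyDestination |= arg.RefStructByRef;
    }

    return anyRefToTemp && anyDestination;
}

// A constructor-like call that returns a ref struct and takes an rvalue by reference.
bool capturing = MightEscapeSketch(returnsRefStruct: true, used: true, new[] { (true, false) });

// The same call with its result discarded: nothing can hold on to the temp.
bool discarded = MightEscapeSketch(returnsRefStruct: true, used: false, new[] { (true, false) });

// M(rvalue, out _): flagged although the out value is discarded, i.e. a false positive.
bool outDiscard = MightEscapeSketch(returnsRefStruct: false, used: false, new[] { (true, false), (false, true) });

Console.WriteLine($"{capturing} {discarded} {outDiscard}"); // prints "True False True"
```

The third case shows why such a check errs on the safe side: it cannot tell that the out argument is a discard, so it keeps the temp alive longer than strictly necessary.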