Avoid Blocking Finalizer Thread During Shutdown in SystemEvents #108489

Merged
merged 21 commits into from
Nov 13, 2024

Member

@lonitra lonitra commented Oct 2, 2024

Fixes dotnet/winforms#11944
 
The SystemEvents thread can block the Finalizer thread indefinitely even though it is a background thread. When the main thread kicks off shutdown, the Finalizer thread raises the ProcessExit event, which calls back into SystemEvents.Shutdown. This occurs before AppDomain.IsFinalizingForUnload or Environment.HasShutdownStarted is set to true. As a result, code that needs a response from the main thread cannot tell that the main thread will not respond because it is in the middle of shutdown. SystemEvents.Shutdown calls Thread.Join, which waits for the SystemEvents thread to exit before the method completes, so the Finalizer thread is blocked until the SystemEvents thread finishes. This can cause synchronous callbacks to deadlock. An example of this is seen in dotnet/winforms#11944. The following is a callstack of the deadlock from the issue repro:

Main thread

[0x0]   win32u!ZwUserMsgWaitForMultipleObjectsEx+0x14   0xe08efce7f8   0x7ff9fa56b1ea   
[0x1]   combase!CCliModalLoop::BlockFn+0x12e   0xe08efce800   0x7ff9fa5a9cb5   
[0x2]   combase!ClassicSTAThreadWaitForHandles+0xa5   0xe08efce8a0   0x7ff9fa55c002  
[0x3]   combase!CoWaitForMultipleHandles+0xc2   0xe08efce9c0   0x7ff8e6ca384c  
[0x4]   coreclr!MsgWaitHelper+0x44   0xe08efcea00   0x7ff8e6c2db82   
[0x5]   coreclr!Thread::DoAppropriateAptStateWait+0x137a4e   0xe08efcea40   0x7ff8e6af5ea5   
[0x6]   coreclr!Thread::DoAppropriateWaitWorker+0x171   0xe08efcea80   0x7ff8e6af5ce4   
[0x7]   coreclr!Thread::DoAppropriateWait+0xb0   0xe08efceb50   0x7ff8e6b2b2e5   
[0x8]   coreclr!CLREventBase::WaitEx+0x27   (Inline Function)   (Inline Function)   
[0x9]   coreclr!CLREventBase::Wait+0x27   (Inline Function)   (Inline Function)  
[0xa]   coreclr!FinalizerThread::RaiseShutdownEvents+0x71   0xe08efcebf0   0x7ff8e6b2af7a   
[0xb]   coreclr!EEShutDownHelper+0x3b2   0xe08efcec40   0x7ff8e6b2ab8f   
[0xc]   coreclr!EEShutDown+0x7f   0xe08efcedc0   0x7ff8e6b2a890   
[0xd]   coreclr!CorHost2::UnloadAppDomain2+0x40   0xe08efcee00   0x7ff8e6bbcdc2   
[0xe]   coreclr!coreclr_shutdown_2+0x42   0xe08efcee30   0x7ff9c4eee8e3   
[0xf]   hostpolicy!coreclr_t::shutdown+0x52   (Inline Function)   (Inline Function)   

Finalizer thread:

System.Private.CoreLib.dll!System.Threading.Thread.Join(int millisecondsTimeout)
System.Private.CoreLib.dll!System.Threading.Thread.Join()
Microsoft.Win32.SystemEvents.dll!Microsoft.Win32.SystemEvents.Shutdown() Line 1106
Microsoft.Win32.SystemEvents.dll!Microsoft.Win32.SystemEvents.Shutdown(object sender, System.EventArgs e) Line 1123

SystemEvents thread:

System.Private.CoreLib.dll!System.Threading.WaitHandle.WaitOne(int millisecondsTimeout)
System.Windows.Forms.dll!System.Windows.Forms.Control.WaitForWaitHandle(System.Threading.WaitHandle waitHandle) Line 3649	
System.Windows.Forms.dll!System.Windows.Forms.Control.MarshaledInvoke(System.Windows.Forms.Control caller, System.Delegate method, object[] args, bool synchronous) Line 6625
System.Windows.Forms.dll!System.Windows.Forms.Control.Invoke(System.Delegate method, object[] args) Line 6111
System.Windows.Forms.dll!System.Windows.Forms.WindowsFormsSynchronizationContext.Send(System.Threading.SendOrPostCallback d, object state) Line 86
Microsoft.Win32.SystemEvents.dll!Microsoft.Win32.SystemEvents.SystemEventInvokeInfo.Invoke(bool checkFinalization, object[] args) Line 1317
Microsoft.Win32.SystemEvents.dll!Microsoft.Win32.SystemEvents.RaiseEvent(bool checkFinalization, object key, object[] args) Line 1024
Microsoft.Win32.SystemEvents.dll!Microsoft.Win32.SystemEvents.RaiseEvent(object key, object[] args) Line 990
Microsoft.Win32.SystemEvents.dll!Microsoft.Win32.SystemEvents.OnUserPreferenceChanged(int msg, nint wParam, nint lParam) Line 970 
Microsoft.Win32.SystemEvents.dll!Microsoft.Win32.SystemEvents.WindowProc(nint hWnd, int msg, nint wParam, nint lParam) Line 1170

The SystemEvents thread had deferred to WindowsFormsSynchronizationContext to fire events, which waits for tasks to complete on the main thread (this occurred prior to shutdown). It cannot see that the main thread is now trying to shut down, but the main thread is stuck waiting for the Finalizer thread to finish, which in turn is waiting for the SystemEvents thread to finish.

This deadlock was made easier to run into in the common scenario via #53467. Prior to that change, and in Framework, this type of deadlock was more difficult to hit because we did not spawn a message loop thread for SystemEvents for STA threads. This meant that SystemEvents messages were handled by the same message pump as the main thread, so there was a slimmer chance that a deadlock could occur in this way. We don't want to revert the change above because it fixed a different type of deadlock and made behavior more consistent.

The change here avoids calling Thread.Join during shutdown so that the Finalizer thread is not blocked. Consequently, because the Finalizer thread no longer waits for the SystemEvents thread to finish before shutting down, SystemEvents.EventsThreadShutdown is no longer guaranteed to be raised before the process exits, so this change marks it as obsolete with a recommendation to hook AppDomain.ProcessExit instead. Note that in Framework, SystemEvents.EventsThreadShutdown is never raised for STA threads, because s_windowThread is never created for STA threads and that event is only raised in https://github.com/dotnet/runtime/blob/main/src/libraries/Microsoft.Win32.SystemEvents/src/Microsoft/Win32/SystemEvents.cs#L1276 which only gets called if s_windowThread is created.
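
For callers that currently rely on SystemEvents.EventsThreadShutdown, the recommendation to hook AppDomain.ProcessExit might look roughly like the sketch below. This is an illustration only, not code from this PR: the `ShutdownCleanup` class, its `Register` and `RunCleanup` methods, and the `CleanupCalls` counter are hypothetical names introduced for the example. Only `AppDomain.CurrentDomain.ProcessExit` is the real API being shown.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of this PR.
public static class ShutdownCleanup
{
    // Hypothetical counter, present only so the sketch has an observable effect.
    public static int CleanupCalls;

    // Subscribe cleanup logic to AppDomain.ProcessExit instead of
    // SystemEvents.EventsThreadShutdown, which may no longer be raised
    // before the process exits.
    public static void Register()
    {
        AppDomain.CurrentDomain.ProcessExit += (sender, e) => RunCleanup();
    }

    // Keep this short and non-blocking: it runs while the process is shutting
    // down, so it should not wait on other threads (for example, on the UI thread).
    public static void RunCleanup()
    {
        Interlocked.Increment(ref CleanupCalls);
    }
}
```

Calling `ShutdownCleanup.Register()` once at startup would be enough in this sketch. The handler deliberately avoids synchronous calls back to the main thread, since that is the pattern that produced the deadlock described above.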

Member Author

lonitra commented Oct 2, 2024

cc: @ericstj , @JeremyKuhne , @stephentoub

{
public abstract class ShutdownTest : SystemEventsTest

@lonitra lonitra Oct 2, 2024

I'm not sure why this was originally marked as abstract, but it seems that was causing the tests in this class not to run.

RemoteExecutor.Invoke(() =>
{
// Block the SystemEvents thread. Regression test for https://github.com/dotnet/winforms/issues/11944
SystemEvents.UserPreferenceChanged += (o, e) => { while (true) { } };
Member Author

This is an artificial repro to make sure shutdown can happen despite a hang occurring. We will include a regression test of the issue scenario in the winforms repo.

public static bool EnableLegacySystemEventsShutdownThreadJoin
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
get => GetCachedSwitchValue("Switch.SystemEvents.EnableLegacySystemEventsShutdownThreadJoin", ref s_enableLegacySystemEventsShutdownThreadJoin);
Member

Do you expect that this fix is likely going to break applications out there?

If we are going to introduce a config switch for this, are we also going to file a breaking change notice for this that explains when people should use this config switch?

Member Author

While I think it is unlikely, I suppose that there could be an app out there that absolutely needs to know when the SystemEvents thread has shut down (though given how the code was in Framework, there wouldn't be a good way to know that for STA threads, which makes the chances even slimmer). I'm not sure what scenario that would be, and it would be nice to know the specifics of the scenario if such cases exist. We could introduce the switch later if we get a report. That way we'd also have a better understanding of the scenario.

Member

Yes, I think introducing config switches reactively, only once there is an actual compat problem, is a better strategy. It keeps the code simple.

Member

I'm OK without the compat switch for 10.0 - but if we backport to servicing I think we might need it.

We will have a breaking change with this - we discussed that we can no longer guarantee that EventsThreadShutdown will be raised.

@lonitra lonitra added the breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. label Oct 7, 2024
@dotnet-policy-service dotnet-policy-service bot added the needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet label Oct 7, 2024
Member

jkotas commented Oct 7, 2024

LGTM. @dotnet/area-microsoft-win32 Could you please take a look and merge if this change looks fine?

Co-authored-by: Jan Kotas
@danmoseley
Member

High quality PR description BTW 🙂

@ViktorHofer
Member

Just double checking, should this API get obsoleted on all TFMs or just on net8.0, net9.0 or net10.0?

Member Author

lonitra commented Nov 6, 2024

In this change I've removed shutdown handling, and my understanding is that the change will affect all TFMs. Is that correct? If so, I think this needs to be obsolete for all of them.

// is STA, as there are no guarantees this thread will pump nor still be alive
// for the desired duration.
return;
}
Member

Are these just stylistic changes?


@lonitra lonitra Nov 8, 2024

Yes, this is just a style change. I inverted a couple of if statements in this method and changed ==/!= to use is/is not.

Member

ViktorHofer commented Nov 8, 2024

@lonitra sorry for the delay in responding. So the suppression file update is fine to take, I just double checked (and pushed a commit to your branch). It means that the current package uses a different ObsoleteAttribute type for netstandard2.0 than the previous 9.0.0-P7 package. This is expected and is fine.

Co-authored-by: Stephen Toub
@lonitra lonitra merged commit 6d3f9b5 into main Nov 13, 2024
84 checks passed
@lonitra lonitra deleted the finalizerhang branch November 13, 2024 20:17
Member Author

lonitra commented Nov 13, 2024

Breaking change issue: dotnet/docs#43563

Labels
area-Microsoft.Win32 breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet
Development

Successfully merging this pull request may close these issues.

Deadlock with SystemEvents when application is shutdown
8 participants