Massive GDI (region) leak. Help needed. #11334
Comments
@kirsan31 I'm very interested in looking more deeply at this. Unfortunately, I'm tied up for a number of weeks on critical work. When the other work is done, I can try to see what I might be able to find out. Also, have you been able to repro the same thing with .NET 9?
I can’t reproduce this at all (no matter how hard I try). This only happens on a work machine and only during work. Moreover, the leaks have never started immediately after launching the application, only after a few days. Therefore, my ability to experiment there is very limited and I cannot use .Net9 :( @weltkante sorry to bother you, but maybe you have some ideas?
By the way, I have a question on this topic that no one has answered yet.
No problem, unfortunately this is nothing I've come across in the past. So far I've always been able to rely on managed leaks and memory snapshots/dumps to compare, or being able to reproduce the problem locally and do a time travel debug trace for inspecting the unmanaged leaks. Seems like neither is an option for you. If I had to diagnose this issue I'd probably try to isolate what affects it:
Oh, and make sure the finalizer thread is not stuck on something (look at a few dumps in a debugger after leaks started and check that the thread is idle or at least differs between dumps). Depending on your tooling finalizable objects may not show up as leaks in your managed analysis, but if the finalizer thread is hanging and can't finalize things anymore that may end up this way.
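For anyone who cannot attach a debugger on the affected machine, the same check can be approximated from inside the process. This is only a minimal sketch, not something from WinForms or from the app discussed here: the class and method names (FinalizerProbe, FinalizerThreadResponds) are made up for illustration. It relies on GC.WaitForPendingFinalizers blocking until the finalizer thread has drained its queue, so if that call does not return within a timeout, the finalizer thread is probably stuck.

```csharp
using System;
using System.Threading;

// Hypothetical diagnostic helper; names are illustrative only.
public static class FinalizerProbe
{
    // Returns true if the finalizer thread drained its queue within the timeout.
    // A false result suggests the finalizer thread is blocked in some finalizer.
    public static bool FinalizerThreadResponds(TimeSpan timeout)
    {
        // Use a dedicated background thread: if the finalizer thread really is
        // stuck, this thread blocks forever and must not keep the process alive.
        var probe = new Thread(() => GC.WaitForPendingFinalizers())
        {
            IsBackground = true,
            Name = "FinalizerProbe"
        };
        probe.Start();

        // Join returns false if the probe thread is still waiting after the timeout.
        return probe.Join(timeout);
    }
}
```

Calling it occasionally from a timer (for example with a 30 second timeout) and logging a false result would give a hint without needing dumps. Each failed probe leaves one blocked background thread behind, so it should not be called in a tight loop.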
Nice case, just checked - everything is ok here. But the probability was extremely low because... A work PC has very high restrictions on installed software and Internet use.
Due to the specifics, there is no way for us to configure either a second physical machine or a virtual one. And users won’t approve of this, it’s easier for them to restart the application every few days :)
There's nothing wrong with that. Because other objects are finalized normally, and most regions too. Also, in dumps taken after GC, the list of objects ready for finalization is empty and there is nothing extraordinary among the dead objects. Thank you for your attention anyway 🙏
A small update on what I found out.
Just as a side note to avoid other people reading this drawing the wrong conclusions: SetWindowPos can trigger a lot of messages, including callbacks into managed code, so it's unlikely to be the direct cause of the leak.
Sounds weird, all WinForms controls should, at the very least, go through the managed message handler of the control. And SetWindowPos can (and usually does) trigger resize and redraw logic, both of which can have managed event handlers that need to be dispatched too, even if they are empty. Anyways, just meant to say that seeing this method as the call root doesn't mean the problem is guaranteed to be in the native code.
@kirsan31 Do you think you could provide a consistent repro for us to investigate this?
I hope so... Currently the issue still exists (very sporadically) and I can't find the root cause :(
@kirsan31 as soon as we get actionable stuff here we can assign it to whatever the current release is.
Once again we were able to directly catch leaks and use the performance HUD. What we found out:
This time I copied (in several passes) all the stacks for two applications from the performance HUD (if necessary, I will provide them all). But they don’t give anything new: all the stacks are some kind of drawing of a menu/tooltip, etc., which always end with
leaked.mp4
From all this, and the fact that such behavior was not observed before .Net7, I have only two possible assumptions: either this is somehow a Windows bug/corruption (which appeared with some system update) or, after all, a regression in .Net. Does anyone have any other ideas or tips?
P.S. Why do all messages go through office component
That's just an interface for Office/VisualStudio compatibility; the naming is historical. WinForms uses its own implementation if no Office/VS host is detected that provides the interface.
Note that this will now be turned off by default (.NET 9). It can be turned back on with the "Switch.System.Windows.Forms.EnableMsoComponentManager" switch. Even with the stub it was a fair amount of overhead for message processing. As we had to rewrite all of our COM for ComWrappers we took the opportunity to simplify the message loop.
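For reference, one way to set such a switch from code is AppContext.SetSwitch. The sketch below only uses the switch name quoted above; the wrapper class MessageLoopCompat and its members are hypothetical, and it assumes the switch has to be set before the message loop starts (i.e. before Application.Run), which is not confirmed in this thread.

```csharp
using System;

// Hypothetical wrapper; only the switch name comes from this thread.
public static class MessageLoopCompat
{
    public const string SwitchName =
        "Switch.System.Windows.Forms.EnableMsoComponentManager";

    // Assumption: call this at the very start of Main, before Application.Run.
    public static void EnableMsoComponentManager()
        => AppContext.SetSwitch(SwitchName, true);

    // Reads the switch back; returns false if it was never set.
    public static bool IsEnabled()
        => AppContext.TryGetSwitch(SwitchName, out bool enabled) && enabled;
}
```

Reading the value back with IsEnabled is only a sanity check that the switch was recorded in the AppContext; it does not prove that WinForms has picked it up.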
.NET version
.Net7 and .Net8
Did it work in .NET Framework?
Yes
Did it work in any of the earlier releases of .NET Core or .NET 5+?
We didn't see these problems before .Net7.
Issue description
For several months we have been trying to investigate a huge GDI (region) leak in our app. This leak is critical because it can reach the GDI limit (10k) in one day.
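For readers who want to watch how close a process is to that limit, here is a rough sketch using the Win32 GetGuiResources function via P/Invoke. The class and method names (GdiHandleMonitor, CurrentGdiObjectCount, IsNearLimit) and the 80% warning threshold are assumptions made for illustration. Note that this counts all GDI objects of the process, not only regions, and works on Windows only.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

// Hypothetical monitoring helper; names and threshold are illustrative only.
public static class GdiHandleMonitor
{
    private const uint GR_GDIOBJECTS = 0; // flag value documented for GetGuiResources

    // Win32: DWORD GetGuiResources(HANDLE hProcess, DWORD uiFlags)
    [DllImport("user32.dll", SetLastError = true)]
    private static extern uint GetGuiResources(IntPtr hProcess, uint uiFlags);

    // GDI objects (all kinds, not only regions) currently owned by this process.
    // Returns 0 if the call fails.
    public static uint CurrentGdiObjectCount()
    {
        using Process self = Process.GetCurrentProcess();
        return GetGuiResources(self.Handle, GR_GDIOBJECTS);
    }

    // Pure helper: true once the count reaches the given fraction of the limit.
    // The 10k default mirrors the limit mentioned in this issue.
    public static bool IsNearLimit(uint count, uint limit = 10_000, double fraction = 0.8)
        => count >= limit * fraction;
}
```

Logging CurrentGdiObjectCount() from a timer every few minutes would at least show when the growth starts, which may help correlate it with user actions.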
*.hudinsight
) does not show the names of the methods when viewed (maybe someone knows how to overcome this?) :( And even the stacks copied manually as text turned out to be cropped due to their large size. Therefore, I will present here what I managed to get (sorry for this).
479 leaked regions due to redrawing (almost all are system calls):
1.csv
84 leaked regions due to closing all open child MDI forms. Closing all windows is done through the menu; when it is opened, the child menu is filled with the open forms (15 in our case). A big part of it is leaking in MenuStrip -> Control.SetBoundsCore -> SetWindowPos. Call stacks ToolStripDropDownItem.OnDropDownOpened -> EtwWriteTransfer and ToolStripDropDownItem.OnDropDownClosed -> EtwWriteTransfer are full.
2.csv
The logic in tsmiWidow_DropDownOpened is to populate the child DropDownItems with 15 (in this situation) items.
The logic in tsmiWidow_DropDownClosed is to clear all items previously added:
There are no managed leaks, all objects were properly deleted (this is not always 100% true, as I explain below).
It is very strange that the leak occurs both when adding and removing elements. In 99% of cases everything works completely correctly.
While researching I found a small managed leak here:
This small managed leak is reproducible and can't lead to such catastrophic consequences. It can easily be fixed with WeakReference here (I will open a PR later):
winforms/src/System.Windows.Forms/src/System/Windows/Forms/Input/MouseHoverTimer.cs
Lines 10 to 11 in 7504692
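To illustrate the idea (this is not the actual MouseHoverTimer source and not the PR mentioned above), a holder that keeps only a weak reference to the "current item" could look roughly like the sketch below. HoverTracker and its members are hypothetical names.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for a timer helper that remembers the hovered item.
// Holding the item through WeakReference<T> means this helper alone
// does not keep the item (and whatever it references) alive.
public sealed class HoverTracker<T> where T : class
{
    private WeakReference<T>? _currentItem;

    // Remember the item without taking a strong reference to it.
    public void Start(T item) => _currentItem = new WeakReference<T>(item);

    // Forget the item entirely.
    public void Cancel() => _currentItem = null;

    // Returns true and the item if one is tracked and has not been collected.
    public bool TryGetCurrent(out T? item)
    {
        item = null;
        return _currentItem is not null && _currentItem.TryGetTarget(out item);
    }
}
```

The trade-off is that every consumer has to handle the case where the target is already gone, which is why TryGetCurrent returns a bool instead of the item directly.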
all.csv
OS: Windows 10 Pro for Workstations 22H2.
In conclusion, it seems to me that the problem is somewhere in WinForms, or even in the OS. Any assistance in further investigation is greatly appreciated. 🙏
Steps to reproduce
--