-
Notifications
You must be signed in to change notification settings - Fork 671
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Race conditions in MirrorSample when processing is multithreaded #1072
Comments
By File handle we mean Regarding this bug, we are seeing 0017 being cleanup with Could you try to change and rebuild those lines https://github.com/dokan-dev/dokany/blob/master/dokan/dokan.c#L883-L884 by
My guess is Close is called by |
I tried that, it doesn't fix it unfortunately. |
Can you add a print of the value being exchanged in the code I provided and share new logs to see if it is called before ? |
Dokan_concurrency_bug_log2.txt Uploaded a new log file.
and
Hope that helps! |
Dokan_concurrency_bug_log2_driver.txt Added another log with driver logs enabled. |
Thanks for the logs @zedmilner this is very helpful. Can you retry with 1ffacd4 ? |
Thanks for the quick reply! I still get the same crash with your latest CL. I tried it with your changes only, and after confirming I still get the same crash, I applied the same log changes I did before, and here is a new log: Dokan_concurrency_bug_log3.txt I couldn't get the crash with driver logs enabled this time, although I didn't try for too long. These are the last few events. My limited understanding is: we first call
Also, not sure if this is relevant but we execute the Thanks very much! |
All eventID have their own UserContext that is set during CreateFile. There is no Close / Cleanup at the same time for the same context here.
The problem is Windows is canceling / not waiting for GetFileInfo in this case and force the resources to be released by calling Cleanup and Close. I don't think the library can really do much here but here is some solutions:
Please let me know what you think and if anyone is reading this and has some push back to implement point 2, please speak up! |
Dokan_concurrency_bug_log3b.txt Thanks again for the very quick iteration on this. I uploaded a version of the previous log where I inserted a line number for each line, to make it easier to see what is where. The last instance of the first line in your extract
is at line 2041.
The last instance of the line after your comment
is at line 900
and the line after
is at line 906
And the line after
is at line 3574.
It is not easy to see from the log, but the full log is the result of me manually running fbuild.exe from the command line multiple times until I get the crash, eg:
If I search for eventID = 0012, These are the lines in the log:
All Cleanup events: eventID = 12 is at line 895
All Close file handle events. Notice that eventID = 0012 is at line 3574, a lot later than its Cleanup call.
All references to eventID = 0012
It looks unintuitive why GetFileInfo are called twice (ln 2043 and 3069) way after Cleanup (ln 895). eventID = 0041 looks simliar to eventID = 0012. It has the same series of calls to GetFileInfo and GetFileSecurity. Note that we only have Read file handle calls for eventID = 0012, my theory is the windows cache only reads the contents of fbuild.exe once, and the subsequent runs it only checks the file info but serves the contents from the cache. Sorry if I'm clearly on a completely wrong path here, I don't really have any experience with Dokan or file system drivers at all. Just trying to understand what the log shows :) Thanks |
What you are describing is out of Dokan control. They are external (Kernel/Apps) requests and we cannot reorder them. We could return an error but that might have another impact. That said, my comment last points #1072 (comment) are still valid. |
I have a multiprocess/multithreaded application running on code adapted from the mirror sample (remote FS with local cache), and I think there's a race condition in for example |
Closing this as the original question was answered here #1072 (comment) |
Feature request can skip this form. Bug report must complete it.
Check List
must be 100% match or it will be automatically closed without further discussion. Please remove this line.Environment
Check List
Description
Repro steps:
Logs
See the last few lines of the attached log.
Dokan_concurrency_bug_log1.txt
Note at Close file handle it shows eventID = 0017 which is a very old event, that was used a lot before, during my first (successful) run of fbuild.exe. Seeing that eventID again here seems wrong.
Also note I made a few changes to the logging code to actually log the correct file handle. This is not the cause of the above error, it happens with or without logging eg:
Please let me know what else I can help.
Thanks,
Zed
The text was updated successfully, but these errors were encountered: