Possible memory leak from `NetCDFOutputWriter` (#3777)
Comments
Thanks for the x-ref with Alexander-Barth/NCDatasets.jl#266; let's keep an eye on how it develops over there.
As @Yixiao-Zhang found, there is already a pure-C reproducer for this issue (Unidata/netcdf-c#2626). As it is specific to the NetCDF-4 format, the leak might also be in libhdf5 (Unidata/netcdf-c#2626 (comment)). I am wondering if ...
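For Julia users, a rough analogue of that reproducer (a sketch only, not the C code from the linked issue; it assumes a NetCDF-4 file `test.nc` already exists) is to repeatedly open and close the file while watching resident memory:

```julia
using NCDatasets

# Repeatedly open and close an existing NetCDF-4 file. With the leak in
# netcdf-c, Sys.maxrss() keeps growing even though nothing stays referenced
# on the Julia side.
for i in 1:100_000
    close(NCDataset("test.nc"))
    i % 10_000 == 0 && println("maxrss = ", Sys.maxrss() ÷ 2^20, " MiB")
end
```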
It seems to me that a quick fix is to downgrade ...
I wish I had left more info in PR #1229, haha, but I think keeping the file open between writes was actually an oversight. My thinking was that you don't want to leave the file open when you don't explicitly need it to be, in case the simulation or Julia crashes: in some cases that could corrupt the file, or leave it unreadable by other processes until the Julia process exits. NetCDF files and NCDatasets.jl may not suffer from these issues, but I was probably doing it to be safe. If downgrading ...
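For reference, the close-between-writes pattern described above looks roughly like this with NCDatasets.jl (a sketch only; `write_snapshot!`, the variable names, and the file layout are made up for illustration):

```julia
using NCDatasets

# Reopen the dataset for each write and close it immediately afterwards, so
# a crash between writes cannot leave the file open or half-written.
function write_snapshot!(filename, u, n, t)
    NCDataset(filename, "a") do ds    # "a" appends to an existing file
        ds["time"][n] = t             # record the n-th output time
        ds["u"][:, :, :, n] = u       # write the n-th snapshot of u
    end                               # closed (and flushed) here, even on error
end
```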
Yes, I think that this is probably due to memory fragmentation (caused by the memory leak in netcdf-c). Julia needs a contiguous block of memory for each of its arrays, and even if netcdf-c leaks just a couple of bytes at a time, such large contiguous blocks become more and more difficult to find. One workaround is to pin `NetCDF_jll` in the project's Project.toml:
```toml
[deps]
NetCDF_jll = "7243133f-43d8-5620-bbf4-c2c921802cf3"

[compat]
NetCDF_jll = "<400.900"
```
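On Julia 1.8 and later, the same bound can also be added from the REPL (assuming `NetCDF_jll` is made a direct dependency of the active project):

```julia
using Pkg
Pkg.add("NetCDF_jll")                 # make NetCDF_jll an explicit dependency
Pkg.compat("NetCDF_jll", "<400.900")  # restrict it to versions below 400.900
Pkg.resolve()                         # re-resolve the manifest under the new bound
```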
This kind of fix has indeed been discouraged in the past... Perhaps we need a new patch for NCDatasets that we can add compat for?
Original issue description:

Recently, many of my simulations that run on clusters have crashed due to out-of-memory errors. I find that `NetCDFOutputWriter` seems to cause a memory leak, which can be reproduced by the code below.
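A minimal sketch of this kind of setup (the grid, filename, and intervals below are placeholder choices, not the original script):

```julia
using Oceananigans
using NCDatasets  # NetCDF output may require NCDatasets to be loaded explicitly

grid = RectilinearGrid(size=(64, 64, 64), extent=(1, 1, 1))
model = NonhydrostaticModel(; grid)
simulation = Simulation(model, Δt=1, stop_iteration=100_000)

# Write the velocity fields to NetCDF every 10 iterations.
simulation.output_writers[:fields] = NetCDFOutputWriter(
    model, model.velocities;
    filename = "leak_test.nc",
    schedule = IterationInterval(10),
    overwrite_existing = true)

# Report resident memory every 1000 iterations; it grows roughly in step
# with the amount of data written.
simulation.callbacks[:memory] =
    Callback(sim -> @info("maxrss = $(Sys.maxrss() / 2^20) MiB"),
             IterationInterval(1000))

run!(simulation)
```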
The total memory usage reported by `Sys.maxrss` keeps increasing over time, at a rate roughly equal to the size of the output data. Forcing `GC.gc()` slows the trend but cannot stop the increase. I believe it is a bug in `NCDatasets`; see Alexander-Barth/NCDatasets.jl#266. The version of `NCDatasets` is 0.14.5 in my case.