HDF5 Closing: uninitialised byte(s) #28

Closed
ax3l opened this issue Feb 5, 2018 · 4 comments

@ax3l
Member

ax3l commented Feb 5, 2018

Running valgrind on SerialIOTests reveals uninitialised bytes in the HDF5 close.

Maybe check what's given into HDF5IOHandlerImpl::~HDF5IOHandlerImpl() in file HDF5IOHandler.cpp:69?

Note: I ran the binary without the downloaded sample files.

valgrind bin/SerialIOTests

# [...]
==14254== Syscall param write(buf) points to uninitialised byte(s)
==14254==    at 0x6E75190: __write_nocancel (syscall-template.S:84)
==14254==    by 0x5B2DA35: ??? (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B277CB: H5FD_write (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B0E94F: H5F__accum_flush (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B0A15B: H5F_flush (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B0A82A: H5F_dest (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B0AF2C: H5F_try_close (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B0B180: H5F_close (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B83BEE: H5I_dec_ref (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B83D7B: H5I_dec_app_ref (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B05898: H5Fclose (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5D526A: HDF5IOHandlerImpl::~HDF5IOHandlerImpl() (HDF5IOHandler.cpp:69)
==14254==  Address 0x8bb3d4a is 2,490 bytes inside a block of size 4,104 alloc'd
==14254==    at 0x4C2BBAF: malloc (vg_replace_malloc.c:299)
==14254==    by 0x5B3170C: H5FL_blk_malloc (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B322F3: H5FL_blk_realloc (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B0DAB4: ??? (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B0F35F: H5F__accum_write (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B12509: H5F_block_write (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5AADDE7: H5C__flush_single_entry (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5AAEF24: H5C_flush_cache (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5A86F1C: H5AC_flush (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B0A100: H5F_flush (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B0A82A: H5F_dest (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==    by 0x5B0AF2C: H5F_try_close (in /usr/lib/x86_64-linux-gnu/libhdf5_openmpi.so.100.0.1)
==14254==
# [...]
@C0nsultant
Member

Without looking very deep into this yet:
The arguments passed to H5Fclose are all open hid_t file handles of files created or opened by the HDF5 backend (lines 183 and 415, respectively).

At first glance I don't see what could be causing memory issues. If a file doesn't exist on disk, H5Fopen will never complete, meaning no corresponding handle will even be created. One scenario I have not considered yet is an already open handle being opened again without closing the first.

@ax3l
Member Author

ax3l commented May 2, 2018

Maybe related to #143? Will verify again once that is implemented.

@ax3l
Member Author

ax3l commented May 19, 2018

All examples look clean in valgrind, but

  • 4_read_parallel (H5Fopen)
  • SerialIOTests (H5Fclose, H5F_try_close)

show the uninitialised-bytes warnings. Not sure whether we forget to close some HDF5 handles in those or whether it's an HDF5-internal issue.

In SerialIOTests the issue is caused by hdf5_dtype_test:

valgrind ./SerialIOTests hdf5_dtype_test

in lines

long double l = 1.e80L;
s.setAttribute("longdouble", l);
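One plausible explanation, not confirmed in this thread: on x86-64 Linux a long double occupies 16 bytes, but the x87 extended format only fills 10 of them, so the remaining padding bytes can stay undefined when the raw object is written to a file. The sketch below makes that visible. The helper names (longDoubleValueBytes, longDoublePaddingBytes, zeroPaddedBytes) are hypothetical and not part of openPMD-api or HDF5, and it assumes a little-endian layout where the value bytes come first.

```cpp
#include <array>
#include <cfloat>
#include <cstddef>
#include <cstring>

// Bytes of a long double that carry value bits.
// Assumption: LDBL_MANT_DIG == 64 identifies the x87 80-bit extended format
// (1 sign + 15 exponent + 64 significand bits = 10 bytes). Any other format
// is treated here as having no padding.
constexpr std::size_t longDoubleValueBytes()
{
    return (LDBL_MANT_DIG == 64) ? 10 : sizeof(long double);
}

// Bytes of the object representation that are pure padding.
constexpr std::size_t longDoublePaddingBytes()
{
    return sizeof(long double) - longDoubleValueBytes();
}

// Hypothetical helper: copy only the value bytes of v into a zeroed buffer,
// so every byte of the result is initialised.
// Assumption: little-endian layout, value bytes at the lowest addresses.
std::array<unsigned char, sizeof(long double)> zeroPaddedBytes(long double v)
{
    std::array<unsigned char, sizeof(long double)> out{}; // all bytes zero
    std::memcpy(out.data(), &v, longDoubleValueBytes());
    return out;
}
```

Calling zeroPaddedBytes(1.e80L) gives a buffer in which the trailing longDoublePaddingBytes() bytes are zero rather than undefined. Whether the padding is what valgrind flags in hdf5_dtype_test would still need to be checked against the HDF5 write path.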

@ax3l
Member Author

ax3l commented Apr 23, 2021

This should be fixed by now, since we have been running UBSAN and ASAN for a while.

Also, valgrind and long double is a general issue: ornladios/ADIOS#184 (comment)

Another recently fixed issue in HDF5 types: #962

ax3l closed this as completed Apr 23, 2021