Current HPX master causes segfaults within Octo-Tiger #6414

Closed
G-071 opened this issue Jan 12, 2024 · 1 comment · Fixed by #6415

G-071 commented Jan 12, 2024

Octo-Tiger currently does not work with the current HPX master. We get a segfault, usually near the start of the application (for instance while loading the options or during the first timestep; it varies). Note that this does not happen on every run: a few runs execute normally.

This does not seem to be a race condition, as I am able to reproduce it with --hpx:threads=1.

I did some digging into when this problem started to occur. HPX 1.9.1 still works fine, of course! The problem seems to start occurring after #6050 was merged: I can reproduce it with 962288c, while the commit just before that (8edd9b3) still works without any issues.

To reproduce:

  • I used a basic Octo-Tiger (master) build without CUDA / Kokkos / Networking on my laptop
  • GCC/12
  • Scenario:
    ./build/octotiger/build/octotiger --hpx:threads=1 --config_file=src/octotiger/test_problems/rotating_star/rotating_star.ini --unigrid=0 --cuda_number_gpus=1 --max_kernels_fused=1 --stop_step=10 --max_gpu_executor_queue_length=5 --max_level=3 --correct_am_hydro=0 --monopole_host_kernel_type=LEGACY --multipole_host_kernel_type=LEGACY --theta=0.5 --monopole_device_kernel_type=OFF --multipole_device_kernel_type=OFF --hydro_device_kernel_type=OFF --hydro_host_kernel_type=LEGACY --amr_boundary_kernel_type=AMR_OPTIMIZED --disable_output=1

The backtrace of the segfault is:

#0  0x00007ffff71fc801 in hpx::util::extra_data_node::get<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (this=0x8) at /home/daissgr/workshop/KokkosTiger/src/hpx/libs/core/type_support/include/hpx/type_support/extra_data.hpp:130
#1  hpx::util::extra_data::try_get<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (this=0x7fffd000f8b0) at /home/daissgr/workshop/KokkosTiger/src/hpx/libs/core/type_support/include/hpx/type_support/extra_data.hpp:164
#2  hpx::lcos::detail::future_data<hpx::id_type>::try_get_extra_data<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > (this=0x7fffd000f840) at /home/daissgr/workshop/KokkosTiger/src/hpx/libs/full/components/include/hpx/components/client_base.hpp:246
#3  hpx::lcos::detail::future_data<hpx::id_type>::tidy (this=this@entry=0x7fffd000f840) at /home/daissgr/workshop/KokkosTiger/src/hpx/libs/full/components/src/client_base.cpp:48
#4  0x00007ffff710b6de in hpx::lcos::detail::future_data<hpx::id_type>::~future_data (this=0x7fffd000f840, __in_chrg=<optimized out>) at /home/daissgr/workshop/KokkosTiger/src/hpx/libs/full/components/include/hpx/components/client_base.hpp:226
#5  hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void>::~future_data_allocator (this=0x7fffd000f840, __in_chrg=<optimized out>) at /home/daissgr/workshop/KokkosTiger/src/hpx/libs/core/futures/include/hpx/futures/detail/future_data.hpp:730
#6  0x00007ffff71193f0 in std::__new_allocator<hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void> >::destroy<hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void> > (__p=<optimized out>, this=<optimized out>) at /usr/include/c++/12/bits/new_allocator.h:181
#7  std::allocator_traits<std::allocator<hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void> > >::destroy<hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void> > (__p=<optimized out>, __a=...) at /usr/include/c++/12/bits/alloc_traits.h:535
#8  hpx::util::thread_local_caching_allocator<hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void>, std::allocator<hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void> > >::destroy<hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void> > (p=<optimized out>, this=<optimized out>) at /home/daissgr/workshop/KokkosTiger/src/hpx/libs/core/allocator_support/include/hpx/allocator_support/thread_local_caching_allocator.hpp:183
#9  std::allocator_traits<hpx::util::thread_local_caching_allocator<hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void>, std::allocator<hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void> > > >::_S_destroy<hpx::util::thread_local_caching_allocator<hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void>, std::allocator<hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void> > >, hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void> > (__p=<optimized out>, __a=...) at /usr/include/c++/12/bits/alloc_traits.h:272
#10 std::allocator_traits<hpx::util::thread_local_caching_allocator<hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void>, std::allocator<hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void> > > >::destroy<hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void> > (__p=<optimized out>, __a=...) at /usr/include/c++/12/bits/alloc_traits.h:378
#11 hpx::lcos::detail::future_data_allocator<hpx::id_type, hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, void>::destroy (this=0x7fffd000f840) at /home/daissgr/workshop/KokkosTiger/src/hpx/libs/core/futures/include/hpx/futures/detail/future_data.hpp:777
#12 0x00007ffff78c0f56 in hpx::lcos::detail::intrusive_ptr_release (p=<optimized out>) at /home/daissgr/workshop/KokkosTiger/build/hpx/include/hpx/futures/detail/future_data.hpp:145
#13 hpx::lcos::detail::intrusive_ptr_release (p=<optimized out>) at /home/daissgr/workshop/KokkosTiger/build/hpx/include/hpx/futures/detail/future_data.hpp:141
#14 hpx::intrusive_ptr<hpx::lcos::detail::future_data_base<hpx::id_type> >::~intrusive_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /home/daissgr/workshop/KokkosTiger/build/hpx/include/hpx/memory/intrusive_ptr.hpp:88
#15 hpx::intrusive_ptr<hpx::lcos::detail::future_data_base<hpx::id_type> >::reset (this=0x7ffff6294e08) at /home/daissgr/workshop/KokkosTiger/build/hpx/include/hpx/memory/intrusive_ptr.hpp:152
#16 hpx::future<hpx::id_type>::invalidate::~invalidate (this=0x7ffff6294d38, __in_chrg=<optimized out>) at /home/daissgr/workshop/KokkosTiger/build/hpx/include/hpx/futures/future.hpp:560
#17 hpx::future<hpx::id_type>::get (this=0x7ffff6294e08) at /home/daissgr/workshop/KokkosTiger/build/hpx/include/hpx/futures/future.hpp:698
#18 0x00007ffff78eee02 in operator() (__closure=<optimized out>, fut=...) at /home/daissgr/workshop/KokkosTiger/src/octotiger/src/node_server_actions_2.cpp:641
#19 operator() (__closure=<optimized out>) at /home/daissgr/workshop/KokkosTiger/build/hpx/include/hpx/futures/packaged_continuation.hpp:69
#20 hpx::detail::try_catch_exception_ptr<hpx::lcos::detail::invoke_continuation_nounwrap<node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>, hpx::future<long unsigned int>, continuation<hpx::future<long unsigned int>, node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>, node_server*> >(node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>&, hpx::future<long unsigned int>&&, continuation<hpx::future<long unsigned int>, node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>, node_server*>&)::<lambda()>, hpx::lcos::detail::invoke_continuation_nounwrap<node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>, hpx::future<long unsigned int>, continuation<hpx::future<long unsigned int>, node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>, node_server*> >(node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>&, hpx::future<long unsigned int>&&, continuation<hpx::future<long unsigned int>, node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>, node_server*>&)::<lambda(std::__exception_ptr::exception_ptr)> > (c=..., t=...) at /home/daissgr/workshop/KokkosTiger/build/hpx/include/hpx/errors/try_catch_exception_ptr.hpp:37
#21 hpx::lcos::detail::invoke_continuation_nounwrap<node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>, hpx::future<long unsigned int>, hpx::lcos::detail::continuation<hpx::future<long unsigned int>, node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>, node_server*> >(struct {...} &, hpx::future<unsigned long> &&, hpx::lcos::detail::continuation<hpx::future<long unsigned int>, node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>, node_server*> &) (func=..., future=..., cont=warning: RTTI symbol not found for class 'hpx::lcos::detail::continuation_allocator<hpx::util::thread_local_caching_allocator<char, std::allocator<char> >, hpx::future<unsigned long>, node_client::get_ptr() const::{lambda(hpx::future<unsigned long>&&)#1}, node_server*>'
...) at /home/daissgr/workshop/KokkosTiger/build/hpx/include/hpx/futures/packaged_continuation.hpp:57
#22 0x00007ffff78e6ece in hpx::lcos::detail::invoke_continuation<node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>, hpx::future<long unsigned int>, hpx::lcos::detail::continuation<hpx::future<long unsigned int>, node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>, node_server*> > (cont=..., future=..., func=...) at /home/daissgr/workshop/KokkosTiger/build/hpx/include/hpx/futures/packaged_continuation.hpp:85
#23 hpx::lcos::detail::continuation<hpx::future<long unsigned int>, node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>, node_server*>::run_impl<true> (f=..., this=<optimized out>) at /home/daissgr/workshop/KokkosTiger/build/hpx/include/hpx/futures/packaged_continuation.hpp:215
#24 operator() (__closure=<optimized out>) at /home/daissgr/workshop/KokkosTiger/build/hpx/include/hpx/futures/packaged_continuation.hpp:242
#25 hpx::threads::detail::thread_function_nullary<hpx::lcos::detail::continuation<hpx::future<long unsigned int>, node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>, node_server*>::async<true, hpx::lcos::detail::post_policy_spawner>(hpx::traits::detail::shared_state_ptr_for_t<hpx::future<long unsigned int> >&&, hpx::lcos::detail::post_policy_spawner&&)::<lambda()> >::operator() (this=<optimized out>) at /home/daissgr/workshop/KokkosTiger/build/hpx/include/hpx/threading_base/register_thread.hpp:50
#26 hpx::util::detail::callable_vtable<std::pair<hpx::threads::thread_schedule_state, hpx::threads::thread_id>(hpx::threads::thread_restart_state)>::_invoke<hpx::threads::detail::thread_function_nullary<hpx::lcos::detail::continuation<hpx::future<long unsigned int>, node_client::get_ptr() const::<lambda(future<long unsigned int>&&)>, node_server*>::async<true, hpx::lcos::detail::post_policy_spawner>(hpx::traits::detail::shared_state_ptr_for_t<hpx::future<long unsigned int> >&&, hpx::lcos::detail::post_policy_spawner&&)::<lambda()> > >(void *, hpx::threads::thread_restart_state &&) (f=<optimized out>, vs#0=<optimized out>) at /home/daissgr/workshop/KokkosTiger/build/hpx/include/hpx/functional/detail/vtable/callable_vtable.hpp:88
#27 0x00007ffff6d31f49 in hpx::util::detail::basic_function<std::pair<hpx::threads::thread_schedule_state, hpx::threads::thread_id> (hpx::threads::thread_restart_state), false, false>::operator()(hpx::threads::thread_restart_state) const (vs#0=<optimized out>, this=0x7fffa400cea8) at /home/daissgr/workshop/KokkosTiger/src/hpx/libs/core/functional/include/hpx/functional/detail/basic_function.hpp:236
#28 hpx::threads::coroutines::detail::coroutine_impl::operator() (this=0x7fffa400ce40) at /home/daissgr/workshop/KokkosTiger/src/hpx/libs/core/coroutines/src/detail/coroutine_impl.cpp:77
#29 0x00007ffff6d313af in hpx::threads::coroutines::detail::lx::trampoline<hpx::threads::coroutines::detail::coroutine_impl> (fun=<optimized out>) at /home/daissgr/workshop/KokkosTiger/src/hpx/libs/core/coroutines/include/hpx/coroutines/detail/context_linux_x86.hpp:181
#30 0x0000000000000000 in ?? ()

@hkaiser: Do you have any idea what could cause this?

hkaiser commented Jan 13, 2024

> Do you have any idea what could cause this?

This is a use-after-move problem. I created a PR (#6415); please check whether it fixes the issue.
