Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent Premature Thread Exiting #61

Merged
merged 3 commits into from
Apr 25, 2024
Merged

Prevent Premature Thread Exiting #61

merged 3 commits into from
Apr 25, 2024

Conversation

GabTux
Copy link
Contributor

@GabTux GabTux commented Apr 13, 2024

Hi Paul,

Firstly, thank you for the well-written C++ code; thread-pool is an excellent project!

I identified a potential issue where threads in the thread pool might exit prematurely under certain conditions, impacting performance and correctness.

The Scenario:

  • Tasks are submitted and processed by the thread pool.
  • When only one task remains, other threads detect an empty queue and join the main thread.
  • However, the last task might add new tasks.
  • These newly added tasks get processed by only a single thread (the last thread) since others already exited the thread pool.

See the added test case for a simple example.

Proposed Fix:

This merge request introduces an atomic counter to track the total number of tasks. The main thread sleeps on a binary semaphore, which will get signaled by the last thread if there are no other tasks in the thread pool. This ensures all tasks are completed before the main thread signals that all threads should exit.

@DeveloperPaul123
Copy link
Owner

Hello! 👋

Thanks for the PR

When only one task remains, other threads detect an empty queue and join the main thread.

This is not true unless the thread pool is being destructed. In normal situations, the threads that don't have work just wait for new work to come into the queue.

Is this the situation you're trying to address? Where the dtor of the thread_pool is being called but there is pending work and the last thread is adding more work? If that's the case, I'm not sure an additional counter would be needed. Can't we just reuse pending_tasks_ in this case?

Additionally, the unit test that was added is tripping address sanitizer:

==1696272==ERROR: AddressSanitizer: stack-use-after-scope on address 0x00a921afe1d0 at pc 0x7ff7b0f8ec6f bp 0x00a921efef60 sp 0x00a921efef68
READ of size 8 at 0x00a921afe1d0 thread T236
    #0 0x7ff7b0f8ec6e in `dp::thread_pool<std::move_only_function<void __cdecl(void)>,std::jthread>::enqueue_detach<`DOCTEST_ANON_FUNC_36'::`3'::<lambda_2> &,std::reference_wrapper<dp::thread_pool<std::move_only_function<void __cdecl(void)>,std::jthread> > >'::`2'::<lambda_1>::<lambda_1> E:\Repositories\thread-pool\include\thread_pool\thread_pool.h:203
    #1 0x7ff7b0fb2172 in dp::thread_pool<std::move_only_function<void __cdecl(void)>,std::jthread>::enqueue_detach<`DOCTEST_ANON_FUNC_36'::`3'::<lambda_2> &,std::reference_wrapper<dp::thread_pool<std::move_only_function<void __cdecl(void)>,std::jthread> > > E:\Repositories\thread-pool\include\thread_pool\thread_pool.h:195
    #2 0x7ff7b0f89cb7 in `DOCTEST_ANON_FUNC_36'::`3'::<lambda_3>::operator() E:\Repositories\thread-pool\test\source\thread_pool.cpp:375
    #3 0x7ff7b0fbaa54 in std::invoke<`DOCTEST_ANON_FUNC_36'::`3'::<lambda_3> &,std::reference_wrapper<dp::thread_pool<std::move_only_function<void __cdecl(void)>,std::jthread> > &> C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.39.33519\include\type_traits:1739 
    #4 0x7ff7b0f8edce in `dp::thread_pool<std::move_only_function<void __cdecl(void)>,std::jthread>::enqueue_detach<`DOCTEST_ANON_FUNC_36'::`3'::<lambda_3> &,std::reference_wrapper<dp::thread_pool<std::move_only_function<void __cdecl(void)>,std::jthread> > >'::`2'::<lambda_1>::operator() E:\Repositories\thread-pool\include\thread_pool\thread_pool.h:200
    #5 0x7ff7b0fba7c2 in std::invoke<`dp::thread_pool<std::move_only_function<void __cdecl(void)>,std::jthread>::enqueue_detach<`DOCTEST_ANON_FUNC_36'::`3'::<lambda_3> &,std::reference_wrapper<dp::thread_pool<std::move_only_function<void __cdecl(void)>,std::jthread> > >'::`2'::<lambda_1> &> C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.39.33519\include\type_traits:1729
    #6 0x7ff7b0fa356a in std::_Function_inv_small<`dp::thread_pool<std::move_only_function<void __cdecl(void)>,std::jthread>::enqueue_detach<`DOCTEST_ANON_FUNC_36'::`3'::<lambda_3> &,std::reference_wrapper<dp::thread_pool<std::move_only_function<void __cdecl(void)>,std::jthread> > >'::`2'::<lambda_1>,`dp::thread_pool<std::move_only_function<void __cdecl(void)>,std::jthread>::enqueue_detach<`DOCTEST_ANON_FUNC_36'::`3'::<lambda_3> &,std::reference_wrapper<dp::thread_pool<std::move_only_function<void __cdecl(void)>,std::jthread> > >'::`2'::<lambda_1> &,void,0> C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.39.33519\include\functional:1359
    #7 0x7ff7b0fd2a80 in std::_Move_only_function_call<(void)>::operator()(void) C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.39.33519\include\functional:1680
    #8 0x7ff7b0fbaaf2 in std::invoke<class std::move_only_function<(void)>>(class std::move_only_function<(void)> &&) C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.39.33519\include\type_traits:1729
    #9 0x7ff7b0fd1495 in `public: __cdecl dp::thread_pool<class std::move_only_function<void __cdecl(void)>, class std::jthread>::thread_pool<class std::move_only_function<void __cdecl(void)>, class std::jthread>(unsigned int const &)'::`5'::<lambda_1>::operator()(class std::stop_token const &) const E:\Repositories\thread-pool\include\thread_pool\thread_pool.h:54
    #10 0x7ff7b0fbaa9c in std::invoke<class `public: __cdecl dp::thread_pool<class std::move_only_function<void __cdecl(void)>, class std::jthread>::thread_pool<class std::move_only_function<void __cdecl(void)>, class std::jthread>(unsigned int const &)'::`5'::<lambda_1>, class std::stop_token>(class `public: __cdecl dp::thread_pool<class std::move_only_function<void __cdecl(void)>, class std::jthread>::thread_pool<class std::move_only_function<void __cdecl(void)>, class std::jthread>(unsigned int const &)'::`5'::<lambda_1> &&, class std::stop_token &&) C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.39.33519\include\type_traits:1739
    #11 0x7ff7b0fa6a05 in std::thread::_Invoke<class std::tuple<class `public: __cdecl dp::thread_pool<class std::move_only_function<void __cdecl(void)>, class std::jthread>::thread_pool<class std::move_only_function<void __cdecl(void)>, class std::jthread>(unsigned int const &)'::`5'::<lambda_1>, class std::stop_token>, 0, 1>(void *) C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.39.33519\include\thread:60
    #12 0x7ffa203d300f  (C:\Windows\SYSTEM32\ucrtbased.dll+0x1800b300f)
    #13 0x7ff9b7eaebde in __asan::AsanThread::ThreadStart(unsigned __int64) D:\a\_work\1\s\src\vctools\asan\llvm\compiler-rt\lib\asan\asan_thread.cpp:299
    #14 0x7ffa7c31257c  (C:\Windows\System32\KERNEL32.DLL+0x18001257c)
    #15 0x7ffa7c46aa47  (C:\Windows\SYSTEM32\ntdll.dll+0x18005aa47)

@GabTux
Copy link
Contributor Author

GabTux commented Apr 18, 2024

Is this the situation you're trying to address? Where the dtor of the thread_pool is being called but there is pending work and the last thread is adding more work?

Yes, that is almost the situation; sorry, I forgot to mention that the dtor is being called.

Here's a breakdown of the scenario:

  1. Tasks are added to the thread pool with enqueue_detach.
  2. The thread pool's destructor is called, signaling a stop request.
  3. All threads in the thread pool will start processing all added tasks, so the pending_tasks_ == 0.
  4. All threads except one will finish their tasks and exit because pending_tasks_ == 0 and stop was requested.
  5. Crucially, the last remaining thread will add new tasks to the pool.
  6. Since other threads have exited, only the last thread remains to process this newly added tasks, creating a bottleneck.

The test I added is simulating this situation.

If that's the case, I'm not sure an additional counter would be needed. Can't we just reuse pending_tasks_ in this case?

I don't see how we can reuse the pending_tasks_ counter to prevent this scenario. The pending_tasks_ can be 0, but there can be tasks that are still being processed and may generate new tasks.

Additionally, the unit test that was added is tripping address sanitizer:

Thanks, I fixed it in new commit.

@DeveloperPaul123
Copy link
Owner

Ok thanks for the clarification and fix for the unit test!

I don't see how we can reuse the pending_tasks_ counter to prevent this scenario. The pending_tasks_ can be 0, but there can be tasks that are still being processed and may generate new tasks.

Maybe I'm missing something, but the main difference between total_tasks_ and pending_tasks_ is where the value is decremented. pending_tasks_ is decremented before task invocation, whereas the total_tasks_ is decremented after the task is done. Couldn't we just use pending_tasks_ and just change where it is decremented to have the same effect?

I'm going to play around with you PR and see if I can make that happen. If so, your PR may also work nicely as the base for #23

@GabTux
Copy link
Contributor Author

GabTux commented Apr 18, 2024

Couldn't we just use pending_tasks_ and just change where it is decremented to have the same effect?

You are right; we can use just the pending_tasks_ counter. However, I was concerned about potential performance implications. If I understand the code correctly, the worker threads keep checking the queues to get a task if they are already awake (not sleeping on a binary semaphore), and pending_tasks_ is non-zero. If we decrement the pending_tasks_ after task invocation, it could lead to busy waiting and possible unnecessary CPU drainage.

Consider this scenario: all threads are awake, the queues are empty, but one thread is finishing a long-running task. The pending_tasks_ remains non-zero, causing unnecessary spinning.

That is why I chose two counters, but looking back, it does not seem like an elegant solution.

I'm going to play around with you PR and see if I can make that happen. If so, your PR may also work nicely as the base for #23

Sounds great, thanks!

@DeveloperPaul123
Copy link
Owner

@GabTux See #62 for my updates to your PR. I will likely close this PR in favor of that one. Maybe we can continue our discussion there.

In fact, if you are able to test that PR, I would really appreciate it.

@DeveloperPaul123 DeveloperPaul123 merged commit 8814340 into DeveloperPaul123:master Apr 25, 2024
3 of 4 checks passed
@DeveloperPaul123
Copy link
Owner

Welp I accidentally pushed this to my master 🤦‍♂️ But I intended to merge it anyway. I renamed the counter variables and added a new subcase for the unit test you added. Thanks again for the PR!

DeveloperPaul123 added a commit that referenced this pull request Apr 25, 2024
Added `wait_for_tasks()` feature, building on top of the work done in #61. This simply refactors some of that code into a public method that users can call to block the current thread and wait for all tasks to complete.
@GabTux
Copy link
Contributor Author

GabTux commented Apr 25, 2024

You are welcome. I agree that the names for the counter variables are more precise now. Thank you for merging my contribution!

@DeveloperPaul123
Copy link
Owner

Thank you for your PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants