Skip to content

Commit

Permalink
Keep event thread after execve
Browse files Browse the repository at this point in the history
Currently when a thread from a thread group is doing execve, we expect
that the kernel will reassign its tid at the end to the group leader,
and simulate this behavior in the parser accordingly. The final result
is all the threads in the thread group, except the leader, are removed
from the cache.

But looks like under certain circumstances it's possible to end up in a
situation when the kernel is not doing the reassignment, yet the syscall
ends successfully. This leads to a crash, since the parser removes the
thread associated with the execve_x event, which will be accessed later
during post processing -- and everything is expose in use-after-free.
It's hard to reproduce artificially, but there are crash reports from
the field, demonstrating the problem and confirming the patch fixes the
crash. So far the issue was discovered only on ppc64le (Power10 to be
more precise).

To handle this, keep the event thread in place. Note, that tid here
comes from the BPF probe directly, where it's captured via
bpf_get_current_task/_btf. This means that the tid is the one really
reported by the kernel, so keeping it represents the current state
precisely.

Signed-off-by: Dmitrii Dolgov <[email protected]>
  • Loading branch information
erthalion authored and poiana committed Dec 30, 2024
1 parent 7b9e76f commit 615ecfb
Showing 1 changed file with 11 additions and 1 deletion.
12 changes: 11 additions & 1 deletion userspace/libsinsp/parsers.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2255,8 +2255,18 @@ void sinsp_parser::parse_execve_exit(sinsp_evt *evt) {
auto thread_ptr = thread.lock().get();
/* we don't want to remove the main thread since it is the one
* running in this parser!
*
* Also make sure the thread to be removed is not the one
* associated with the event. Under normal conditions this
* should not happen, since the kernel will reassign tid before
* returning from the exec syscall. But there are crash reports,
* indicating possibility the original tid is kept in place, but
* the syscall still returns a success.
*
* To handle such cases gracefully, keep the event thread.
*/
if(thread_ptr == nullptr || thread_ptr->is_main_thread()) {
if(thread_ptr == nullptr || thread_ptr->is_main_thread() ||
thread_ptr->m_tid == evt->get_tinfo()->m_tid) {
continue;
}
m_inspector->remove_thread(thread_ptr->m_tid);
Expand Down

0 comments on commit 615ecfb

Please sign in to comment.