fix(userspace): solving batch of recent regressions #1524

Merged
merged 3 commits into from
Nov 29, 2023
4 changes: 2 additions & 2 deletions userspace/libscap/linux/scap_procs.c
@@ -585,15 +585,15 @@ static int32_t scap_proc_add_from_proc(struct scap_linux_platform* linux_platfor
f = fopen(filename, "r");
if(f == NULL)
{
- return SCAP_SUCCESS;
+ return scap_errprintf(error, errno, "can't find valid proc dir in %s", dir_name);
Contributor Author

In the past, returning SCAP_SUCCESS was fair enough: the result thread info was allocated further down, so the caller could still check for a NULL out thread info. Now we can't rely on a NULL result, since the caller is responsible for allocating it. As a consequence, for invalid thread IDs we always returned bogus scap thread infos. IMO this is easily fixable by returning SCAP_FAILURE here, since these are actual failure scenarios.

}

ASSERT(sizeof(line) >= SCAP_MAX_PATH_SIZE);

if(fgets(line, SCAP_MAX_PATH_SIZE, f) == NULL)
{
fclose(f);
- return SCAP_SUCCESS;
+ return scap_errprintf(error, errno, "can't read cmdline file %s", filename);
}
else
{
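
The review comment on this file argues that once the caller owns the output struct, the return code is the only failure signal left. A minimal C++ sketch of that point follows. The names `status`, `thread_info` and `read_thread_info`, and the use of a `comm` file, are hypothetical stand-ins for illustration and are not the libscap API.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-ins for the libscap status codes and thread info struct.
enum status { STATUS_SUCCESS = 0, STATUS_FAILURE = 1 };

struct thread_info {
    long tid = -1;
    std::string comm;
};

// Fills a caller-allocated `out`. Since `out` always exists, the caller cannot
// detect a failed lookup by checking for NULL; only the return code can say so.
status read_thread_info(const std::string& proc_dir, long tid, thread_info& out) {
    assert(!proc_dir.empty());
    const std::string path = proc_dir + "/" + std::to_string(tid) + "/comm";
    std::FILE* f = std::fopen(path.c_str(), "r");
    if (f == nullptr) {
        // Returning STATUS_SUCCESS here would leave `out` untouched, and the
        // caller would treat its default contents as a real thread.
        return STATUS_FAILURE;
    }
    char line[256];
    if (std::fgets(line, sizeof(line), f) == nullptr) {
        std::fclose(f);
        return STATUS_FAILURE;
    }
    std::fclose(f);
    out.tid = tid;
    out.comm = line;  // keeps the trailing newline, if any
    return STATUS_SUCCESS;
}
```

With this shape, a caller that passes an invalid thread ID gets `STATUS_FAILURE` back and can discard the struct instead of inserting a bogus entry.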
4 changes: 2 additions & 2 deletions userspace/libsinsp/container.cpp
@@ -88,7 +88,7 @@ bool sinsp_container_manager::remove_inactive_containers()
});

auto containers = m_containers.lock();
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
Member

In line 69 we dereference the inspector 🤔

	if(m_inspector->m_lastevent_ts >
		m_last_flush_time_ns + m_inspector->m_inactive_container_scan_time_ns)

Not sure whether we want to add an extra check above as well, or remove these checks as before.

Member

It is also used at line 66.

Contributor Author

We need to check for m_inspector != nullptr before dereferencing the stats_v2 buffer. I was able to reproduce a segfault in some common, legitimate code paths. cc @incertum

Contributor

@incertum, Nov 29, 2023

Thank you @jasondellaluce. Yes, in that original PR we discussed checking m_inspector, but the preference was to not do so for the most part. Glad you caught this one instance where we should have kept it ❤️!

{
m_inspector->m_sinsp_stats_v2->m_n_missing_container_images = 0;
// Will include pod sanboxes, but that's ok
@@ -97,7 +97,7 @@ bool sinsp_container_manager::remove_inactive_containers()
for(auto it = containers->begin(); it != containers->end();)
{
sinsp_container_info::ptr_t container = it->second;
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
auto container_info = container.get();
if (!container_info || (container_info && !container_info->m_is_pod_sandbox && container_info->m_image.empty()))
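
The guard discussed in the review thread relies on `&&` short-circuiting: the right-hand operand is evaluated only when the inspector pointer is non-null. Below is a minimal sketch of the pattern. The types `stats_v2`, `inspector` and `fd_table`, and the use of `std::shared_ptr` for the stats buffer, are simplified assumptions for illustration and are not the libsinsp definitions.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical, simplified stand-ins for the types touched by the diff.
struct stats_v2 {
    uint64_t n_cached_fd_lookups = 0;
};

struct inspector {
    std::shared_ptr<stats_v2> stats;  // empty when stats collection is off
};

struct fd_table {
    inspector* m_inspector = nullptr;  // may be null, e.g. a standalone table

    // Returns true if the counter was incremented.
    bool count_cached_lookup() {
        // `&&` short-circuits: m_inspector->stats is read only when
        // m_inspector is non-null, so a null inspector is never dereferenced.
        if (m_inspector != nullptr && m_inspector->stats) {
            m_inspector->stats->n_cached_fd_lookups++;
            return true;
        }
        return false;
    }
};
```

Both "no inspector" and "inspector without stats" fall through to the same no-op path, which is the behavior the added checks aim for.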
12 changes: 6 additions & 6 deletions userspace/libsinsp/fdinfo.cpp
@@ -286,7 +286,7 @@ sinsp_fdinfo_t* sinsp_fdtable::find(int64_t fd)
//
if(m_last_accessed_fd != -1 && fd == m_last_accessed_fd)
{
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_cached_fd_lookups++;
}
@@ -300,15 +300,15 @@ sinsp_fdinfo_t* sinsp_fdtable::find(int64_t fd)

if(fdit == m_table.end())
{
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_failed_fd_lookups++;
}
return NULL;
}
else
{
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_noncached_fd_lookups++;
}
@@ -340,7 +340,7 @@ sinsp_fdinfo_t* sinsp_fdtable::add(int64_t fd, sinsp_fdinfo_t* fdinfo)
// No entry in the table, this is the normal case
//
m_last_accessed_fd = -1;
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_added_fds++;
}
@@ -412,15 +412,15 @@ void sinsp_fdtable::erase(int64_t fd)
// keep going.
//
ASSERT(false);
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_failed_fd_lookups++;
}
}
else
{
m_table.erase(fdit);
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_noncached_fd_lookups++;
m_inspector->m_sinsp_stats_v2->m_n_removed_fds++;
18 changes: 9 additions & 9 deletions userspace/libsinsp/parsers.cpp
@@ -677,7 +677,7 @@ bool sinsp_parser::reset(sinsp_evt *evt)
etype == PPME_SYSCALL_VFORK_20_X ||
etype == PPME_SYSCALL_CLONE3_X)
{
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_failed_thread_lookups--;
}
@@ -826,7 +826,7 @@ void sinsp_parser::store_event(sinsp_evt *evt)
// we won't be able to parse the corresponding exit event and we'll have
// to drop the information it carries.
//
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_store_evts_drops++;
}
@@ -862,7 +862,7 @@ void sinsp_parser::store_event(sinsp_evt *evt)
memcpy(tinfo->m_lastevent_data, evt->m_pevt, elen);
tinfo->m_lastevent_cpuid = evt->get_cpuid();

- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_stored_evts++;
}
@@ -887,7 +887,7 @@ bool sinsp_parser::retrieve_enter_event(sinsp_evt *enter_evt, sinsp_evt *exit_ev
// This happen especially at the beginning of trace files, where events
// can be truncated
//
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_retrieve_evts_drops++;
}
@@ -910,7 +910,7 @@ bool sinsp_parser::retrieve_enter_event(sinsp_evt *enter_evt, sinsp_evt *exit_ev
&&
enter_evt->get_type() == PPME_SYSCALL_EXECVEAT_E)
{
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_retrieved_evts++;
}
@@ -925,13 +925,13 @@ bool sinsp_parser::retrieve_enter_event(sinsp_evt *enter_evt, sinsp_evt *exit_ev
{
//ASSERT(false);
exit_evt->m_tinfo->set_lastevent_data_validity(false);
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_retrieve_evts_drops++;
}
return false;
}
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_retrieved_evts++;
}
@@ -3654,13 +3654,13 @@ void sinsp_parser::parse_close_exit(sinsp_evt *evt)
// It is normal when a close fails that the fd lookup failed, so we revert the
// increment of m_n_failed_fd_lookups (for the enter event too if there's one).
//
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_failed_fd_lookups--;
}
if(evt->m_tinfo && evt->m_tinfo->is_lastevent_data_valid())
{
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_failed_fd_lookups--;
}
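
The code comment in the `parse_close_exit` hunk describes reverting a failure counter when the failed lookup was expected. A rough sketch of that bookkeeping is shown here. `lookup_stats`, `fd_table` and `on_failed_close` are hypothetical names, the signed counter type is an assumption, and the real parser's control flow is more involved than this.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical counter block; the signed type is an assumption.
struct lookup_stats {
    int64_t n_failed_fd_lookups = 0;
};

struct fd_table {
    std::unordered_map<int64_t, int> table;
    lookup_stats* stats = nullptr;  // may be null when stats are disabled

    // Every miss is counted as a failed lookup.
    const int* find(int64_t fd) {
        auto it = table.find(fd);
        if (it == table.end()) {
            if (stats != nullptr) {
                stats->n_failed_fd_lookups++;
            }
            return nullptr;
        }
        return &it->second;
    }
};

// When close() itself failed, a missing fd is expected, so the miss that
// find() just counted is reverted to keep the failure metric meaningful.
void on_failed_close(fd_table& fds, int64_t fd) {
    if (fds.find(fd) == nullptr && fds.stats != nullptr) {
        fds.stats->n_failed_fd_lookups--;
    }
}
```

The net effect is that expected misses on a failed close leave the counter unchanged, while every other miss is still counted.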
2 changes: 1 addition & 1 deletion userspace/libsinsp/sinsp.cpp
@@ -399,6 +399,7 @@ void sinsp::open_common(scap_open_args* oargs, const struct scap_vtable* vtable,
throw scap_open_exception(error, scap_rc);
}

+ m_platform = platform;
scap_rc = scap_platform_init(platform, m_platform_lasterr, m_h->m_engine, oargs);
if(scap_rc != SCAP_SUCCESS)
{
@@ -409,7 +410,6 @@ void sinsp::open_common(scap_open_args* oargs, const struct scap_vtable* vtable,

throw scap_open_exception(m_platform_lasterr, scap_rc);
}
- m_platform = platform;

init();

12 changes: 6 additions & 6 deletions userspace/libsinsp/threadinfo.cpp
@@ -1473,7 +1473,7 @@ bool sinsp_thread_manager::add_thread(sinsp_threadinfo *threadinfo, bool from_sc
#endif
)
{
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
// rate limit messages to avoid spamming the logs
if (m_inspector->m_sinsp_stats_v2->m_n_drops_full_threadtable % m_max_thread_table_size == 0)
@@ -1505,7 +1505,7 @@ bool sinsp_thread_manager::add_thread(sinsp_threadinfo *threadinfo, bool from_sc
tinfo_shared_ptr->compute_program_hash();
m_threadtable.put(std::move(tinfo_shared_ptr));

- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_added_threads++;
}
@@ -1777,7 +1777,7 @@ void sinsp_thread_manager::remove_thread(int64_t tid)
* the cache just to be sure.
*/
m_last_tid = -1;
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_removed_threads++;
}
@@ -2112,7 +2112,7 @@ threadinfo_map_t::ptr_t sinsp_thread_manager::find_thread(int64_t tid, bool look
thr = m_last_tinfo.lock();
if (thr)
{
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_cached_thread_lookups++;
}
@@ -2130,7 +2130,7 @@ threadinfo_map_t::ptr_t sinsp_thread_manager::find_thread(int64_t tid, bool look

if(thr)
{
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_noncached_thread_lookups++;
}
@@ -2145,7 +2145,7 @@ threadinfo_map_t::ptr_t sinsp_thread_manager::find_thread(int64_t tid, bool look
}
else
{
- if (m_inspector->m_sinsp_stats_v2)
+ if (m_inspector != nullptr && m_inspector->m_sinsp_stats_v2)
{
m_inspector->m_sinsp_stats_v2->m_n_failed_thread_lookups++;
}