-
Notifications
You must be signed in to change notification settings - Fork 171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(libsinsp): safer parameter handling part 2 #1502
refactor(libsinsp): safer parameter handling part 2 #1502
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Words can't express how beautiful that is (even if slightly more verbose). Just a few minor comments
userspace/libsinsp/event.h
Outdated
} | ||
|
||
size_t string_len = strnlen(param.m_val, param.m_len); | ||
if (param.m_len == string_len) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you mean !=
here or is this e.g. a test for the NUL terminator? If so, we could use a comment why the difference compared to the generic version.
(but also, I don't think string_view requires NUL termination, it would defeat the purpose)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is indeed a NUL terminator check. Sometimes the drivers will output a NUL terminated string and I feel we need a type for that, and that's what I'm using std::string_view
for (i.e., this will interpret one parameter that contains a NUL-terminated string).
The resulting string view will not contain the null terminator within its bounds but if you call .data()
it will return a NUL-terminated const char*
. We wouldn't have this additional requirement if we were using std::string
around libsinsp, but many functions right now do ask for a const char*
.
This works, but does not feel ideal, I'm open to suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we're technically doing an out of bounds read when treating the string_view as a NUL-terminated string (but it's fine because we know there's a NUL just outside), right?
I'm fine with this (this-is-fine.jpg) but:
- please add a comment about it
- I'd explicitly check
param.m_val[param.m_len] == 0
instead (feels more direct) but that does change the behavior for NUL-separated string arrays (which we use in a few places) -- we'd start accepting them as string_views. I assume we don't want them as string views but I could also use a comment to that effect :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doing an out of bounds read
Indeed we read within the event but outside the string_view
which we guarantee to be NUL-terminated just outside. If we were using a NUL-terminated string_view
we would have troubles converting to std::string
because the terminator would be added to the string itself. Unless you can think of a way of having both right at the same time. The other alternative is to just copy the string.
Anyways I agree with you that the right way of fixing it is to avoid const char*
as much as possible and use strings instead.
I have clarified the comment and changed the error condition, I think we simply expect the string to occupy the entire value except for the null terminator, all other cases should be handled differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, no, I do not advocate for NUL-terminated string views, using the NUL just outside it is a great hack in the circumstances we're in ;)
👍
|
||
/* If the pathname is `<NA>` here we shouldn't have problems during `parse_dirfd`. | ||
* It doesn't start with "/" so it is not considered an absolute path. | ||
*/ | ||
std::string sdir; | ||
parse_dirfd(evt, pathname, dirfd, &sdir); | ||
parse_dirfd(evt, pathname.data(), dirfd, &sdir); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...but for example this call might need a NUL (same for any other use of .data())
way out of scope but I'd love to see more string_views, fewer explicit const char*, size_t pairs (or even worse, raw char*s)
userspace/libsinsp/event.cpp
Outdated
@@ -2925,3 +2863,20 @@ std::optional<std::reference_wrapper<std::string>> sinsp_evt::get_enter_evt_para | |||
|
|||
return ret; | |||
} | |||
|
|||
std::string sinsp_evt_param::invalid_len_error(size_t requested_length) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for gcc we should be able to add an __attribute((cold))
to prevent inlining in hot paths
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have done some more experimentation. I know this is not pretty but by moving everything including the throw
to a separate method we actually get the desired result (inlined _as()
function with a call to the exception method, without that sometimes it would not be inlined). I also added the cold
attribute although I'm not completely sure if it changes anything.
while(true) | ||
{ | ||
uint32_t blen = binary_buffer_to_string(&m_paramstr_storage[0], | ||
payload, | ||
param->m_val, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could consider reintroducing the payload and payload_len variables to reduce the size of the diff, but I'm fine either way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm good with param->m_val
because it feels more like an "immutable" variable coming from somewhere else rather than something you can do clever tricks with (like payload
was before)... if this makes sense 😅
|
||
return ret; | ||
} | ||
const sinsp_evt_param* get_param(uint32_t id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the vast majority of cases, id
is a compile time constant. not sure whether making it a template parameter would improve code generation but maybe it would be worth a try if you can easily test this (run size
on a binary before and after the change)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not sure, but I'm ok with keeping all accesses the same (without the compile time distinction, maybe if we want to experiment a constexpr
could be enough for some cases?) for now so we can move on with continuing the saga
bd64fbd
to
c187c55
Compare
/milestone 0.14.0 |
fe238ce
to
29dbd45
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really love this! Thank you very much Luca!
I only left a question, but this LGTM!
64a3075
to
166887f
Compare
Uh conflict :) then i'll approve! |
Signed-off-by: Luca Guerra <[email protected]>
Signed-off-by: Luca Guerra <[email protected]>
166887f
to
0f16eb2
Compare
rebased! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
LGTM label has been added. Git tree hash: 1cae1a479602401684ef7ed34599c63d5745a833
|
Signed-off-by: Luca Guerra <[email protected]>
Signed-off-by: Luca Guerra <[email protected]>
Signed-off-by: Luca Guerra <[email protected]>
… inlining Signed-off-by: Luca Guerra <[email protected]>
Signed-off-by: Luca Guerra <[email protected]>
0f16eb2
to
5e155ed
Compare
ok, fixed! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
LGTM label has been added. Git tree hash: 93b5501f61951077d13dcd2065c8a34696d83ea6
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SHIP IT
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: FedeDP, gnosek, LucaGuerra The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind cleanup
Any specific area of the project related to this PR?
/area libsinsp
Does this PR require a change in the driver versions?
No
What this PR does / why we need it:
This PR continues an effort to clean up libsinsp from potential undefined behavior (see: #1470 ). It implements the following refactoring changes:
sinsp_evt_param
s. When some raw parameter data is read from the driver it is never modified and so it should beconst
when processed.sinsp_evt
class andsinsp_evt_param
. Nowsinsp_evt_param
has more reasons to exist because it is no longer simply a "buffer + length" wrapper but it is able to convert values to appropriate types (with theas()
method) and also point back to the original eventstd::string_view
in the template parameter to read string buffersget_param(n)<T>
are updated toget_param(n)->as<T>
get_param_as_str
is updated to remove some UB according to the refactoring abovecc @FedeDP @gnosek
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?: