Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(libsinsp): safer parameter handling part 2 #1502

Merged
merged 7 commits into from
Nov 27, 2023

Conversation

LucaGuerra
Copy link
Contributor

@LucaGuerra LucaGuerra commented Nov 22, 2023

What type of PR is this?

Uncomment one (or more) /kind <> lines:

/kind cleanup

Any specific area of the project related to this PR?

Uncomment one (or more) /area <> lines:

/area libsinsp

Does this PR require a change in the driver versions?

No

What this PR does / why we need it:

This PR continues an effort to clean up libsinsp from potential undefined behavior (see: #1470 ). It implements the following refactoring changes:

  • Added some const correctness around the code that handles sinsp_evt_params. When some raw parameter data is read from the driver it is never modified and so it should be const when processed.
  • Modified the sinsp_evt class and sinsp_evt_param. Now sinsp_evt_param has more reasons to exist because it is no longer simply a "buffer + length" wrapper but it is able to convert values to appropriate types (with the as() method) and also point back to the original event
  • Now we can use std::string_view in the template parameter to read string buffers
  • All uses of get_param(n)<T> are updated to get_param(n)->as<T>
  • The function get_param_as_str is updated to remove some UB according to the refactoring above

cc @FedeDP @gnosek

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

Copy link
Contributor

@gnosek gnosek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Words can't express how beautiful that is (even if slightly more verbose). Just a few minor comments

}

size_t string_len = strnlen(param.m_val, param.m_len);
if (param.m_len == string_len)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you mean != here or is this e.g. a test for the NUL terminator? If so, we could use a comment why the difference compared to the generic version.

(but also, I don't think string_view requires NUL termination, it would defeat the purpose)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is indeed a NUL terminator check. Sometimes the drivers will output a NUL terminated string and I feel we need a type for that, and that's what I'm using std::string_view for (i.e., this will interpret one parameter that contains a NUL-terminated string).

The resulting string view will not contain the null terminator within its bounds but if you call .data() it will return a NUL-terminated const char*. We wouldn't have this additional requirement if we were using std::string around libsinsp, but many functions right now do ask for a const char*.

This works, but does not feel ideal, I'm open to suggestions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we're technically doing an out of bounds read when treating the string_view as a NUL-terminated string (but it's fine because we know there's a NUL just outside), right?

I'm fine with this (this-is-fine.jpg) but:

  • please add a comment about it
  • I'd explicitly check param.m_val[param.m_len] == 0 instead (feels more direct) but that does change the behavior for NUL-separated string arrays (which we use in a few places) -- we'd start accepting them as string_views. I assume we don't want them as string views but I could also use a comment to that effect :)

Copy link
Contributor Author

@LucaGuerra LucaGuerra Nov 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doing an out of bounds read

Indeed we read within the event but outside the string_view which we guarantee to be NUL-terminated just outside. If we were using a NUL-terminated string_view we would have troubles converting to std::string because the terminator would be added to the string itself. Unless you can think of a way of having both right at the same time. The other alternative is to just copy the string.

Anyways I agree with you that the right way of fixing it is to avoid const char* as much as possible and use strings instead.

I have clarified the comment and changed the error condition, I think we simply expect the string to occupy the entire value except for the null terminator, all other cases should be handled differently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, no, I do not advocate for NUL-terminated string views, using the NUL just outside it is a great hack in the circumstances we're in ;)

👍


/* If the pathname is `<NA>` here we shouldn't have problems during `parse_dirfd`.
* It doesn't start with "/" so it is not considered an absolute path.
*/
std::string sdir;
parse_dirfd(evt, pathname, dirfd, &sdir);
parse_dirfd(evt, pathname.data(), dirfd, &sdir);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...but for example this call might need a NUL (same for any other use of .data())

way out of scope but I'd love to see more string_views, fewer explicit const char*, size_t pairs (or even worse, raw char*s)

@@ -2925,3 +2863,20 @@ std::optional<std::reference_wrapper<std::string>> sinsp_evt::get_enter_evt_para

return ret;
}

std::string sinsp_evt_param::invalid_len_error(size_t requested_length) const
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for gcc we should be able to add an __attribute((cold)) to prevent inlining in hot paths

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done some more experimentation. I know this is not pretty but by moving everything including the throw to a separate method we actually get the desired result (inlined _as() function with a call to the exception method, without that sometimes it would not be inlined). I also added the cold attribute although I'm not completely sure if it changes anything.

while(true)
{
uint32_t blen = binary_buffer_to_string(&m_paramstr_storage[0],
payload,
param->m_val,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could consider reintroducing the payload and payload_len variables to reduce the size of the diff, but I'm fine either way

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm good with param->m_val because it feels more like an "immutable" variable coming from somewhere else rather than something you can do clever tricks with (like payload was before)... if this makes sense 😅


return ret;
}
const sinsp_evt_param* get_param(uint32_t id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the vast majority of cases, id is a compile time constant. not sure whether making it a template parameter would improve code generation but maybe it would be worth a try if you can easily test this (run size on a binary before and after the change)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not sure, but I'm ok with keeping all accesses the same (without the compile time distinction, maybe if we want to experiment a constexpr could be enough for some cases?) for now so we can move on with continuing the saga

userspace/libsinsp/event.cpp Outdated Show resolved Hide resolved
@LucaGuerra LucaGuerra force-pushed the refactor-param-get-part-2 branch from bd64fbd to c187c55 Compare November 23, 2023 09:30
@LucaGuerra
Copy link
Contributor Author

/milestone 0.14.0

@poiana poiana added this to the 0.14.0 milestone Nov 23, 2023
@LucaGuerra LucaGuerra force-pushed the refactor-param-get-part-2 branch 2 times, most recently from fe238ce to 29dbd45 Compare November 24, 2023 11:23
Copy link
Contributor

@FedeDP FedeDP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really love this! Thank you very much Luca!
I only left a question, but this LGTM!

@LucaGuerra LucaGuerra force-pushed the refactor-param-get-part-2 branch from 64a3075 to 166887f Compare November 24, 2023 15:42
@FedeDP
Copy link
Contributor

FedeDP commented Nov 24, 2023

Uh conflict :) then i'll approve!

@LucaGuerra LucaGuerra force-pushed the refactor-param-get-part-2 branch from 166887f to 0f16eb2 Compare November 24, 2023 16:26
@LucaGuerra
Copy link
Contributor Author

rebased!

FedeDP
FedeDP previously approved these changes Nov 24, 2023
Copy link
Contributor

@FedeDP FedeDP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@poiana
Copy link
Contributor

poiana commented Nov 24, 2023

LGTM label has been added.

Git tree hash: 1cae1a479602401684ef7ed34599c63d5745a833

@LucaGuerra
Copy link
Contributor Author

Ah, need to fix a couple of const. I had rebased but not tried the build 😂
image

... will be back.

@LucaGuerra
Copy link
Contributor Author

ok, fixed!

Copy link
Contributor

@FedeDP FedeDP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@poiana poiana added the lgtm label Nov 24, 2023
@poiana
Copy link
Contributor

poiana commented Nov 24, 2023

LGTM label has been added.

Git tree hash: 93b5501f61951077d13dcd2065c8a34696d83ea6

Copy link
Contributor

@gnosek gnosek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SHIP IT

@poiana
Copy link
Contributor

poiana commented Nov 27, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: FedeDP, gnosek, LucaGuerra

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [FedeDP,LucaGuerra,gnosek]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@poiana poiana merged commit a894e67 into falcosecurity:master Nov 27, 2023
@LucaGuerra LucaGuerra deleted the refactor-param-get-part-2 branch November 27, 2023 17:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants