Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(helpers): kernel_symbols change and fix #399

Merged
merged 3 commits into from
Jan 11, 2024
Merged

fix(helpers): kernel_symbols change and fix #399

merged 3 commits into from
Jan 11, 2024

Conversation

rafaeldtinoco
Copy link
Contributor

The whole "lazy" concept for the kallsyms file relies in the concept that symbols would have a single address only (and the assumption that the file is mostly sorted by symbol addresses).

Checking kernel_symbols.go and the KernelSymbolTable interface, one could change GetSymbolByName to return a slice of []*KernelSymbols and that would be an easy change for fullKernelSystemTable.

Problem is that the lazy implementation relies in stopping to read the kallsym file once a symbol is picked, or in a binary search of address considering file is sorted, etc. That doesn't work well for the same symbol having more than 1 address.

Quickly checking kallsyms for duplicate symbol addresses, the generic names will have huge amount of duplicates:

1693 func.0
1198 _entry.1
834 func.2
777 _entry.3
...

and, the "unique kernel symbols" will have 2 or 3 addresses (under certain circumstances, like when the symbol is static to a source file, or under certain compilation optimizations):

2 switch_mm
2 sw_fence_dummy_notify
2 suspend_attrs
2 suspend_attr_group
2 subsystem_id_show
2 str__i915__trace_system_name
...

There is also the case where the same address has multiple symbols:

ffffffffc0310200 b __key.22 [drm_display_helper]
...
ffffffffc0310200 b __key.17 [drm_display_helper]
ffffffffc0310200 b drm_dp_aux_dev_class [drm_display_helper] ...

So in both cases, when indexing by sym name, or by sym address, code should account for the possibility of having multiple results.

This change makes the helper "fast enough" while allowing it to return multiple values from its maps.

Related: aquasecurity/tracee#3798

@rafaeldtinoco rafaeldtinoco requested a review from geyslan January 11, 2024 11:07
@rafaeldtinoco
Copy link
Contributor Author

@geyslan I need this to be merged so I can add the rest of the code (a kprobe attachment function that allows specifying the offset of the symbol to attach to) and a selftest of both.

geyslan
geyslan previously approved these changes Jan 11, 2024
Copy link
Member

@geyslan geyslan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rafaeldtinoco this is an amazing work. LGTM.

I've put some comments.

helpers/kernel_symbols.go Outdated Show resolved Hide resolved
libbpfgo.c Outdated Show resolved Hide resolved
helpers/kernel_symbols.go Outdated Show resolved Hide resolved
helpers/kernel_symbols.go Show resolved Hide resolved
helpers/kernel_symbols.go Show resolved Hide resolved
helpers/kernel_symbols.go Outdated Show resolved Hide resolved
helpers/kernel_symbols.go Show resolved Hide resolved
Comment on lines +136 to +139
// Simple file parsing (no processing) takes ~0.200 seconds on a 4-core machine.
// If buffer is increased to 4MB, it might take ~0.150 seconds. A simple
// mono-threaded implementation that parses + processes the lines takes ~0.700
// seconds. This approach takes ~0.350 seconds (2x speedup).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing! It's a fantastic improvement. Would you mind bringing benchmark test files for both cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not doing performance tuning/benchmarking for this change (since Im actually removing one attempt to do so that caused the original problem because of wrong premises). Be my guest to open an issue and benchmark any optimization to this code =).

prog.go Outdated Show resolved Hide resolved
The whole "lazy" concept for the kallsyms file relies in the concept that
symbols would have a single address only (and the assumption that the file is
mostly sorted by symbol addresses).

Checking kernel_symbols.go and the KernelSymbolTable interface, one could
change GetSymbolByName to return a slice of []*KernelSymbols and that would be
an easy change for fullKernelSystemTable.

Problem is that the lazy implementation relies in stopping to read the kallsym
file once a symbol is picked, or in a binary search of address considering file
is sorted, etc. That doesn't work well for the same symbol having more than 1
address.

Quickly checking kallsyms for duplicate symbol addresses, the generic names
will have huge amount of duplicates:

1693 __func__.0
1198 _entry.1
834 __func__.2
777 _entry.3
...

and, the "unique kernel symbols" will have 2 or 3 addresses (under certain
circumstances, like when the symbol is static to a source file, or under
certain compilation optimizations):

2 switch_mm
2 sw_fence_dummy_notify
2 suspend_attrs
2 suspend_attr_group
2 subsystem_id_show
2 str__i915__trace_system_name
...

There is also the case where the same address has multiple symbols:

ffffffffc0310200 b __key.22	[drm_display_helper]
...
ffffffffc0310200 b __key.17	[drm_display_helper]
ffffffffc0310200 b drm_dp_aux_dev_class	[drm_display_helper]
...

So in both cases, when indexing by sym name, or by sym address, code should
account for the possibility of having multiple results.

This change makes the helper "fast enough" while allowing it to return multiple
values from its maps.

Related: aquasecurity/tracee#3798
This method allows the kprobe to be attached not only by symbol name (already
supported) but by the kernel symbol offset as well. The offset should be read
from /proc/kallsyms file before method is invoked.
@rafaeldtinoco
Copy link
Contributor Author

Thanks for the careful review @geyslan. I think I have addressed most of your comments! I'll merge once the new push passes the tests and then focus in the tracee changes.

@rafaeldtinoco rafaeldtinoco merged commit 90dbfff into aquasecurity:main Jan 11, 2024
14 checks passed
@rafaeldtinoco rafaeldtinoco deleted the ksymbolschange branch January 11, 2024 22:02
rafaeldtinoco added a commit to aquasecurity/tracee that referenced this pull request Jan 18, 2024
yanivagman pushed a commit to yanivagman/tracee that referenced this pull request Jun 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants