-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(helpers): kernel_symbols change and fix #399
fix(helpers): kernel_symbols change and fix #399
Conversation
@geyslan I need this to be merged so I can add the rest of the code (a kprobe attachment function that allows specifying the offset of the symbol to attach to) and a selftest of both. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rafaeldtinoco this is an amazing work. LGTM.
I've put some comments.
// Simple file parsing (no processing) takes ~0.200 seconds on a 4-core machine. | ||
// If buffer is increased to 4MB, it might take ~0.150 seconds. A simple | ||
// mono-threaded implementation that parses + processes the lines takes ~0.700 | ||
// seconds. This approach takes ~0.350 seconds (2x speedup). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is amazing! It's a fantastic improvement. Would you mind bringing benchmark test files for both cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not doing performance tuning/benchmarking for this change (since Im actually removing one attempt to do so that caused the original problem because of wrong premises). Be my guest to open an issue and benchmark any optimization to this code =).
The whole "lazy" concept for the kallsyms file relies in the concept that symbols would have a single address only (and the assumption that the file is mostly sorted by symbol addresses). Checking kernel_symbols.go and the KernelSymbolTable interface, one could change GetSymbolByName to return a slice of []*KernelSymbols and that would be an easy change for fullKernelSystemTable. Problem is that the lazy implementation relies in stopping to read the kallsym file once a symbol is picked, or in a binary search of address considering file is sorted, etc. That doesn't work well for the same symbol having more than 1 address. Quickly checking kallsyms for duplicate symbol addresses, the generic names will have huge amount of duplicates: 1693 __func__.0 1198 _entry.1 834 __func__.2 777 _entry.3 ... and, the "unique kernel symbols" will have 2 or 3 addresses (under certain circumstances, like when the symbol is static to a source file, or under certain compilation optimizations): 2 switch_mm 2 sw_fence_dummy_notify 2 suspend_attrs 2 suspend_attr_group 2 subsystem_id_show 2 str__i915__trace_system_name ... There is also the case where the same address has multiple symbols: ffffffffc0310200 b __key.22 [drm_display_helper] ... ffffffffc0310200 b __key.17 [drm_display_helper] ffffffffc0310200 b drm_dp_aux_dev_class [drm_display_helper] ... So in both cases, when indexing by sym name, or by sym address, code should account for the possibility of having multiple results. This change makes the helper "fast enough" while allowing it to return multiple values from its maps. Related: aquasecurity/tracee#3798
This method allows the kprobe to be attached not only by symbol name (already supported) but by the kernel symbol offset as well. The offset should be read from /proc/kallsyms file before method is invoked.
Thanks for the careful review @geyslan. I think I have addressed most of your comments! I'll merge once the new push passes the tests and then focus in the tracee changes. |
The whole "lazy" concept for the kallsyms file relies in the concept that symbols would have a single address only (and the assumption that the file is mostly sorted by symbol addresses).
Checking kernel_symbols.go and the KernelSymbolTable interface, one could change GetSymbolByName to return a slice of []*KernelSymbols and that would be an easy change for fullKernelSystemTable.
Problem is that the lazy implementation relies in stopping to read the kallsym file once a symbol is picked, or in a binary search of address considering file is sorted, etc. That doesn't work well for the same symbol having more than 1 address.
Quickly checking kallsyms for duplicate symbol addresses, the generic names will have huge amount of duplicates:
1693 func.0
1198 _entry.1
834 func.2
777 _entry.3
...
and, the "unique kernel symbols" will have 2 or 3 addresses (under certain circumstances, like when the symbol is static to a source file, or under certain compilation optimizations):
2 switch_mm
2 sw_fence_dummy_notify
2 suspend_attrs
2 suspend_attr_group
2 subsystem_id_show
2 str__i915__trace_system_name
...
There is also the case where the same address has multiple symbols:
ffffffffc0310200 b __key.22 [drm_display_helper]
...
ffffffffc0310200 b __key.17 [drm_display_helper]
ffffffffc0310200 b drm_dp_aux_dev_class [drm_display_helper] ...
So in both cases, when indexing by sym name, or by sym address, code should account for the possibility of having multiple results.
This change makes the helper "fast enough" while allowing it to return multiple values from its maps.
Related: aquasecurity/tracee#3798