fix(helpers): kernel_symbols change and fix #399

rafaeldtinoco · 2024-01-11T11:06:53Z

The whole "lazy" concept for the kallsyms file relies on the assumption that each symbol has a single address only (and on the assumption that the file is mostly sorted by symbol address).

Checking kernel_symbols.go and the KernelSymbolTable interface, one could change GetSymbolByName to return a slice of []*KernelSymbols and that would be an easy change for fullKernelSystemTable.

The problem is that the lazy implementation relies on stopping reading the kallsyms file once a symbol is found, or on a binary search by address that assumes the file is sorted, etc. Neither works well when the same symbol has more than one address.

Quickly checking kallsyms for symbols with more than one address, the generic names have a huge number of duplicates:

1693 `__func__.0`
1198 `_entry.1`
834 `__func__.2`
777 `_entry.3`
...

and even the supposedly "unique" kernel symbols can have 2 or 3 addresses (under certain circumstances, such as when the symbol is static to a source file, or under certain compiler optimizations):

2 switch_mm
2 sw_fence_dummy_notify
2 suspend_attrs
2 suspend_attr_group
2 subsystem_id_show
2 str__i915__trace_system_name
...
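A minimal Go sketch of how such a duplicate count could be produced. The `countNames` helper and the sample lines are hypothetical (the addresses are made up), and the real listing above was presumably generated from the full /proc/kallsyms file:

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// countNames returns how many times each symbol name appears in
// kallsyms-formatted text ("address type name [module]").
func countNames(kallsyms string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(kallsyms))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 3 {
			continue // skip malformed lines
		}
		counts[fields[2]]++
	}
	return counts
}

// sample is made-up data for illustration only.
const sample = `ffffffff81000000 t switch_mm
ffffffff81000100 T switch_mm
ffffffff81000200 T do_sys_open
ffffffffc0310200 b __key.22 [drm_display_helper]
`

func main() {
	counts := countNames(sample)

	// Collect only the names that occur more than once.
	names := make([]string, 0, len(counts))
	for name, n := range counts {
		if n > 1 {
			names = append(names, name)
		}
	}
	sort.Strings(names)

	for _, name := range names {
		fmt.Printf("%d %s\n", counts[name], name)
	}
}
```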

There is also the case where the same address has multiple symbols:

ffffffffc0310200 b __key.22 [drm_display_helper]
...
ffffffffc0310200 b __key.17 [drm_display_helper]
ffffffffc0310200 b drm_dp_aux_dev_class [drm_display_helper]
...

So in both cases, when indexing by sym name, or by sym address, code should account for the possibility of having multiple results.

This change makes the helper "fast enough" while allowing it to return multiple values from its maps.

Related: aquasecurity/tracee#3798

rafaeldtinoco · 2024-01-11T13:47:10Z

@geyslan I need this to be merged so I can add the rest of the code (a kprobe attachment function that allows specifying the offset of the symbol to attach to) and a selftest of both.

geyslan

@rafaeldtinoco this is amazing work. LGTM.

I've put some comments.

geyslan · 2024-01-11T19:51:54Z

helpers/kernel_symbols.go

+// Simple file parsing (no processing) takes ~0.200 seconds on a 4-core machine.
+// If buffer is increased to 4MB, it might take ~0.150 seconds. A simple
+// mono-threaded implementation that parses + processes the lines takes ~0.700
+// seconds. This approach takes ~0.350 seconds (2x speedup).


This is amazing! It's a fantastic improvement. Would you mind bringing benchmark test files for both cases?

I'm not doing performance tuning/benchmarking for this change (since I'm actually removing an earlier attempt to do so, which caused the original problem because it was built on wrong premises). Be my guest to open an issue and benchmark any optimization to this code =).

rafaeldtinoco · 2024-01-11T21:57:08Z

Thanks for the careful review @geyslan. I think I have addressed most of your comments! I'll merge once the new push passes the tests and then focus on the tracee changes.

rafaeldtinoco requested a review from geyslan January 11, 2024 11:07

This was referenced Jan 11, 2024

KernelSymbolTable helper is conceptually broken aquasecurity/tracee#3798

Closed

Cannot create probe on symbols with duplicate entries in kallsyms (multiple addresses) in recent kernels aquasecurity/tracee#3653

Closed

geyslan previously approved these changes Jan 11, 2024

rafaeldtinoco added 3 commits January 11, 2024 18:54

feature: add AttachKprobeOffset method to bpf programs.

a879333

This method allows the kprobe to be attached not only by symbol name (already supported) but also by the kernel symbol offset. The offset should be read from the /proc/kallsyms file before the method is invoked.

selftest: add selftest for AttachKprobeOffset

1f069b9

rafaeldtinoco dismissed geyslan’s stale review via 1f069b9 January 11, 2024 21:56

rafaeldtinoco merged commit 90dbfff into aquasecurity:main Jan 11, 2024
14 checks passed

rafaeldtinoco deleted the ksymbolschange branch January 11, 2024 22:02

rafaeldtinoco mentioned this pull request Jan 11, 2024

Fix symbol multi addrs aquasecurity/tracee#3802

Merged

rafaeldtinoco added a commit to aquasecurity/tracee that referenced this pull request Jan 18, 2024

fix(utils): fix kallsyms package for multi address symbols

f3f0692

Fixes: #3798 Related: aquasecurity/libbpfgo#399

yanivagman pushed a commit to yanivagman/tracee that referenced this pull request Jun 24, 2024

fix(utils): fix kallsyms package for multi address symbols

e806baf

Fixes: aquasecurity#3798 Related: aquasecurity/libbpfgo#399
