
wip: new(driver/modern_bpf): home-made bpf_loop for sendmmsg and recvmmsg. #2233

Open · wants to merge 1 commit into master from new/homemade_bpf_loop
Conversation

@FedeDP (Contributor) commented Jan 15, 2025

What type of PR is this?

/kind feature

Any specific area of the project related to this PR?

/area driver-modern-bpf

Does this PR require a change in the driver versions?

What this PR does / why we need it:

For sendmmsg and recvmmsg in the modern_bpf probe we could not use the bpf_loop helper, because it caused verifier issues on kernels prior to 5.13 (see #2027 (comment)).
Therefore we used a loop capped at 16 iterations; we then noticed that 16 was too high, so we lowered the limit to 8 (600fefb).
This means we can only read the first 8 messages sent through sendmmsg and recvmmsg.

This PR's scope is to increase the limit up to 256 (in this first proposed draft, I cap it at 64).
The idea is to build an X macro that lets us easily chain tail calls.
The eBPF tail-call limit is MAX_TAIL_CALL_CNT, i.e. 32 on older kernels and 33 nowadays. For now, I capped the implementation at 8 chained tail calls (each tail call loops 8 times, thus 64 total).

The good: we can support up to 8x32 = 256 messages (a far better situation than now).
The bad: we need to extract network args at each iteration, because we can't share state between tail calls.
The ugly: well, the code gets a bit convoluted, but it does the trick. It's just plain old C anyway :)

Note also that, to make it a bit less verbose, I could create a new header with all the macros and share it between the two source files, since they are basically identical (naming aside). I decided to keep it as simple as possible, but it is a possibility (and it would be more future proof, since we would have a single place to update if needed).

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Putting it in WIP to allow a discussion.
THIS IS NOT FOR 0.20.0.

Does this PR introduce a user-facing change?:

NONE

@FedeDP (Contributor, PR author) commented Jan 15, 2025

/milestone TBD

@poiana poiana added this to the TBD milestone Jan 15, 2025
@poiana poiana added the size/L label Jan 15, 2025
@poiana (Contributor) commented Jan 15, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: FedeDP

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@poiana poiana requested review from hbrueckner and leogr January 15, 2025 08:08
@FedeDP (Contributor, PR author) commented Jan 15, 2025

I want to hear @Andreagit97 and @Molter73 opinions on this one :)

github-actions bot commented Jan 15, 2025

Perf diff from master - unit tests

Warning:
Processed 36803 events and lost 1 chunks!

Check IO/CPU overload!

    11.27%     -0.69%  [.] sinsp::next
     1.53%     +0.52%  [.] next
     1.75%     -0.41%  [.] libsinsp::sinsp_suppress::process_event
     5.75%     -0.26%  [.] next_event_from_file
     0.47%     +0.26%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>

Heap diff from master - unit tests

peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Heap diff from master - scap file

peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Benchmarks diff from master

Comparing gbench_data.json to /root/actions-runner/_work/libs/libs/build/gbench_data.json
Benchmark                                                         Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------
BM_sinsp_split_mean                                            +0.0125         +0.0125           146           148           146           148
BM_sinsp_split_median                                          +0.0168         +0.0168           146           148           146           148
BM_sinsp_split_stddev                                          +0.7774         +0.7748             1             1             1             1
BM_sinsp_split_cv                                              +0.7555         +0.7529             0             0             0             0
BM_sinsp_concatenate_paths_relative_path_mean                  -0.0333         -0.0332            61            59            61            59
BM_sinsp_concatenate_paths_relative_path_median                -0.0336         -0.0336            61            59            61            59
BM_sinsp_concatenate_paths_relative_path_stddev                +0.5974         +0.5989             1             1             1             1
BM_sinsp_concatenate_paths_relative_path_cv                    +0.6523         +0.6539             0             0             0             0
BM_sinsp_concatenate_paths_empty_path_mean                     -0.0452         -0.0452            25            24            25            24
BM_sinsp_concatenate_paths_empty_path_median                   -0.0437         -0.0437            25            24            25            24
BM_sinsp_concatenate_paths_empty_path_stddev                   -0.6335         -0.6332             0             0             0             0
BM_sinsp_concatenate_paths_empty_path_cv                       -0.6162         -0.6158             0             0             0             0
BM_sinsp_concatenate_paths_absolute_path_mean                  -0.0114         -0.0114            64            63            64            63
BM_sinsp_concatenate_paths_absolute_path_median                -0.0136         -0.0136            64            63            64            63
BM_sinsp_concatenate_paths_absolute_path_stddev                +0.8003         +0.8003             0             1             0             1
BM_sinsp_concatenate_paths_absolute_path_cv                    +0.8210         +0.8210             0             0             0             0
BM_sinsp_split_container_image_mean                            -0.0052         -0.0051           390           388           390           388
BM_sinsp_split_container_image_median                          -0.0034         -0.0034           389           388           389           388
BM_sinsp_split_container_image_stddev                          -0.1510         -0.1514             3             3             3             3
BM_sinsp_split_container_image_cv                              -0.1466         -0.1470             0             0             0             0

codecov bot commented Jan 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 75.09%. Comparing base (8362ae9) to head (8e15871).
Report is 5 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2233   +/-   ##
=======================================
  Coverage   75.09%   75.09%           
=======================================
  Files         276      276           
  Lines       34391    34391           
  Branches     5927     5927           
=======================================
  Hits        25826    25826           
  Misses       8565     8565           
Flag Coverage Δ
libsinsp 75.09% <ø> (ø)

Flags with carried forward coverage won't be shown.


github-actions bot commented Jan 15, 2025

X64 kernel testing matrix

KERNEL CMAKE-CONFIGURE KMOD BUILD KMOD SCAP-OPEN BPF-PROBE BUILD BPF-PROBE SCAP-OPEN MODERN-BPF SCAP-OPEN
amazonlinux2-4.19 🟢 🟢 🟢 🟢 🟢 🟡
amazonlinux2-5.10 🟢 🟢 🟢 🟢 🟢 🟢
amazonlinux2-5.15 🟢 🟢 🟢 🟢 🟢 🟢
amazonlinux2-5.4 🟢 🟢 🟢 🟢 🟢 🟡
amazonlinux2022-5.15 🟢 🟢 🟢 🟢 🟢 🟢
amazonlinux2023-6.1 🟢 🟢 🟢 🟢 🟢 🟢
archlinux-6.0 🟢 🟢 🟢 🟢 🟢 🟢
archlinux-6.7 🟢 🟢 🟢 🟢 🟢 🟢
centos-3.10 🟢 🟢 🟢 🟡 🟡 🟡
centos-4.18 🟢 🟢 🟢 🟢 🟢 🟢
centos-5.14 🟢 🟢 🟢 🟢 🟢 🟢
fedora-5.17 🟢 🟢 🟢 🟢 🟢 🟢
fedora-5.8 🟢 🟢 🟢 🟢 🟢 🟢
fedora-6.2 🟢 🟢 🟢 🟢 🟢 🟢
oraclelinux-3.10 🟢 🟢 🟢 🟡 🟡 🟡
oraclelinux-4.14 🟢 🟢 🟢 🟢 🟢 🟡
oraclelinux-5.15 🟢 🟢 🟢 🟢 🟢 🟢
oraclelinux-5.4 🟢 🟢 🟢 🟢 🟢 🟡
ubuntu-4.15 🟢 🟢 🟢 🟢 🟢 🟡
ubuntu-5.8 🟢 🟢 🟢 🟢 🟢 🟡
ubuntu-6.5 🟢 🟢 🟢 🟢 🟢 🟢

ARM64 kernel testing matrix

KERNEL CMAKE-CONFIGURE KMOD BUILD KMOD SCAP-OPEN BPF-PROBE BUILD BPF-PROBE SCAP-OPEN MODERN-BPF SCAP-OPEN
amazonlinux2-5.4 🟢 🟢 🟢 🟢 🟢 🟡
amazonlinux2022-5.15 🟢 🟢 🟢 🟢 🟢 🟢
fedora-6.2 🟢 🟢 🟢 🟢 🟢 🟢
oraclelinux-4.14 🟢 🟢 🟢 🟡 🟡 🟡
oraclelinux-5.15 🟢 🟢 🟢 🟢 🟢 🟢
ubuntu-6.5 🟢 🟢 🟢 🟢 🟢 🟢

@FedeDP (Contributor, PR author) commented Jan 15, 2025

CI Build / run-e2e-tests-amd64 (bundled_deps) (pull_request) Failing after 21m

Need to understand what broke e2e-tests.

@FedeDP FedeDP force-pushed the new/homemade_bpf_loop branch from dd30172 to 5664af4 Compare January 15, 2025 11:00
With this one weird trick, bpf hates us!

Signed-off-by: Federico Di Pierro <[email protected]>
@FedeDP FedeDP force-pushed the new/homemade_bpf_loop branch from 5664af4 to 8e15871 Compare January 15, 2025 11:03
@Andreagit97 (Member) left a comment

The approach seems great, thank you! If we want to avoid the X macro, we can just use a single eBPF program as the tail-call target and save the state in a per-CPU map/array, so that each tail-called program can see the iteration number. In the end it should change almost nothing; it would probably just be easier to debug.

@FedeDP (Contributor, PR author) commented Jan 15, 2025

If we create a per-CPU map, we could also stop re-building the recvmmsg_data_t structure in each tail-called program, avoiding the unneeded extract__network_args calls.
I will look into that; for a first draft I wanted to avoid any additional map and see how it went.

3 participants