Flat buffers #416

Open
wants to merge 45 commits into
base: master

Collaborator

@boocmp boocmp commented Jan 21, 2025

This pull request marks the first stage of optimizing the ad blocker’s memory usage. It introduces the ability to store network request blocking rules in FlatBuffers. At this stage, memory consumption is reduced by half (a 2x improvement) with the same lookup performance. Future enhancements will focus on optimizations like eliminating HashMaps, aiming to achieve a 3x overall improvement in memory efficiency.

  1. The test code has been moved to the tests/unit/ folder, replicating the src directory structure. In my opinion, this improves the efficiency of working with the codebase.
  1. Rust formatter. The formatter has been applied to all files to ensure consistent formatting throughout the codebase.
  1. The filter parser code has been moved to a separate file, abstract_network.rs. This reduces the amount of code in the network.rs file.
  1. A separate file has been created for RegexManager. This is preparation for breaking the direct dependency of RegexManager on NetworkFilter.
  1. The functions for finding a filter based on a network request have been moved to a separate file, network_matchers.rs. In the same commit, loops in the matching functions were replaced with any and all, and a trait AnyOrExt was introduced to add an any_or method to iterators. However, this turned out not to be a good solution, and in subsequent commits it was removed in favor of early returns.
  1. Intermediate fix for project compilation.
  1. First version of the FlatBuffers storage. Additionally, the AnyOrExt trait has been removed, and early return has been used instead.
  1. A url_lower_cased field has been added to the struct Request. This reduces the number of memory allocations during filter lookup and speeds up the matching. A struct Tokens has been introduced, and tokens are now calculated once and stored as a field in the struct Request. This simplifies and speeds up the code. A method checkable_tokens_iter has been added to Request that returns the sequence of tokens that need to be checked. The object-pooling feature has been removed.
  1. The struct NetworkFilterList has been moved from blocker.rs to a separate file, network_filter_list.rs. Corresponding tests have been relocated to tests/unit/network_filter_list.rs.
  1. Memory usage in FlatBuffers has been optimized. Include and exclude domains are now stored as u16 indices into a separate sorted array, unique_domains_hashes, instead of as u64 hashes.
  1. Preparation for integrating FlatNetworkFilter into Blocker. Added a NetworkFilterMaskHelper trait, into which the boolean getters have been moved.
  1. Removed copy-paste loops from check, check_all, and filter_exists. Now a single loop iterates over the sequence returned by Request::checkable_tokens_iter().
  1. Blocker has been renamed to GenericBlocker<NetworkFilterListType>, which is parameterized by the type of list: NetworkFilterList or FlatNetworkFilterList. Added an implementation for FlatNetworkFilterList.
  1. Fixed FlatNetworkFilterList so that its speed and filter lookup results match those of the implementation based on NetworkFilterList.
  1. Improving the speed of filter construction for FlatNetworkFilterList. Rules are optimized during creation. Now FlatNetworkFilterList doesn't depend on NetworkFilterList.
  1. Adding the FlatBuffers feature, fixing tests, and modifying CI steps
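
To illustrate the domain-index item above, here is a minimal sketch of how u16 indices into a sorted table of u64 hashes could work. Only the name unique_domains_hashes comes from this description; the functions build_unique_domains_hashes, domain_index and include_matches are hypothetical helpers written for this example, not the code in the PR. Each include or exclude entry shrinks from 8 bytes to 2, at the cost of one binary search per request domain.

```rust
use std::convert::TryFrom;

/// Builds the sorted, deduplicated table of domain hashes.
/// Hypothetical helper; only the table name comes from the PR description.
fn build_unique_domains_hashes(all_hashes: &[u64]) -> Vec<u64> {
    let mut unique = all_hashes.to_vec();
    unique.sort_unstable();
    unique.dedup();
    // A u16 index can address at most 65,536 entries.
    assert!(unique.len() <= usize::from(u16::MAX) + 1);
    unique
}

/// Maps a domain hash to its u16 index by binary search on the sorted table.
/// Returns None if the hash is not in the table (or the index does not fit).
fn domain_index(unique_domains_hashes: &[u64], hash: u64) -> Option<u16> {
    unique_domains_hashes
        .binary_search(&hash)
        .ok()
        .and_then(|i| u16::try_from(i).ok())
}

/// Returns true if any of the request's domain hashes is listed in `include`,
/// where `include` holds u16 indices into the table rather than raw hashes.
fn include_matches(
    unique_domains_hashes: &[u64],
    include: &[u16],
    request_hashes: &[u64],
) -> bool {
    for &hash in request_hashes {
        if let Some(idx) = domain_index(unique_domains_hashes, hash) {
            if include.contains(&idx) {
                // Early return instead of an iterator adapter.
                return true;
            }
        }
    }
    false
}
```

A request hash that is absent from the table cannot match any filter's include or exclude list, so the lookup can stop right after the binary search in that case.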

@boocmp boocmp self-assigned this Jan 21, 2025
@boocmp boocmp force-pushed the flat_buffers branch 7 times, most recently from e515e7b to 2dca480 Compare January 24, 2025 15:45
boocmp added 21 commits January 29, 2025 13:47
…oved direct deps on NetworkFilter from RegexManager and matchers.

@github-actions github-actions bot left a comment


Rust Benchmark

| Benchmark suite | Current: a2a8f5a | Previous: 7919bdd | Ratio |
|---|---|---|---|
| rule-match-browserlike/brave-list | 1917794686 ns/iter (± 20902834) | 1745226241 ns/iter (± 10688991) | 1.10 |
| rule-match-first-request/brave-list | 1052414 ns/iter (± 13996) | 1003256 ns/iter (± 7610) | 1.05 |
| blocker_new/brave-list | 159015429 ns/iter (± 3655404) | 210108247 ns/iter (± 7007989) | 0.76 |
| memory-usage/brave-list-initial | 21457739 ns/iter (± 3) | 41409969 ns/iter (± 3) | 0.52 |
| memory-usage/brave-list-after-1000-requests | 24064706 ns/iter (± 3) | 44005995 ns/iter (± 3) | 0.55 |

This comment was automatically generated by workflow using github-action-benchmark.
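
For reading the table, a small sketch of the arithmetic, assuming the Ratio column is the current value divided by the previous one (this assumption agrees with every row above). The helpers ratio and round2 are hypothetical names for this example. Under that reading, values below 1.0 mean the current commit measures lower than the previous one.

```rust
/// Ratio as assumed for the table: current / previous.
/// Below 1.0 means the current commit measures lower than the previous one.
fn ratio(current: u64, previous: u64) -> f64 {
    current as f64 / previous as f64
}

/// Rounds to two decimal places, the precision shown in the table.
fn round2(x: f64) -> f64 {
    (x * 100.0).round() / 100.0
}
```

Applied to the memory-usage/brave-list-initial row, round2(ratio(21457739, 41409969)) gives 0.52, which corresponds to the roughly 2x memory reduction stated in the description.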

@boocmp boocmp marked this pull request as ready for review February 4, 2025 12:02
2 participants