Flat buffers #416

Open
wants to merge 45 commits into
base: master
5d2f0d8
unit tests from network.rs moved to tests/unit/filters/network.rs
boocmp Jan 8, 2025
2892b25
unit tests from cosmetic.rs moved to tests/unit/filters/cosmetic.rs
boocmp Jan 8, 2025
ae41283
Removed unused files from url_parser.
boocmp Jan 8, 2025
b61adfe
unit tests from resource_storage.rs moved to tests/unit/filters/resou…
boocmp Jan 8, 2025
0d10ae9
unit tests from resource_assembler.rs moved to tests/unit/filters/res…
boocmp Jan 8, 2025
168ad77
unit tests from blocker.rs moved to tests/unit/blocker.rs
boocmp Jan 8, 2025
49e74da
unit tests from regex_manager.rs moved to tests/unit/regex_manager.rs
boocmp Jan 8, 2025
f586683
unit tests from content_blocking.rs moved to tests/unit/content_block…
boocmp Jan 8, 2025
e236f2e
unit tests from cosmetic_filter_cache.rs moved to tests/unit/cosmetic…
boocmp Jan 8, 2025
cbfd5d5
unit tests from engine.rs moved to tests/unit/engine.rs
boocmp Jan 8, 2025
0972777
unit tests from lists.rs moved to tests/unit/lists.rs
boocmp Jan 8, 2025
34ff68a
unit tests from request.rs moved to tests/unit/request.rs
boocmp Jan 8, 2025
140fba3
unit tests from optimizer.rs moved to tests/unit/optimizer.rs
boocmp Jan 8, 2025
9558e26
unit tests from utils.rs moved to tests/unit/utils.rs
boocmp Jan 8, 2025
a6fd9a5
Rust formatter.
boocmp Jan 21, 2025
39ffc44
AbstractNetworkFilter moved to abstract_network.rs.
boocmp Jan 8, 2025
175698e
The regex stuff moved from network.rs to regex_manager.rs
boocmp Jan 9, 2025
f837228
Key type of compiled regexes map changed to u64.
boocmp Jan 9, 2025
1a41937
Filter matching functions have been moved to network_matchers.rs. Rem…
boocmp Jan 9, 2025
a9f3a71
Unit tests compilation and Rust-fmt. Matchers tests moved from networ…
boocmp Jan 9, 2025
1c979ce
Added flatbuffer network filters implementation.
boocmp Jan 9, 2025
5c976b4
Added url_lower_cased member in request to prevent the memory allocat…
boocmp Jan 9, 2025
7403547
Removed "object-pooling". request_tokens is a part of Request now.
boocmp Jan 12, 2025
f42ffed
Added checkable_tokens_iter
boocmp Jan 14, 2025
3e323b9
Tests compilation.
boocmp Jan 21, 2025
a03a531
NetworkFilterList moved to network_filter_list.rs
boocmp Jan 21, 2025
a5c4c1c
Added flatbuffer structure for network filters.
boocmp Jan 15, 2025
94a3382
Added NetworkFilterMaskHelper trait to provide bool getters for diffe…
boocmp Jan 16, 2025
7a6476c
Fixed fb_network.rs to match flat schema.
boocmp Jan 21, 2025
669ef56
Added NetworkFilterListTrait to provide flat impl later.
boocmp Jan 21, 2025
df8828b
Added list type generic parameter to the Blocker.
boocmp Jan 21, 2025
f98c9e1
Simplified NetworkFilterList impl.
boocmp Jan 21, 2025
f239499
Added FlatNetworkFilterList. Added FlatNetworkFilter. Implemented mat…
boocmp Jan 21, 2025
b66d65f
Temporarily using NetworkFilterList for filters optimization.
boocmp Jan 21, 2025
fff0b48
Fixed filter's unique key for regex manager.
boocmp Jan 23, 2025
81a9343
Optimization for FlatNetworkFilterList.
boocmp Jan 23, 2025
92fcc49
Added check for flatbuffers feature.
boocmp Jan 24, 2025
bc8d2e3
Wrong conflicts due to the rebase. Fixed.
boocmp Jan 29, 2025
880d83c
Rust fmt after rebase.
boocmp Jan 29, 2025
d2f45e5
Enable flatbuffers feature in perf CI
atuchin-m Jan 29, 2025
09b7b88
Added Serialize trait for engine.
boocmp Jan 29, 2025
7b5cfcb
Tests fixing and disabling.
boocmp Jan 30, 2025
a9521b8
Added build & test steps in GHWF.
boocmp Jan 30, 2025
a3159df
Disabled serialization tests for flatbuffers feature.
boocmp Jan 30, 2025
a2a8f5a
Changed 'Run Brave-specific tests' CI step.
boocmp Jan 30, 2025
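
Commits 669ef56 and df8828b describe a trait for network filter lists and a list-type generic parameter on the Blocker. The pattern can be sketched roughly as below, under stated assumptions: the trait name `FilterList`, its single `matches` method, the substring matching, and both list types are hypothetical stand-ins, not the PR's actual `NetworkFilterListTrait`, `NetworkFilterList`, or `FlatNetworkFilterList`. The "flat" variant only imitates the contiguous-buffer idea and does not use the `flatbuffers` crate.

```rust
/// Hypothetical stand-in for a filter-list trait; the real trait's
/// methods are not shown in this excerpt.
pub trait FilterList {
    fn matches(&self, url: &str) -> bool;
}

/// Plain in-memory list: one heap-allocated `String` per pattern.
pub struct VecList {
    patterns: Vec<String>,
}

impl VecList {
    pub fn new(patterns: &[&str]) -> Self {
        Self {
            patterns: patterns.iter().map(|p| p.to_string()).collect(),
        }
    }
}

impl FilterList for VecList {
    fn matches(&self, url: &str) -> bool {
        self.patterns.iter().any(|p| url.contains(p.as_str()))
    }
}

/// "Flat" list: every pattern is packed into one contiguous buffer and
/// addressed by (offset, length) pairs instead of separate allocations.
pub struct FlatList {
    buffer: String,
    spans: Vec<(usize, usize)>,
}

impl FlatList {
    pub fn new(patterns: &[&str]) -> Self {
        let mut buffer = String::new();
        let mut spans = Vec::with_capacity(patterns.len());
        for p in patterns {
            // Offsets always fall on pattern boundaries, so slicing is safe.
            spans.push((buffer.len(), p.len()));
            buffer.push_str(p);
        }
        Self { buffer, spans }
    }
}

impl FilterList for FlatList {
    fn matches(&self, url: &str) -> bool {
        self.spans
            .iter()
            .any(|&(start, len)| url.contains(&self.buffer[start..start + len]))
    }
}

/// The blocker is generic over the list type, so either backend can be
/// selected at compile time without changing the matching call sites.
pub struct Blocker<L: FilterList> {
    list: L,
}

impl<L: FilterList> Blocker<L> {
    pub fn new(list: L) -> Self {
        Self { list }
    }

    pub fn should_block(&self, url: &str) -> bool {
        self.list.matches(url)
    }
}
```

With this shape, `Blocker::new(VecList::new(&patterns))` and `Blocker::new(FlatList::new(&patterns))` expose the same `should_block` call, which is the property a generic list parameter is meant to provide.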
10 changes: 8 additions & 2 deletions .github/workflows/ci.yml
@@ -30,6 +30,9 @@ jobs:
  - name: Cargo build 'adblock' package
    run: cargo build --all-features --all-targets

+ - name: Cargo build 'adblock' package (default features)
+   run: cargo build --all-targets
+
  - name: Cargo build 'adblock' package (no default features)
    run: cargo build --no-default-features --all-targets
@@ -61,6 +64,9 @@ jobs:
  - name: Cargo test 'adblock' package
    run: cargo test --all-features --tests --no-fail-fast

+ - name: Cargo test 'adblock' package (default features)
+   run: cargo test --tests --no-fail-fast
+
  - name: Cargo test 'adblock' package (no default features)
    run: cargo test --no-default-features --features embedded-domain-resolver,full-regex-handling --tests --no-fail-fast
@@ -79,7 +85,7 @@ jobs:
  # This hackily checks that the filter is working.
  # If this check fails, something might have been renamed inadvertantly.
  echo "Ensure that '$TEST_NAME_FILTER' still matches exactly 2 tests."
- cargo test --all-features --test live --no-fail-fast -- --ignored "$TEST_NAME_FILTER" --list | grep "2 tests, 0 benchmarks"
+ cargo test --test live --no-fail-fast -- --ignored "$TEST_NAME_FILTER" --list | grep "2 tests, 0 benchmarks"

  # Now run the tests
- cargo test --all-features --test live --no-fail-fast -- --ignored "$TEST_NAME_FILTER"
+ cargo test --test live --no-fail-fast -- --ignored "$TEST_NAME_FILTER"
8 changes: 4 additions & 4 deletions .github/workflows/perf-ci.yml
@@ -26,16 +26,16 @@ jobs:
    uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4

  - name: Bench network filter matching
-   run: cargo bench --bench bench_matching rule-match-browserlike/brave-list -- --output-format bencher | tee -a output.txt
+   run: cargo bench --bench bench_matching --features flatbuffers rule-match-browserlike/brave-list -- --output-format bencher | tee -a output.txt

  - name: Bench first request matching delay
-   run: cargo bench --bench bench_matching rule-match-first-request -- --output-format bencher | tee -a output.txt
+   run: cargo bench --bench bench_matching --features flatbuffers rule-match-first-request -- --output-format bencher | tee -a output.txt

  - name: Bench startup speed
-   run: cargo bench --bench bench_rules blocker_new/brave-list -- --output-format bencher | tee -a output.txt
+   run: cargo bench --bench bench_rules --features flatbuffers blocker_new/brave-list -- --output-format bencher | tee -a output.txt

  - name: Bench memory usage
-   run: cargo bench --bench bench_memory -- --output-format bencher | tee -a output.txt
+   run: cargo bench --bench bench_memory --features flatbuffers -- --output-format bencher | tee -a output.txt

  - name: Store benchmark result
    uses: benchmark-action/github-action-benchmark@d48d326b4ca9ba73ca0cd0d59f108f9e02a381c7 # v1.20.4
11 changes: 11 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

5 changes: 5 additions & 0 deletions Cargo.toml
@@ -41,6 +41,7 @@ cssparser = { version = "0.28", optional = true }
  selectors = { version = "0.23", optional = true }
  serde_json = "1.0"
  thiserror = "1.0"
+ flatbuffers = "24.12.23"

  [dev-dependencies]
  criterion = "0.5"
@@ -55,6 +56,9 @@ sha2 = "0.9"
  [lib]
  bench = false

+ [profile.bench]
+ debug = true
+
  [[bench]]
  name = "bench_regex"
  harness = false
@@ -98,3 +102,4 @@ css-validation = ["cssparser", "selectors"]
  content-blocking = []
  embedded-domain-resolver = ["addr"] # Requires setting an external domain resolver if disabled.
  resource-assembler = []
+ flatbuffers = []
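
The new `flatbuffers = []` entry is a Cargo feature with no dependencies of its own, so its only effect is to switch on code guarded by `cfg(feature = "flatbuffers")`. A minimal sketch of such gating follows; the function `storage_backend` is a hypothetical name used for illustration and does not appear in the PR.

```rust
/// Compiled only when the crate is built with `--features flatbuffers`.
/// (`storage_backend` is a hypothetical name, not part of the PR.)
#[cfg(feature = "flatbuffers")]
pub fn storage_backend() -> &'static str {
    "flatbuffers"
}

/// Fallback compiled when the feature is off, e.g. a default build.
#[cfg(not(feature = "flatbuffers"))]
pub fn storage_backend() -> &'static str {
    "default"
}
```

Exactly one of the two definitions exists in any given build, which is why the CI changes above build and test the default feature set separately from `--all-features`.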
16 changes: 2 additions & 14 deletions benches/bench_cosmetic_matching.rs
@@ -57,13 +57,7 @@ fn by_classes_ids(c: &mut Criterion) {
  let (_, cosmetic_filters) = parse_filters(&rules, false, FilterFormat::Standard);
  let cfcache = CosmeticFilterCache::from_rules(cosmetic_filters);
  let exceptions = Default::default();
- b.iter(|| {
- cfcache.hidden_class_id_selectors(
- &["ad"],
- &["ad"],
- &exceptions,
- )
- })
+ b.iter(|| cfcache.hidden_class_id_selectors(&["ad"], &["ad"], &exceptions))
  });
  group.bench_function("many lists", move |b| {
  let rules = rules_from_lists(&[
@@ -75,13 +69,7 @@ fn by_classes_ids(c: &mut Criterion) {
  let (_, cosmetic_filters) = parse_filters(&rules, false, FilterFormat::Standard);
  let cfcache = CosmeticFilterCache::from_rules(cosmetic_filters);
  let exceptions = Default::default();
- b.iter(|| {
- cfcache.hidden_class_id_selectors(
- &["ad"],
- &["ad"],
- &exceptions,
- )
- })
+ b.iter(|| cfcache.hidden_class_id_selectors(&["ad"], &["ad"], &exceptions))
  });
  group.bench_function("many matching classes and ids", move |b| {
  let rules = rules_from_lists(&[
66 changes: 29 additions & 37 deletions benches/bench_matching.rs
@@ -2,11 +2,11 @@ use criterion::*;

  use serde::{Deserialize, Serialize};

- use adblock::Engine;
  use adblock::blocker::{Blocker, BlockerOptions};
  use adblock::request::Request;
  use adblock::resources::ResourceStorage;
  use adblock::url_parser::parse_url;
+ use adblock::{Engine, Serialize as _};

  #[path = "../tests/test_utils.rs"]
  mod test_utils;
@@ -36,7 +36,7 @@ fn load_requests() -> Vec<TestRequest> {
  reqs
  }

- fn get_blocker(rules: impl IntoIterator<Item=impl AsRef<str>>) -> Blocker {
+ fn get_blocker(rules: impl IntoIterator<Item = impl AsRef<str>>) -> Blocker {
  let (network_filters, _) = adblock::lists::parse_filters(rules, false, Default::default());

  let blocker_options = BlockerOptions {
@@ -61,7 +61,11 @@ fn bench_rule_matching(engine: &Engine, requests: &Vec<TestRequest>) -> (u32, u3
  (matches, passes)
  }

- fn bench_matching_only(blocker: &Blocker, resources: &ResourceStorage, requests: &Vec<Request>) -> (u32, u32) {
+ fn bench_matching_only(
+ blocker: &Blocker,
+ resources: &ResourceStorage,
+ requests: &Vec<Request>,
+ ) -> (u32, u32) {
  let mut matches = 0;
  let mut passes = 0;
  requests.iter().for_each(|parsed| {
@@ -78,10 +82,7 @@ fn bench_matching_only(blocker: &Blocker, resources: &ResourceStorage, requests:

  type ParsedRequest = (String, String, String, String, bool);

- fn bench_rule_matching_browserlike(
- blocker: &Engine,
- requests: &Vec<ParsedRequest>,
- ) -> (u32, u32) {
+ fn bench_rule_matching_browserlike(blocker: &Engine, requests: &Vec<ParsedRequest>) -> (u32, u32) {
  let mut matches = 0;
  let mut passes = 0;
  requests.iter().for_each(
@@ -141,9 +142,7 @@ fn rule_match(c: &mut Criterion) {
  fn rule_match_parsed_el(c: &mut Criterion) {
  let mut group = c.benchmark_group("rule-match-parsed");

- let rules = rules_from_lists(&[
- "data/easylist.to/easylist/easylist.txt",
- ]);
+ let rules = rules_from_lists(&["data/easylist.to/easylist/easylist.txt"]);
  let requests = load_requests();
  let requests_parsed: Vec<_> = requests
  .into_iter()
@@ -221,9 +220,7 @@ fn serialization(c: &mut Criterion) {
  b.iter(|| assert!(engine.serialize_raw().unwrap().len() > 0))
  });
  group.bench_function("el", move |b| {
- let full_rules = rules_from_lists(&[
- "data/easylist.to/easylist/easylist.txt",
- ]);
+ let full_rules = rules_from_lists(&["data/easylist.to/easylist/easylist.txt"]);

  let engine = Engine::from_rules(full_rules, Default::default());
  b.iter(|| assert!(engine.serialize_raw().unwrap().len() > 0))
@@ -258,9 +255,7 @@ fn deserialization(c: &mut Criterion) {
  })
  });
  group.bench_function("el", move |b| {
- let full_rules = rules_from_lists(&[
- "data/easylist.to/easylist/easylist.txt",
- ]);
+ let full_rules = rules_from_lists(&["data/easylist.to/easylist/easylist.txt"]);

  let engine = Engine::from_rules(full_rules, Default::default());
  let serialized = engine.serialize_raw().unwrap();
@@ -294,9 +289,7 @@ fn rule_match_browserlike_comparable(c: &mut Criterion) {
  group.throughput(Throughput::Elements(requests_len));
  group.sample_size(20);

- fn requests_parsed(
- requests: &[TestRequest],
- ) -> Vec<(String, String, String, String, bool)> {
+ fn requests_parsed(requests: &[TestRequest]) -> Vec<(String, String, String, String, bool)> {
  requests
  .iter()
  .map(|r| {
@@ -354,10 +347,10 @@ fn rule_match_browserlike_comparable(c: &mut Criterion) {
  b.iter(|| bench_rule_matching_browserlike(&engine, &requests))
  });
  group.bench_function("brave-list", |b| {
- let rules = rules_from_lists(&["data/brave/brave-main-list.txt"]);
- let engine = Engine::from_rules_parametrised(rules, Default::default(), false, true);
- b.iter(|| bench_rule_matching_browserlike(&engine, &requests))
- });
+ let rules = rules_from_lists(&["data/brave/brave-main-list.txt"]);
+ let engine = Engine::from_rules_parametrised(rules, Default::default(), false, true);
+ b.iter(|| bench_rule_matching_browserlike(&engine, &requests))
+ });

  group.finish();
  }
@@ -376,21 +369,20 @@ fn rule_match_first_request(c: &mut Criterion) {
  )];

  group.bench_function("brave-list", |b| {
- b.iter_custom(
- |iters| {
- let mut total_time = std::time::Duration::ZERO;
- for _ in 0..iters {
- let rules = rules_from_lists(&["data/brave/brave-main-list.txt"]);
- let engine = Engine::from_rules_parametrised(rules, Default::default(), false, true);
-
- // Measure only the matching time, skip setup and destruction
- let start_time = std::time::Instant::now();
- bench_rule_matching_browserlike(&engine, &requests);
- total_time += start_time.elapsed();
- }
- total_time
+ b.iter_custom(|iters| {
+ let mut total_time = std::time::Duration::ZERO;
+ for _ in 0..iters {
+ let rules = rules_from_lists(&["data/brave/brave-main-list.txt"]);
+ let engine =
+ Engine::from_rules_parametrised(rules, Default::default(), false, true);
+
+ // Measure only the matching time, skip setup and destruction
+ let start_time = std::time::Instant::now();
+ bench_rule_matching_browserlike(&engine, &requests);
+ total_time += start_time.elapsed();
  }
- )
+ total_time
+ })
  });

  group.finish();
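
The `rule_match_first_request` benchmark above rebuilds the engine on every iteration but starts the clock only around the matching call, so setup and teardown stay out of the reported time. The same pattern can be sketched with the standard library alone; the helper `time_measured_only` and its parameters are hypothetical names for illustration, not part of the PR or of criterion.

```rust
use std::time::{Duration, Instant};

/// Runs `setup` and then `measured` once per iteration, accumulating only
/// the time spent inside `measured`. (Hypothetical helper, for illustration.)
pub fn time_measured_only<S, T>(
    iters: u64,
    mut setup: impl FnMut() -> S,
    mut measured: impl FnMut(&S) -> T,
) -> Duration {
    let mut total = Duration::ZERO;
    for _ in 0..iters {
        // Setup cost (e.g. building an engine) is deliberately not timed.
        let state = setup();

        let start = Instant::now();
        let output = measured(&state);
        total += start.elapsed();

        // Keep the result observable so the work is not optimized away.
        std::hint::black_box(output);
        // `state` is dropped here, after the clock has stopped.
    }
    total
}
```

A closure handed to criterion's `iter_custom` has to return a total `Duration` for the requested number of iterations, which is the role `total` plays in this sketch.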
32 changes: 16 additions & 16 deletions benches/bench_memory.rs
@@ -4,12 +4,12 @@
   * You can obtain one at https://mozilla.org/MPL/2.0/. */

  use criterion::*;
+ use serde::{Deserialize, Serialize};
  use std::alloc::{GlobalAlloc, Layout, System};
  use std::sync::atomic::{AtomicUsize, Ordering};
- use serde::{Deserialize, Serialize};

- use adblock::Engine;
  use adblock::request::Request;
+ use adblock::Engine;

  #[path = "../tests/test_utils.rs"]
  mod test_utils;
@@ -110,15 +110,15 @@ fn bench_memory_usage(c: &mut Criterion) {
  let mut result = 0;
  b.iter_custom(|iters| {
  for _ in 0..iters {
- ALLOCATOR.reset();
- let rules = rules_from_lists(&["data/brave/brave-main-list.txt"]);
- let engine = Engine::from_rules(rules, Default::default());
+ ALLOCATOR.reset();
+ let rules = rules_from_lists(&["data/brave/brave-main-list.txt"]);
+ let engine = Engine::from_rules(rules, Default::default());

- noise += 1; // add some noise to make criterion happy
- result += ALLOCATOR.current_usage() + noise;
+ noise += 1; // add some noise to make criterion happy
+ result += ALLOCATOR.current_usage() + noise;

- // Prevent engine from being optimized
- criterion::black_box(&engine);
+ // Prevent engine from being optimized
+ criterion::black_box(&engine);
  }

  // Return the memory usage as a Duration
@@ -134,15 +134,15 @@ fn bench_memory_usage(c: &mut Criterion) {
  let rules = rules_from_lists(&["data/brave/brave-main-list.txt"]);
  let engine = Engine::from_rules(rules, Default::default());

- for request in first_1000_requests.clone() {
- criterion::black_box(engine.check_network_request(&request.into()));
- }
+ for request in first_1000_requests.clone() {
+ criterion::black_box(engine.check_network_request(&request.into()));
+ }

- noise += 1; // add some noise to make criterion happy
- result += ALLOCATOR.current_usage() + noise;
+ noise += 1; // add some noise to make criterion happy
+ result += ALLOCATOR.current_usage() + noise;

- // Prevent engine from being optimized
- criterion::black_box(&engine);
+ // Prevent engine from being optimized
+ criterion::black_box(&engine);
  }

  // Return the memory usage as a Duration
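
The memory benchmark calls `ALLOCATOR.reset()` and `ALLOCATOR.current_usage()`, and the file imports `GlobalAlloc`, `Layout`, `System`, and `AtomicUsize`, but the allocator's definition falls outside the hunks shown. A minimal sketch of a byte-counting allocator with those two methods follows; the type name `CountingAllocator` and the field `live_bytes` are assumptions, and the real implementation in the repository may differ.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Wraps the system allocator and tracks the number of live bytes.
/// (Type and field names are assumptions, not taken from the PR.)
pub struct CountingAllocator {
    live_bytes: AtomicUsize,
}

impl CountingAllocator {
    pub const fn new() -> Self {
        Self {
            live_bytes: AtomicUsize::new(0),
        }
    }

    /// Bytes currently allocated through this allocator.
    pub fn current_usage(&self) -> usize {
        self.live_bytes.load(Ordering::SeqCst)
    }

    /// Zeroes the counter, e.g. at the start of a benchmark iteration.
    pub fn reset(&self) {
        self.live_bytes.store(0, Ordering::SeqCst);
    }
}

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            // Count only allocations that actually succeeded.
            self.live_bytes.fetch_add(layout.size(), Ordering::SeqCst);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        self.live_bytes.fetch_sub(layout.size(), Ordering::SeqCst);
    }
}
```

In a benchmark binary such a type would typically be installed with `#[global_allocator] static ALLOCATOR: CountingAllocator = CountingAllocator::new();`. One caveat of this simple counter: calling `reset()` while earlier allocations are still live lets their later frees push the count below zero, where `fetch_sub` wraps around.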