#38: Add VM benchmarking #42
base: main
Conversation
pub fn call_seal(
    &mut self,
    name: &str,
    this_data: &Struct,
) -> Result<ExitReason, MachineError> {
    #[cfg(feature = "bench")]
    self.stopwatch.start("call_seal");
I think it would be useful to add the command name, especially since we're fixing the ability to publish multiple commands in an action.
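For illustration, that might look something like this (a sketch only; it assumes stopwatch.start accepts an arbitrary &str label, as the existing string literal suggests, and that `name` here is the command name):

// Hypothetical sketch: include the command name in the stopwatch label so
// that multiple commands published by one action show up as separate stats.
#[cfg(feature = "bench")]
self.stopwatch.start(&format!("call_seal: {name}"));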
Done. BTW, I figured out why the call_seal/call_open time measurements don't complete... It's because run collects the benchmarking results when it exits, and this happens before the stopwatch.stop() calls from call_seal/call_open. All other stopwatch calls happen inside a run, but seal/open are the exceptions.
Edit: I'm thinking about just not timing the seal/open functions. They essentially just make FFI calls (via run), which are already timed.
d851bdf to 51fabe2
@@ -0,0 +1,138 @@
#[cfg(feature = "bench")]
#[test]
Shouldn't this be #[bench]? How do you actually run this? If I do cargo bench it doesn't run because it's not #[bench], and if I do cargo test it doesn't run it because it doesn't look in the benches dir.
Looks like it will only run with cargo test if the "bench" feature flag is set. Agree it should just be #[bench].
I didn't realize you can use features like this. The test runs when the bench feature is enabled in aranya-runtime/Cargo.toml. There's probably a better way that I don't know about.
Yeah, you need the bench feature, but that doesn't make it run benches under cargo test. You need --benches for that. I figured it out - to get this to run and print stats you need to do:
cargo test --benches --features=bench -p aranya-runtime -- --nocapture
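For context, the pattern in question looks roughly like this (a sketch with a made-up test name, not the exact test in the PR):

// Compiled only when the crate is built with the `bench` feature, so a plain
// `cargo test` skips it; `--benches` is what makes cargo pick up the target
// under benches/ in the first place.
#[cfg(feature = "bench")]
#[test]
fn vm_benchmark() {
    // ... run the policy VM and print the stopwatch stats ...
}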
Apparently #[feature] is unstable.
Here's some example output from my testing:
----- Benchmark Results: -----
setup_action: init: (1 samples), best: 4µs, worst: 4µs, mean: 4µs, SD: 0ns
query.start: (1 samples), best: 2.958µs, worst: 2.958µs, mean: 2.958µs, SD: 0ns
setup_command: Insert: (9 samples), best: 2.333µs, worst: 2.959µs, mean: 2.463µs, SD: 183ns
setup_command: DoSomething: (1 samples), best: 1.875µs, worst: 1.875µs, mean: 1.875µs, SD: 0ns
publish: (11 samples), best: 875ns, worst: 3.083µs, mean: 1.277µs, SD: 631ns
extcall: (22 samples), best: 292ns, worst: 6.333µs, mean: 1.153µs, SD: 1.248µs
setup_command: Init: (1 samples), best: 875ns, worst: 875ns, mean: 875ns, SD: 0ns
query.next: (10 samples), best: 333ns, worst: 2.541µs, mean: 721ns, SD: 613ns
create: (9 samples), best: 375ns, worst: 2.25µs, mean: 643ns, SD: 572ns
validate_fact_literal: (1 samples), best: 417ns, worst: 417ns, mean: 417ns, SD: 0ns
setup_action: run: (1 samples), best: 334ns, worst: 334ns, mean: 334ns, SD: 0ns
serialize: (11 samples), best: 166ns, worst: 833ns, mean: 318ns, SD: 208ns
deserialize: (11 samples), best: 125ns, worst: 791ns, mean: 299ns, SD: 184ns
setup_action: insert: (9 samples), best: 208ns, worst: 750ns, mean: 287ns, SD: 166ns
fact.kset: (9 samples), best: 125ns, worst: 708ns, mean: 199ns, SD: 180ns
struct.get: (18 samples), best: 125ns, worst: 458ns, mean: 171ns, SD: 75ns
validate_struct_schema: (11 samples), best: 83ns, worst: 459ns, mean: 163ns, SD: 140ns
struct.new: (11 samples), best: 41ns, worst: 1.208µs, mean: 155ns, SD: 333ns
struct.set: (18 samples), best: 83ns, worst: 375ns, mean: 150ns, SD: 60ns
def: (51 samples), best: 41ns, worst: 1.375µs, mean: 145ns, SD: 245ns
return: (33 samples), best: 0ns, worst: 584ns, mean: 138ns, SD: 118ns
get: (58 samples), best: 41ns, worst: 417ns, mean: 137ns, SD: 70ns
fact.vset: (9 samples), best: 83ns, worst: 208ns, mean: 116ns, SD: 38ns
call: (22 samples), best: 41ns, worst: 750ns, mean: 114ns, SD: 141ns
fact.new: (10 samples), best: 41ns, worst: 458ns, mean: 88ns, SD: 124ns
end: (21 samples), best: 41ns, worst: 250ns, mean: 85ns, SD: 53ns
block: (21 samples), best: 0ns, worst: 125ns, mean: 61ns, SD: 37ns
branch: (10 samples), best: 0ns, worst: 167ns, mean: 50ns, SD: 40ns
meta:: (109 samples), best: 0ns, worst: 166ns, mean: 50ns, SD: 23ns
jump: (9 samples), best: 0ns, worst: 84ns, mean: 37ns, SD: 23ns
exit: (33 samples), best: 0ns, worst: 42ns, mean: 29ns, SD: 19ns
This is kind of a hodgepodge. Lots of things are being benchmarked here but it's really hard to separate out individual instruction timing versus other internal operations. Some thoughts on how to improve this:
- Improve BenchMeasurements/BenchStats so that stats of interest can be filtered, i.e. some method in BenchMeasurements that consumes self and returns a new BenchMeasurements with only the stats you want (see the sketch below). Would be really slick if you could select by category (e.g. "instructions" versus "validation" versus "setup").
- Improve output with columns to make it more readable (maybe look at table_formatter or tablestream).
- Output this data in some kind of format that can be consumed by other tools (probably CSV or JSON).
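A rough sketch of what that filter could look like; the internals here are assumptions, not the actual BenchMeasurements type in the PR:

use std::collections::BTreeMap;
use std::time::Duration;

// Hypothetical stand-in for the real type; the PR's BenchMeasurements may
// store its samples differently.
struct BenchMeasurements {
    samples: BTreeMap<String, Vec<Duration>>,
}

impl BenchMeasurements {
    // Consumes self and keeps only the entries whose name matches the
    // predicate, e.g. instruction timings vs. validation vs. setup.
    fn filter(self, pred: impl Fn(&str) -> bool) -> Self {
        Self {
            samples: self
                .samples
                .into_iter()
                .filter(|(name, _)| pred(name.as_str()))
                .collect(),
        }
    }
}

Usage would then be something like measurements.filter(|name| name.starts_with("setup_")) to pull out just the setup stats from output like the above.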
72de591 to f50cdeb
This is much nicer.
There are a lot of other things I'd like to improve, but I think this needs to be done so we can move on.
5448644 to 7dcbfec
…tate exit - and aggregates benchmarking results - before the calling functions can stop the timers.
7dcbfec to 654422d