Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add benches for process_compute_budget_instructions #2498

Conversation

tao-stones
Copy link

Problem

A bench would be useful as process_compute_budget_instructions() being refactored and changed.

Summary of Changes

  • add benches

Fixes #

Comment on lines 1 to 2
#![feature(test)]
extern crate test;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice, in my opinion, to use criterion so we don't need to use nightly AND we can get the throughput numbers in terms of TPS.

See #2262 as a recent example of using criterion for benchmark w/ throughput.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice to do it without nightly, 6f922c6

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recently learned this, but if you use c.benchmark_group you are then able to set a throughput option. This can then make the benches all give us fimiliar TPS numbers

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bench result with Throughput
bench_size_1024/No instructions
                        time:   [3.5170 µs 3.5229 µs 3.5279 µs]
                        thrpt:  [290.26 Melem/s 290.67 Melem/s 291.16 Melem/s]

bench_size_1024/No builtins
                        time:   [16.493 µs 16.504 µs 16.516 µs]
                        thrpt:  [62.002 Melem/s 62.046 Melem/s 62.087 Melem/s]
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

bench_size_1024/Only compute-budget instructions
                        time:   [27.893 µs 27.934 µs 27.978 µs]
                        thrpt:  [36.601 Melem/s 36.658 Melem/s 36.712 Melem/s]
Found 15 outliers among 100 measurements (15.00%)
  4 (4.00%) low mild
  8 (8.00%) high mild
  3 (3.00%) high severe

bench_size_1024/Only builtins
                        time:   [16.859 µs 16.868 µs 16.877 µs]
                        thrpt:  [60.676 Melem/s 60.708 Melem/s 60.739 Melem/s]
Found 17 outliers among 100 measurements (17.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild
  7 (7.00%) high mild
  7 (7.00%) high severe

bench_size_1024/Mixed instructions
                        time:   [447.94 µs 449.48 µs 451.58 µs]
                        thrpt:  [2.2676 Melem/s 2.2782 Melem/s 2.2860 Melem/s]

It'd be better if Criterion supports customizing unit label from Melem/s to M tps 🫠

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the bench_size_1024 is not really that instructive, the number of txs per iteration doesn't mattter afaict (and we really re-use the same one anyway?).

Would probably be more descriptive to have the number of instructions, as that is highly relevant to throughput, right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels a bit redundant to have to name the fn itself, then benchmark_group, then bench_function. I settled with setting benchmark_group's name to be same as fn name (as each fn bench...() is physically a group), then add instruction count as bench_function name. The report reads better.

bench_process_compute_budget_instructions_empty/0 instructions
                        time:   [3.5387 µs 3.5400 µs 3.5415 µs]
                        thrpt:  [289.15 Melem/s 289.27 Melem/s 289.37 Melem/s]
Found 15 outliers among 100 measurements (15.00%)
  7 (7.00%) high mild
  8 (8.00%) high severe

bench_process_compute_budget_instructions_no_builtins/4 dummy Instructions
                        time:   [14.143 µs 14.150 µs 14.157 µs]
                        thrpt:  [72.331 Melem/s 72.369 Melem/s 72.401 Melem/s]
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) high mild
  7 (7.00%) high severe

bench_process_compute_budget_instructions_compute_budgets/4 compute-budget instructions
                        time:   [24.473 µs 24.501 µs 24.535 µs]
                        thrpt:  [41.737 Melem/s 41.795 Melem/s 41.842 Melem/s]
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

bench_process_compute_budget_instructions_builtins/4 dummy builtins
                        time:   [14.144 µs 14.150 µs 14.157 µs]
                        thrpt:  [72.333 Melem/s 72.367 Melem/s 72.397 Melem/s]
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) high mild
  5 (5.00%) high severe

bench_process_compute_budget_instructions_mixed/1024 mixed instructions
                        time:   [2.5592 ms 2.5645 ms 2.5714 ms]
                        thrpt:  [398.23 Kelem/s 399.30 Kelem/s 400.13 Kelem/s]
Found 10 outliers among 100 measurements (10.00%)
  4 (4.00%) high mild
  6 (6.00%) high severe

@tao-stones tao-stones force-pushed the chore-bench-process-compute-budget-instructions branch from 6f922c6 to 2a6d12a Compare August 9, 2024 22:51
@tao-stones tao-stones force-pushed the chore-bench-process-compute-budget-instructions branch from 3aa5351 to 4274f6c Compare August 13, 2024 16:25
apfitzge
apfitzge previously approved these changes Aug 13, 2024
Copy link

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit comment on naming, otherwise lgtm

@tao-stones tao-stones requested a review from apfitzge August 13, 2024 23:32
@tao-stones tao-stones added the automerge automerge Merge this Pull Request automatically once CI passes label Aug 13, 2024
}

fn bench_process_compute_budget_instructions_mixed(c: &mut Criterion) {
let num_instructions = 1024;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is unrealistic though - we cannot have this many instructions given packet size. For empty ixs like the dummy ixs you have, the maximum ixs for a valid packet works out to 355 IIRC.

Fine if you don't map out the exact number, but I think we should at least be reasonably close to the actual worst-case instead of 3x larger, wdyt?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did want to stretch the limit, but yea, it's reasonable to stay close to reality.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anza-team anza-team removed the automerge automerge Merge this Pull Request automatically once CI passes label Aug 14, 2024
@anza-team
Copy link
Collaborator

😱 New commits were pushed while the automerge label was present.

@tao-stones tao-stones added the automerge automerge Merge this Pull Request automatically once CI passes label Aug 14, 2024
@tao-stones tao-stones requested a review from apfitzge August 14, 2024 14:46
Copy link

mergify bot commented Aug 14, 2024

automerge label removed due to a CI failure

@mergify mergify bot removed the automerge automerge Merge this Pull Request automatically once CI passes label Aug 14, 2024
@tao-stones tao-stones merged commit 939d7f6 into anza-xyz:master Aug 14, 2024
54 checks passed
@tao-stones tao-stones deleted the chore-bench-process-compute-budget-instructions branch August 14, 2024 17:24
ray-kast pushed a commit to abklabs/agave that referenced this pull request Nov 27, 2024
* add benches for process_compute_budget_instructions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants