-
Notifications
You must be signed in to change notification settings - Fork 264
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix reserve minimal compute units for builtins #3799
Fix reserve minimal compute units for builtins #3799
Conversation
e416a8f
to
6bbd650
Compare
77381a1
to
8faee07
Compare
8faee07
to
c181481
Compare
0e8a348
to
9192989
Compare
compute_budget_instruction_details.num_non_compute_budget_instructions, | ||
1 | ||
); | ||
match filter.get_program_kind(instruction.program_id_index as usize, program_id) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We really don't need to do all this builtin program accounting unless the compute limit isn't specified by the transaction. How about we move this all into calculate_default_compute_unit_limit
and re-iterate through tx instructions there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you agree with moving this to calculate_default_compute_unit_limit
, I've implemented the pre-requisite PR (#3853) to unblock adding an instructions iterator param to sanitize_and_convert_to_compute_budget_limits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm leaning towards this instead of storing a Vec
. Re-iterating is likely less costly than allocation, and we can remove the re-iteration once the feature is activated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We really don't need to do all this builtin program accounting unless the compute limit isn't specified by the transaction.
This is great observation. Another possibility is to re-iterate in try_from
when necessary. Something like this:
struct builtin_instruction_details {
num_builtin_instructions: 0,
num_non_builtin_instructions: 0,
migrating_builtin: vec![0; MIGRATION_FEATURES_ID.len()],
}
fn try_from( ixs: iter ) -> Result<Self> {
// original iteration:
// to count `num_compute_budget_instructions` and process_compute_budget_instruction if present
// possible second iter
if requested_compute_unit_limit.is_none() {
// iterate again to create "builtin_instruction_details"
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, based on benchmarking, the cost of allocating vec with fixed size of 3 is much lower than the cost of hashing Pubkey to lookup at BUILTIN_INSTRUCTION_COSTS
. Re-iterate within try_from
will reuse cached ComputeBudgetProgramIdFilter
, avoiding re-hashing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wdyt something like this: tao-stones@86fcaeb (edited)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wdyt something like this: tao-stones@86fcaeb (edited)
That does fix the issue where we were unnecessarily calculating builtin counts even when the compute unit limit was specified by the transaction. But I think this implementation is too convoluted. I prefer the code to be simpler, especially if we are planning to backport this change.
Also, based on benchmarking, the cost of allocating vec with fixed size of 3 is much lower than the cost of hashing Pubkey to lookup at BUILTIN_INSTRUCTION_COSTS. Re-iterate within try_from will reuse cached ComputeBudgetProgramIdFilter, avoiding re-hashing.
I don't think we can purely rely on benchmarks for making this decision. The extra allocation is far more more expensive while the validator is running than in an isolated benchmark.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does fix the issue where we were unnecessarily calculating builtin counts even when the compute unit limit was specified by the transaction. But I think this implementation is too convoluted. I prefer the code to be simpler, especially if we are planning to backport this change.
One thing that didn't make sense about your commit is that we are still calculating num_non_compute_budget_instructions
. Similar to the builtin_instruction_details
field you added, it only needs to be calculated if the tx didn't specify an explicit compute unit limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a note that num_non_compute_budget_instructions
is byproduct of first iteration when checking is_compute_budget_program()
, so in case cu-limit is not requested and feature is not activated (eg. current behavior), it's ready to calc default cu limit by num_non_compute_budget_instructions * 200K
.
if feature_set.is_active(&feature_set::reserve_minimal_cus_for_builtin_instructions::id()) { | ||
// evaluate MigratingBuiltin | ||
let (additional_num_non_builtin_instructions, additional_num_builtin_instructions) = | ||
self.migrating_builtin.iter().enumerate().fold( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally would prefer that we re-iterate over instructions to count builtins rather than increasing the amount of memory used per transaction with all of the new accounting related fields you've added to ComputeBudgetInstructionDetails
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At one point, we found iteration is relatively costly, so opt to trade memory (additional counters) with single-iteration. Now with #3853 merged, and need of handling migration (now or in near future), I can see re-iter transaction's instructions here (after checked cu-limit not requested, and reserve_minimal_cus_for_builtin_instructions
is active) makes sense. Looks like @apfitzge is also onboard. I'll prep a PR for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think it depends on how we store the additional mem. If an extra u16 or 2 per transaction - that may be worth avoiding re-iteration, but we should probably measure. I think as features get activated we may be able to remove both the additional mem or re-iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#3899 is the re-iteration version of implementation, it does not support builtin migration, but I don't think adding that will be a big lift. Need to have tests updated to supply iter to sanitize_and_convert_to_compute_budget_limits()
, after that I'll put a separate commit for migration to compare.
9192989
to
96da7c6
Compare
I'm now thinking it was a mistake to have this SIMD-0170 implementation PR also handle the builtin program migration case. How about we push that off for now since those migration feature gates aren't scheduled yet? Then this PR can just focus on making the cost tracker and compute limit agree on 3k cu's per builtin. The extra allocation for builtin migration tracking can be figured out later since we don't need to backport that part. |
96da7c6
to
c2f7271
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -44,10 +50,36 @@ impl ComputeBudgetInstructionDetails { | |||
} | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only when Compute Unite Limit is not requested to re-iterate instructions to collect builtin program counts. This bit of optimization is because is_compute_budget_program()
is cheap, get_program_kind()
on the other hand does additional HashMap lookup by Pubkey; Saving unnecessary calls to get_program_kind()
perhaps justifies additional iteration.
c2f7271
to
7251345
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with the PR as is but added some nits
solana_sdk::compute_budget::check_id(program_id) | ||
} | ||
|
||
#[inline] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: shouldn't we only be using inline
for one-liners for the most part?
|
||
if is_builtin_program(program_id) { | ||
ProgramKind::Builtin { | ||
is_compute_budget: solana_sdk::compute_budget::check_id(program_id), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: technically we would know that this isn't the compute budget program already because otherwise the filter would have had a compute budget builtin entry that was populated by the first filter pass
- Implement SIMD-170;
Co-authored-by: Justin Starry <[email protected]>
36a2346
to
2c7a7cf
Compare
let fee_budget_limits = FeeBudgetLimits::from(process_compute_budget_instructions( | ||
message.program_instructions_iter(), | ||
&bank.feature_set, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Split out #3922 - we can use cached value here.
Will wait on merge/review of that until this PR is merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm - please wait on @jstarry's confirmation as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving for @anza-xyz/svm
Backports to the stable branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. |
Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis. |
- Add feature gate, issue #2562; - Implement SIMD-170; --------- Co-authored-by: Justin Starry <[email protected]> (cherry picked from commit 3e9af14) # Conflicts: # builtins-default-costs/src/lib.rs # compute-budget/src/compute_budget_limits.rs # compute-budget/src/compute_budget_processor.rs # core/src/banking_stage/consumer.rs # core/src/banking_stage/immutable_deserialized_packet.rs # core/src/banking_stage/transaction_scheduler/receive_and_buffer.rs # cost-model/src/cost_model.rs # cost-model/src/transaction_cost.rs # programs/compute-budget-bench/benches/compute_budget.rs # programs/sbf/tests/programs.rs # runtime-transaction/benches/process_compute_budget_instructions.rs # runtime-transaction/src/compute_budget_instruction_details.rs # runtime-transaction/src/compute_budget_program_id_filter.rs # runtime-transaction/src/lib.rs # runtime-transaction/src/runtime_transaction.rs # runtime-transaction/src/runtime_transaction/sdk_transactions.rs # runtime/src/bank.rs # runtime/src/bank/tests.rs # runtime/src/prioritization_fee_cache.rs # sdk/src/feature_set.rs # svm-transaction/src/svm_message.rs # svm-transaction/src/svm_message/sanitized_message.rs # svm-transaction/src/svm_message/sanitized_transaction.rs # svm/src/transaction_processor.rs # transaction-view/src/resolved_transaction_view.rs # transaction-view/src/transaction_view.rs
- Add feature gate, issue #2562; - Implement SIMD-170; --------- Co-authored-by: Justin Starry <[email protected]> (cherry picked from commit 3e9af14) # Conflicts: # builtins-default-costs/src/lib.rs # core/src/banking_stage/consumer.rs # core/src/banking_stage/immutable_deserialized_packet.rs # core/src/banking_stage/transaction_scheduler/receive_and_buffer.rs # cost-model/src/cost_model.rs # programs/compute-budget-bench/benches/compute_budget.rs # runtime-transaction/src/lib.rs # runtime-transaction/src/runtime_transaction/sdk_transactions.rs # runtime/src/prioritization_fee_cache.rs
- Add feature gate, issue #2562; - Implement SIMD-170; --------- Co-authored-by: Justin Starry <[email protected]> (cherry picked from commit 3e9af14) # Conflicts: # builtins-default-costs/src/lib.rs # core/src/banking_stage/consumer.rs # core/src/banking_stage/immutable_deserialized_packet.rs # core/src/banking_stage/transaction_scheduler/receive_and_buffer.rs # cost-model/src/cost_model.rs # programs/compute-budget-bench/benches/compute_budget.rs # runtime-transaction/src/lib.rs # runtime-transaction/src/runtime_transaction/sdk_transactions.rs # runtime/src/prioritization_fee_cache.rs
chat, please merge the SIMDs before the implementations |
- Add feature gate, issue #2562; - Implement SIMD-170; --------- Co-authored-by: Justin Starry <[email protected]> (cherry picked from commit 3e9af14) # Conflicts: # builtins-default-costs/src/lib.rs # core/src/banking_stage/consumer.rs # core/src/banking_stage/immutable_deserialized_packet.rs # core/src/banking_stage/transaction_scheduler/receive_and_buffer.rs # cost-model/src/cost_model.rs # programs/compute-budget-bench/benches/compute_budget.rs # runtime-transaction/src/lib.rs # runtime-transaction/src/runtime_transaction/sdk_transactions.rs # runtime/src/prioritization_fee_cache.rs
- Add feature gate, issue #2562; - Implement SIMD-170; --------- Co-authored-by: Justin Starry <[email protected]> (cherry picked from commit 3e9af14) # Conflicts: # builtins-default-costs/src/lib.rs # core/src/banking_stage/consumer.rs # core/src/banking_stage/immutable_deserialized_packet.rs # core/src/banking_stage/transaction_scheduler/receive_and_buffer.rs # cost-model/src/cost_model.rs # programs/compute-budget-bench/benches/compute_budget.rs # runtime-transaction/src/lib.rs # runtime-transaction/src/runtime_transaction/sdk_transactions.rs # runtime/src/prioritization_fee_cache.rs
Problem
Implementing solana-foundation/solana-improvement-documents#170 by defining MAX_BUILTIN_ALLOCATION_COMPUTE_UNIT_LIMIT to 3K CUs, then use it to allocate builtin instructions' CU Meters for VM and cost tracking for leaders.
Summary of Changes
Feature Gate Issue: #2562