Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix reserve minimal compute units for builtins #3799

Conversation

tao-stones
Copy link

@tao-stones tao-stones commented Nov 26, 2024

Problem

Implementing solana-foundation/solana-improvement-documents#170 by defining MAX_BUILTIN_ALLOCATION_COMPUTE_UNIT_LIMIT to 3K CUs, then use it to allocate builtin instructions' CU Meters for VM and cost tracking for leaders.

Summary of Changes

  • When calculates default tx cu limits, use MAX_BUILTIN_ALLOCATION_COMPUTE_UNIT_LIMIT per builtin instruction, including compute-budget program instructions.
  • Cost model reads cu limits from RuntimeTransaction's static_meta, replacing a localized implementation that isn't consistent with compute-budget.
  • Changes are behind Feature gate
  • updated existing tests to allow additional feature_set parameters (touch many files)

Feature Gate Issue: #2562

@tao-stones tao-stones added the feature-gate Pull Request adds or modifies a runtime feature gate label Nov 26, 2024
@tao-stones tao-stones force-pushed the fix-reserve-minimal-cus-for-builtins-less-api-change branch from e416a8f to 6bbd650 Compare November 26, 2024 18:27
@tao-stones tao-stones force-pushed the fix-reserve-minimal-cus-for-builtins-less-api-change branch 2 times, most recently from 77381a1 to 8faee07 Compare November 27, 2024 20:18
@tao-stones tao-stones marked this pull request as ready for review November 27, 2024 20:18
@tao-stones tao-stones requested a review from a team as a code owner November 27, 2024 20:18
@tao-stones tao-stones force-pushed the fix-reserve-minimal-cus-for-builtins-less-api-change branch from 8faee07 to c181481 Compare November 27, 2024 20:20
@tao-stones tao-stones force-pushed the fix-reserve-minimal-cus-for-builtins-less-api-change branch 4 times, most recently from 0e8a348 to 9192989 Compare December 1, 2024 15:27
core/src/banking_stage/immutable_deserialized_packet.rs Outdated Show resolved Hide resolved
core/src/banking_stage/immutable_deserialized_packet.rs Outdated Show resolved Hide resolved
compute_budget_instruction_details.num_non_compute_budget_instructions,
1
);
match filter.get_program_kind(instruction.program_id_index as usize, program_id) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really don't need to do all this builtin program accounting unless the compute limit isn't specified by the transaction. How about we move this all into calculate_default_compute_unit_limit and re-iterate through tx instructions there?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you agree with moving this to calculate_default_compute_unit_limit, I've implemented the pre-requisite PR (#3853) to unblock adding an instructions iterator param to sanitize_and_convert_to_compute_budget_limits

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaning towards this instead of storing a Vec. Re-iterating is likely less costly than allocation, and we can remove the re-iteration once the feature is activated.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really don't need to do all this builtin program accounting unless the compute limit isn't specified by the transaction.

This is great observation. Another possibility is to re-iterate in try_from when necessary. Something like this:

struct builtin_instruction_details {
            num_builtin_instructions: 0,
            num_non_builtin_instructions: 0,
            migrating_builtin: vec![0; MIGRATION_FEATURES_ID.len()],
}

fn try_from( ixs: iter ) -> Result<Self> {
    // original iteration: 
    // to count `num_compute_budget_instructions` and process_compute_budget_instruction if present

    // possible second iter
    if requested_compute_unit_limit.is_none() {
        // iterate again to create "builtin_instruction_details"
    }
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, based on benchmarking, the cost of allocating vec with fixed size of 3 is much lower than the cost of hashing Pubkey to lookup at BUILTIN_INSTRUCTION_COSTS. Re-iterate within try_from will reuse cached ComputeBudgetProgramIdFilter, avoiding re-hashing.

Copy link
Author

@tao-stones tao-stones Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wdyt something like this: tao-stones@86fcaeb (edited)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wdyt something like this: tao-stones@86fcaeb (edited)

That does fix the issue where we were unnecessarily calculating builtin counts even when the compute unit limit was specified by the transaction. But I think this implementation is too convoluted. I prefer the code to be simpler, especially if we are planning to backport this change.

Also, based on benchmarking, the cost of allocating vec with fixed size of 3 is much lower than the cost of hashing Pubkey to lookup at BUILTIN_INSTRUCTION_COSTS. Re-iterate within try_from will reuse cached ComputeBudgetProgramIdFilter, avoiding re-hashing.

I don't think we can purely rely on benchmarks for making this decision. The extra allocation is far more more expensive while the validator is running than in an isolated benchmark.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does fix the issue where we were unnecessarily calculating builtin counts even when the compute unit limit was specified by the transaction. But I think this implementation is too convoluted. I prefer the code to be simpler, especially if we are planning to backport this change.

One thing that didn't make sense about your commit is that we are still calculating num_non_compute_budget_instructions. Similar to the builtin_instruction_details field you added, it only needs to be calculated if the tx didn't specify an explicit compute unit limit.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a note that num_non_compute_budget_instructions is byproduct of first iteration when checking is_compute_budget_program(), so in case cu-limit is not requested and feature is not activated (eg. current behavior), it's ready to calc default cu limit by num_non_compute_budget_instructions * 200K.

if feature_set.is_active(&feature_set::reserve_minimal_cus_for_builtin_instructions::id()) {
// evaluate MigratingBuiltin
let (additional_num_non_builtin_instructions, additional_num_builtin_instructions) =
self.migrating_builtin.iter().enumerate().fold(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally would prefer that we re-iterate over instructions to count builtins rather than increasing the amount of memory used per transaction with all of the new accounting related fields you've added to ComputeBudgetInstructionDetails

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At one point, we found iteration is relatively costly, so opt to trade memory (additional counters) with single-iteration. Now with #3853 merged, and need of handling migration (now or in near future), I can see re-iter transaction's instructions here (after checked cu-limit not requested, and reserve_minimal_cus_for_builtin_instructions is active) makes sense. Looks like @apfitzge is also onboard. I'll prep a PR for that.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think it depends on how we store the additional mem. If an extra u16 or 2 per transaction - that may be worth avoiding re-iteration, but we should probably measure. I think as features get activated we may be able to remove both the additional mem or re-iteration.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#3899 is the re-iteration version of implementation, it does not support builtin migration, but I don't think adding that will be a big lift. Need to have tests updated to supply iter to sanitize_and_convert_to_compute_budget_limits(), after that I'll put a separate commit for migration to compare.

@tao-stones tao-stones force-pushed the fix-reserve-minimal-cus-for-builtins-less-api-change branch from 9192989 to 96da7c6 Compare December 2, 2024 15:10
@jstarry
Copy link

jstarry commented Dec 3, 2024

I'm now thinking it was a mistake to have this SIMD-0170 implementation PR also handle the builtin program migration case. How about we push that off for now since those migration feature gates aren't scheduled yet? Then this PR can just focus on making the cost tracker and compute limit agree on 3k cu's per builtin. The extra allocation for builtin migration tracking can be figured out later since we don't need to backport that part.

@tao-stones tao-stones force-pushed the fix-reserve-minimal-cus-for-builtins-less-api-change branch from 96da7c6 to c2f7271 Compare December 4, 2024 04:43
Copy link
Author

@tao-stones tao-stones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jstarry @apfitzge cleaned up the PR, touched a lot of file mostly due to adding feature_set or add Clone to iterator. Can you take another look?

@@ -44,10 +50,36 @@ impl ComputeBudgetInstructionDetails {
}
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only when Compute Unite Limit is not requested to re-iterate instructions to collect builtin program counts. This bit of optimization is because is_compute_budget_program() is cheap, get_program_kind() on the other hand does additional HashMap lookup by Pubkey; Saving unnecessary calls to get_program_kind() perhaps justifies additional iteration.

@tao-stones tao-stones force-pushed the fix-reserve-minimal-cus-for-builtins-less-api-change branch from c2f7271 to 7251345 Compare December 4, 2024 05:27
jstarry
jstarry previously approved these changes Dec 4, 2024
Copy link

@jstarry jstarry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with the PR as is but added some nits

solana_sdk::compute_budget::check_id(program_id)
}

#[inline]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: shouldn't we only be using inline for one-liners for the most part?


if is_builtin_program(program_id) {
ProgramKind::Builtin {
is_compute_budget: solana_sdk::compute_budget::check_id(program_id),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: technically we would know that this isn't the compute budget program already because otherwise the filter would have had a compute budget builtin entry that was populated by the first filter pass

@tao-stones tao-stones force-pushed the fix-reserve-minimal-cus-for-builtins-less-api-change branch from 36a2346 to 2c7a7cf Compare December 4, 2024 16:34
Comment on lines 761 to +763
let fee_budget_limits = FeeBudgetLimits::from(process_compute_budget_instructions(
message.program_instructions_iter(),
&bank.feature_set,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Split out #3922 - we can use cached value here.

Will wait on merge/review of that until this PR is merged.

Copy link

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - please wait on @jstarry's confirmation as well

Copy link

@pgarg66 pgarg66 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving for @anza-xyz/svm

@jstarry jstarry mentioned this pull request Dec 5, 2024
@tao-stones tao-stones merged commit 3e9af14 into anza-xyz:master Dec 5, 2024
50 checks passed
@tao-stones tao-stones added v2.0 Backport to v2.0 branch v2.1 Backport to v2.1 branch labels Dec 5, 2024
Copy link

mergify bot commented Dec 5, 2024

Backports to the stable branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule.

Copy link

mergify bot commented Dec 5, 2024

Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis.

mergify bot pushed a commit that referenced this pull request Dec 5, 2024
- Add feature gate, issue #2562;
- Implement SIMD-170;

---------

Co-authored-by: Justin Starry <[email protected]>
(cherry picked from commit 3e9af14)

# Conflicts:
#	builtins-default-costs/src/lib.rs
#	compute-budget/src/compute_budget_limits.rs
#	compute-budget/src/compute_budget_processor.rs
#	core/src/banking_stage/consumer.rs
#	core/src/banking_stage/immutable_deserialized_packet.rs
#	core/src/banking_stage/transaction_scheduler/receive_and_buffer.rs
#	cost-model/src/cost_model.rs
#	cost-model/src/transaction_cost.rs
#	programs/compute-budget-bench/benches/compute_budget.rs
#	programs/sbf/tests/programs.rs
#	runtime-transaction/benches/process_compute_budget_instructions.rs
#	runtime-transaction/src/compute_budget_instruction_details.rs
#	runtime-transaction/src/compute_budget_program_id_filter.rs
#	runtime-transaction/src/lib.rs
#	runtime-transaction/src/runtime_transaction.rs
#	runtime-transaction/src/runtime_transaction/sdk_transactions.rs
#	runtime/src/bank.rs
#	runtime/src/bank/tests.rs
#	runtime/src/prioritization_fee_cache.rs
#	sdk/src/feature_set.rs
#	svm-transaction/src/svm_message.rs
#	svm-transaction/src/svm_message/sanitized_message.rs
#	svm-transaction/src/svm_message/sanitized_transaction.rs
#	svm/src/transaction_processor.rs
#	transaction-view/src/resolved_transaction_view.rs
#	transaction-view/src/transaction_view.rs
mergify bot pushed a commit that referenced this pull request Dec 5, 2024
- Add feature gate, issue #2562;
- Implement SIMD-170;

---------

Co-authored-by: Justin Starry <[email protected]>
(cherry picked from commit 3e9af14)

# Conflicts:
#	builtins-default-costs/src/lib.rs
#	core/src/banking_stage/consumer.rs
#	core/src/banking_stage/immutable_deserialized_packet.rs
#	core/src/banking_stage/transaction_scheduler/receive_and_buffer.rs
#	cost-model/src/cost_model.rs
#	programs/compute-budget-bench/benches/compute_budget.rs
#	runtime-transaction/src/lib.rs
#	runtime-transaction/src/runtime_transaction/sdk_transactions.rs
#	runtime/src/prioritization_fee_cache.rs
@tao-stones tao-stones deleted the fix-reserve-minimal-cus-for-builtins-less-api-change branch December 5, 2024 15:06
tao-stones added a commit that referenced this pull request Dec 11, 2024
- Add feature gate, issue #2562;
- Implement SIMD-170;

---------

Co-authored-by: Justin Starry <[email protected]>
(cherry picked from commit 3e9af14)

# Conflicts:
#	builtins-default-costs/src/lib.rs
#	core/src/banking_stage/consumer.rs
#	core/src/banking_stage/immutable_deserialized_packet.rs
#	core/src/banking_stage/transaction_scheduler/receive_and_buffer.rs
#	cost-model/src/cost_model.rs
#	programs/compute-budget-bench/benches/compute_budget.rs
#	runtime-transaction/src/lib.rs
#	runtime-transaction/src/runtime_transaction/sdk_transactions.rs
#	runtime/src/prioritization_fee_cache.rs
@t-nelson
Copy link

chat, please merge the SIMDs before the implementations

tao-stones added a commit that referenced this pull request Dec 11, 2024
- Add feature gate, issue #2562;
- Implement SIMD-170;

---------

Co-authored-by: Justin Starry <[email protected]>
(cherry picked from commit 3e9af14)

# Conflicts:
#	builtins-default-costs/src/lib.rs
#	core/src/banking_stage/consumer.rs
#	core/src/banking_stage/immutable_deserialized_packet.rs
#	core/src/banking_stage/transaction_scheduler/receive_and_buffer.rs
#	cost-model/src/cost_model.rs
#	programs/compute-budget-bench/benches/compute_budget.rs
#	runtime-transaction/src/lib.rs
#	runtime-transaction/src/runtime_transaction/sdk_transactions.rs
#	runtime/src/prioritization_fee_cache.rs
tao-stones added a commit that referenced this pull request Dec 13, 2024
- Add feature gate, issue #2562;
- Implement SIMD-170;

---------

Co-authored-by: Justin Starry <[email protected]>
(cherry picked from commit 3e9af14)

# Conflicts:
#	builtins-default-costs/src/lib.rs
#	core/src/banking_stage/consumer.rs
#	core/src/banking_stage/immutable_deserialized_packet.rs
#	core/src/banking_stage/transaction_scheduler/receive_and_buffer.rs
#	cost-model/src/cost_model.rs
#	programs/compute-budget-bench/benches/compute_budget.rs
#	runtime-transaction/src/lib.rs
#	runtime-transaction/src/runtime_transaction/sdk_transactions.rs
#	runtime/src/prioritization_fee_cache.rs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-gate Pull Request adds or modifies a runtime feature gate v2.0 Backport to v2.0 branch v2.1 Backport to v2.1 branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants