Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: deploy program on last slot of epoch during environment change #101

Merged
merged 9 commits into from
Mar 22, 2024

Conversation

pgarg66
Copy link

@pgarg66 pgarg66 commented Mar 6, 2024

Problem

On runtime environment change, the program deployed in the last slot of the prior epoch will use incorrect environment. This will also prevent the loading of the program in future epochs with the correct runtime environment.

Summary of Changes

The problem happens because the effective slot for the deployed program is in the next epoch, where the new environment becomes effective.
This change updates the program deployment procedure, to use the environment of the epoch where the program actually becomes effective.

Fixes #

Copy link

@alessandrod alessandrod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor nits but otherwise looks good to me.

The logic in this PR hinges on the fact that LoadedPrograms::upcoming_environments is always set correctly. That field is set here https://github.com/pgarg66/solana/blob/9044bb8f08a0ade2d1e42ca1ebd9fabf2757506c/runtime/src/bank.rs#L1379. Whenever a bank for any slot within slots_in_recompilation_phase from the epoch end is created, if upcoming_environments isn't already set, it will be set. So even in the case of skipped slots, by the time we get to tx execution for a given bank, upcoming_environments is guaranteed to be set correctly.

program-runtime/src/invoke_context.rs Outdated Show resolved Hide resolved
program-runtime/src/loaded_programs.rs Outdated Show resolved Hide resolved
program-runtime/src/loaded_programs.rs Show resolved Hide resolved
hit_max_limit: false,
}
}

/// Returns the current environments depending on the given epoch
pub fn get_environments_for_epoch(&self, epoch: Epoch) -> &ProgramRuntimeEnvironments {
if epoch != self.latest_root_epoch {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works, but I think > instead of != would make the logic clearer (although I know that in other places we use !=, so feel free to leave != if you prefer to be consistent)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to leave it as is for now. I am not 100% sure why we have != check in the other function. Just want to make sure we don't oversee some actual condition.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I chose == and != over <= and > to avoid off-by-one-errors such as writing < instead of <=. That is all, don't see a semantic reason why we need to stick with ==.

&self,
epoch: Epoch,
) -> Option<ProgramRuntimeEnvironments> {
if epoch == self.latest_root_epoch {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, I'd prefer <=

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to leave it as is for now. I am not 100% sure why we have != check in the other function. Just want to make sure we don't oversee some actual condition.

programs/bpf_loader/src/lib.rs Show resolved Hide resolved
@pgarg66 pgarg66 force-pushed the last-slot-deploy branch 4 times, most recently from 6150a73 to 10549ea Compare March 6, 2024 17:21
@codecov-commenter
Copy link

codecov-commenter commented Mar 6, 2024

Codecov Report

Attention: Patch coverage is 90.10989% with 18 lines in your changes are missing coverage. Please review.

Project coverage is 81.9%. Comparing base (4a67cd4) to head (42bb424).
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master     #101     +/-   ##
=========================================
- Coverage    81.9%    81.9%   -0.1%     
=========================================
  Files         838      838             
  Lines      226929   227066    +137     
=========================================
+ Hits       185924   186024    +100     
- Misses      41005    41042     +37     

@pgarg66 pgarg66 marked this pull request as ready for review March 6, 2024 19:55
@pgarg66 pgarg66 requested a review from alessandrod March 6, 2024 19:55
@pgarg66
Copy link
Author

pgarg66 commented Mar 6, 2024

@Lichtso FYI.

@pgarg66 pgarg66 force-pushed the last-slot-deploy branch from 239f348 to ff5995a Compare March 9, 2024 00:29
@pgarg66 pgarg66 requested a review from Lichtso March 15, 2024 14:56
@pgarg66
Copy link
Author

pgarg66 commented Mar 19, 2024

@Lichtso, @alessandrod could you help review this?

@@ -239,6 +239,24 @@ macro_rules! register_feature_gated_function {
};
}

pub fn morph_into_deployment_environment_v1(
from: Arc<BuiltinProgram<InvokeContext>>,
) -> Result<BuiltinProgram<InvokeContext>, Error> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably do this in RBPF instead and have a set_config or a copy constructor there.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to push an update to rBPF crate for this?
We also need to backport it to v1.18. So maybe leave this here for time being, and remove it from master once rBPF change is available?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove it from master in the next RBPF release.

runtime/src/bank.rs Outdated Show resolved Hide resolved
@pgarg66 pgarg66 requested a review from Lichtso March 20, 2024 16:49
@pgarg66 pgarg66 added the v1.18 label Mar 22, 2024
Copy link

mergify bot commented Mar 22, 2024

Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis.

@pgarg66 pgarg66 merged commit 91b1ee3 into anza-xyz:master Mar 22, 2024
38 checks passed
@pgarg66 pgarg66 deleted the last-slot-deploy branch March 22, 2024 14:43
mergify bot pushed a commit that referenced this pull request Mar 22, 2024
…101)

* Fix: deploy program on last slot of epoch during environment change

* solana-runtime: deploy at last epoch slot test

* disable deployment of sol_alloc_free

* Move tx-batch-constructor to its own function

* use new_from_cache

---------

Co-authored-by: Alessandro Decina <[email protected]>
(cherry picked from commit 91b1ee3)

# Conflicts:
#	ledger-tool/src/program.rs
#	runtime/src/bank/tests.rs
#	svm/src/transaction_processor.rs
pgarg66 added a commit that referenced this pull request Mar 22, 2024
…101)

* Fix: deploy program on last slot of epoch during environment change

* solana-runtime: deploy at last epoch slot test

* disable deployment of sol_alloc_free

* Move tx-batch-constructor to its own function

* use new_from_cache

---------

Co-authored-by: Alessandro Decina <[email protected]>
(cherry picked from commit 91b1ee3)
pgarg66 pushed a commit that referenced this pull request Mar 22, 2024
anwayde pushed a commit to firedancer-io/agave that referenced this pull request Jul 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants