Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix - FailedVerification and Closed tombstones #419

Merged
merged 8 commits into from
Apr 5, 2024

Conversation

Lichtso
Copy link

@Lichtso Lichtso commented Mar 25, 2024

Problems

  1. TransactionBatchProcessor::load_program_with_pubkey() returns LoadedProgramType::FailedVerification even for accounts which never got to the verifier (because they are empty or buffers for example) which should be marked as LoadedProgramType::Closed instead. LoadedProgramType::FailedVerification should only be used if there is a chance a program gets valid again in the recompilation phase when the feature set and environment changes.

  2. In the tests we currently check that loading non existent accounts results in LoadedProgramType::Closed entries. This however can not happen in integration because we only attempt to load accounts which have a loader as a owner, thus these account must exist. The situation is similar to when we tested LoadedProgramType::FailedVerification existing in the global cache (which can't happen either) and it led to confusion and wasted efforts.

Summary of Changes

  • Allow the transition from LoadedProgramType::Closed to loaded programs in the same slot.
  • Replace ProgramAccountLoadResult::AccountNotFound by None
  • Have TransactionBatchProcessor::load_program_accounts() only return ProgramAccountLoadResult::InvalidAccountData in other error cases
  • Have TransactionBatchProcessor::load_program_with_pubkey() return LoadedProgramType::Closed when TransactionBatchProcessor::load_program_accounts() returned ProgramAccountLoadResult::InvalidAccountData
  • Panic in replenish_program_cache() when attempting to load a nonexistent account, and add a test for that
  • Allowed ProgramCache::assign_program() transition from LoadedProgramType::Closed to loaded programs in the same slot (reverted in Fix - ProgramCache::assign_program() Closed => Loaded in same slot #673).

test_bpf_loader_upgradeable_deploy_with_max_len():

  • Replace mock_process_instruction() by a proper message send
  • Tests the invocation of an empty account before a program is deployed there
  • Tests the invocation of a buffer account before it is used to deploy a program

@Lichtso Lichtso added the v1.18 label Mar 25, 2024
Copy link

mergify bot commented Mar 25, 2024

Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis.

@Lichtso Lichtso force-pushed the fix/failed_verification_and_closed branch 3 times, most recently from f2990ee to 84b1a11 Compare March 27, 2024 15:34
@Lichtso Lichtso requested review from pgarg66 and alessandrod March 27, 2024 17:42
@Lichtso Lichtso force-pushed the fix/failed_verification_and_closed branch 4 times, most recently from 71a38b3 to ded3c25 Compare March 29, 2024 12:43
@t-nelson
Copy link

t-nelson commented Apr 2, 2024

is there something hold review here? landing this is blocking 1.18 to mb

@alessandrod
Copy link

is there something hold review here? landing this is blocking 1.18 to mb

I did a live (on slack) review of this with Alex last week. I'll do another round tomorrow.

Copy link

@alessandrod alessandrod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This is so much easier to reason about with the Option return.

I just left a few nits otherwise looks good. The major thing is we don't test the new transitions from Closed => XXX - is this what you said you want to do in a follow PR that removes TestLoaded? And if so, we're backporting that too to 1.18 right? I'm not too comfortable adding new transitions without tests.

@@ -522,7 +522,7 @@ pub fn program(ledger_path: &Path, matches: &ArgMatches<'_>) {
let mut loaded_programs =
bank.new_program_cache_for_tx_batch_for_slot(bank.slot() + DELAY_VISIBILITY_SLOT_OFFSET);
for key in cached_account_keys {
loaded_programs.replenish(key, bank.load_program(&key, false, bank.epoch()));
loaded_programs.replenish(key, bank.load_program(&key, false, bank.epoch()).unwrap());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this should be expect() instead of unwrap

FailedVerification(ProgramRuntimeEnvironment),
/// Tombstone for programs which were explicitly undeployed / closed.
/// Tombstone for accounts which are not programs but might still be owned by a loader.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment now is a bit confusing. The variant is still conceptually mostly about closed
programs, but also - temporarily - used for things like buffer programs. Something like:

/// Tombstone for programs that were explicitly closed. It's also used for accounts belonging to program loaders, that don't actually contain program code (for example buffer accounts for BpfLoaderUpgradeable programs).

Copy link
Author

@Lichtso Lichtso Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

temporarily? You know there is nothing more permanent than a "temporary" solution.

@@ -780,11 +778,15 @@ impl<FG: ForkGraph> ProgramCache<FG> {
let existing = slot_versions.get_mut(index).unwrap();
match (&existing.program, &entry.program) {
(LoadedProgramType::Builtin(_), LoadedProgramType::Builtin(_))
| (LoadedProgramType::Closed, LoadedProgramType::LegacyV0(_))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find all other transitions pretty straightforward, but this Closed => xxx
is not obvious. It would be good to add a comment that explains when it happens
(on deploy) and why

@@ -1742,6 +1746,10 @@ mod tests {
);
}

#[test_case(
LoadedProgramType::Closed,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a FIXME here that says we need to test the missing Closed => Legacy cases?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI does not like FIXME or TODO

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worst CI ever

@@ -386,82 +385,77 @@ impl<FG: ForkGraph> TransactionBatchProcessor<FG> {
pubkey: &Pubkey,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rustdocs for this method are 100% off? Could you update to say something like

Loads the program with the given pubkey.

If the account doesn't exist it returns `None`. If the account does exist, it must be a program
account (belong to one of the program loaders). Returns Some(InvalidAccountData) if the program
account is Closed, contains invalid data or any of the ancillary program accounts contain invalid
data.

Some(account) => account,
};
) -> Option<ProgramAccountLoadResult> {
let program_account = callbacks.get_account_shared_data(pubkey)?;

debug_assert!(solana_bpf_loader_program::check_loader_id(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a standalone SVM API perspective, I'm undecided on whether it makes sense to
keep this or just return None


let loaded_program = LoadedProgram::new_tombstone(0, LoadedProgramType::Closed);
assert_eq!(result, Arc::new(loaded_program));
assert!(result.is_none());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The diff in this test 😍

@Lichtso Lichtso force-pushed the fix/failed_verification_and_closed branch from 633963e to 189b852 Compare April 4, 2024 13:46
Copy link

@alessandrod alessandrod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, tests for Closed => XX to come in followup PR

@Lichtso Lichtso force-pushed the fix/failed_verification_and_closed branch from 189b852 to fd971ad Compare April 5, 2024 09:08
@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.8%. Comparing base (562254e) to head (fd971ad).

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #419   +/-   ##
=======================================
  Coverage    81.8%    81.8%           
=======================================
  Files         851      851           
  Lines      230155   230165   +10     
=======================================
+ Hits       188443   188492   +49     
+ Misses      41712    41673   -39     

@Lichtso Lichtso merged commit 55c05c5 into master Apr 5, 2024
38 checks passed
@Lichtso Lichtso deleted the fix/failed_verification_and_closed branch April 5, 2024 11:03
mergify bot pushed a commit that referenced this pull request Apr 5, 2024
* Only the verifier can cause FailedVerification, everything else is Closed

* Removes the environments parameter from load_program_accounts().

* cargo fmt

* Simplify invocation of deployed program

* Attempt to invoke a program before it is deployed

* Attempt to invoke a buffer before it is used in a deployment

* Escalates Option return value of load_program_accounts() to load_program_with_pubkey().

* Review feedback

(cherry picked from commit 55c05c5)

# Conflicts:
#	ledger-tool/src/program.rs
#	program-runtime/src/loaded_programs.rs
#	runtime/src/bank.rs
#	runtime/src/bank/tests.rs
#	svm/src/transaction_processor.rs
Lichtso added a commit that referenced this pull request Apr 5, 2024
* Only the verifier can cause FailedVerification, everything else is Closed

* Removes the environments parameter from load_program_accounts().

* cargo fmt

* Simplify invocation of deployed program

* Attempt to invoke a program before it is deployed

* Attempt to invoke a buffer before it is used in a deployment

* Escalates Option return value of load_program_accounts() to load_program_with_pubkey().

* Review feedback

(cherry picked from commit 55c05c5)
Lichtso added a commit that referenced this pull request Apr 8, 2024
* Only the verifier can cause FailedVerification, everything else is Closed

* Removes the environments parameter from load_program_accounts().

* cargo fmt

* Simplify invocation of deployed program

* Attempt to invoke a program before it is deployed

* Attempt to invoke a buffer before it is used in a deployment

* Escalates Option return value of load_program_accounts() to load_program_with_pubkey().

* Review feedback

(cherry picked from commit 55c05c5)
Lichtso added a commit that referenced this pull request Apr 9, 2024
* Only the verifier can cause FailedVerification, everything else is Closed

* Removes the environments parameter from load_program_accounts().

* cargo fmt

* Simplify invocation of deployed program

* Attempt to invoke a program before it is deployed

* Attempt to invoke a buffer before it is used in a deployment

* Escalates Option return value of load_program_accounts() to load_program_with_pubkey().

* Review feedback

(cherry picked from commit 55c05c5)
Lichtso added a commit that referenced this pull request Apr 9, 2024
* Only the verifier can cause FailedVerification, everything else is Closed

* Removes the environments parameter from load_program_accounts().

* cargo fmt

* Simplify invocation of deployed program

* Attempt to invoke a program before it is deployed

* Attempt to invoke a buffer before it is used in a deployment

* Escalates Option return value of load_program_accounts() to load_program_with_pubkey().

* Review feedback

(cherry picked from commit 55c05c5)
Lichtso added a commit that referenced this pull request Apr 11, 2024
…#419) (#605)

Fix - `FailedVerification` and `Closed` tombstones (#419)

* Only the verifier can cause FailedVerification, everything else is Closed

* Removes the environments parameter from load_program_accounts().

* cargo fmt

* Simplify invocation of deployed program

* Attempt to invoke a program before it is deployed

* Attempt to invoke a buffer before it is used in a deployment

* Escalates Option return value of load_program_accounts() to load_program_with_pubkey().

* Review feedback

(cherry picked from commit 55c05c5)

Co-authored-by: Alexander Meißner <[email protected]>
anwayde pushed a commit to firedancer-io/agave that referenced this pull request Jul 23, 2024
…anza-xyz#419) (anza-xyz#605)

Fix - `FailedVerification` and `Closed` tombstones (anza-xyz#419)

* Only the verifier can cause FailedVerification, everything else is Closed

* Removes the environments parameter from load_program_accounts().

* cargo fmt

* Simplify invocation of deployed program

* Attempt to invoke a program before it is deployed

* Attempt to invoke a buffer before it is used in a deployment

* Escalates Option return value of load_program_accounts() to load_program_with_pubkey().

* Review feedback

(cherry picked from commit 55c05c5)

Co-authored-by: Alexander Meißner <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants