-
Notifications
You must be signed in to change notification settings - Fork 300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
svm: allow conflicting transactions in entries #3453
svm: allow conflicting transactions in entries #3453
Conversation
e5f77c3
to
79e6c3c
Compare
f9d86a9
to
1bf6a38
Compare
if let Some(program) = (use_program_cache && is_invisible_read) | ||
.then_some(()) | ||
.and_then(|_| self.program_cache.find(account_key)) | ||
{ | ||
// Optimization to skip loading of accounts which are only used as | ||
// programs in top-level instructions and not passed as instruction accounts. | ||
Some(LoadedTransactionAccount { | ||
return Some(LoadedTransactionAccount { | ||
loaded_size: program.account_size, | ||
account: account_shared_data_from_program(account_key, &self.program_accounts) | ||
.ok()?, | ||
rent_collected: 0, | ||
}) | ||
}); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i made this a separate block because once disable_account_loader_special_case
activates it is simply deleted. we will also be able to remove program_cache
and program_accounts
from AccountLoader
and they become free-floating variables in transaction_processor.rs
again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also an idea i have been toying with after we do that is to simply implement TransactionProcessingCallback
for AccountLoader
itself so we can pass it to program cache setup, which will happen after account loading. this depends on how feasible it is to restrict callbacks
to a concrete type (unless we just make two traits)
svm/src/account_loader.rs
Outdated
// If lamports is 0, a previous transaction deallocated this account. | ||
// Return None without inspecting, so it can be recreated. | ||
if cache_item.account.lamports() == 0 { | ||
None | ||
} else { | ||
Some(cache_item.account.clone()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while i was getting this pr ready, i realized we dont need to replace zero lamport accounts in the account cache with AccountSharedData::default()
at all, so that saves us some Arc::new()
calls. we just need something in there that stops us from going to accounts-db, and we can replace it when reading. account_cache
is private and load_account()
is the only function that reads from it, so we dont need to handle them special anywhere else
svm/src/account_loader.rs
Outdated
self.account_cache.insert( | ||
*account_key, | ||
AccountCacheItem { | ||
account: account.clone(), | ||
inspected_as_writable: is_writable, | ||
}, | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i left the cache insert unguarded by disable_account_loader_special_case
which means it will be used by 2.2 the moment this pr lands. this is safe (@jstarry should confirm):
- there is no change to pre-simd83 consensus if
disable_account_loader_special_case
is not active, because the account cache comes after the program cache, so behavior is identical to if these accounts came from accounts-db disable_account_loader_special_case
is in 2.0, so it is almost guaranteed to activate before the simd83 consensus gate which will relax account locks. but if, for some startling reason, simd83 consensus activates first, it is still fine. there is an edge case where the program cache would shadow the account cache if you close and reopen a loaderv3 buffer. but this only applies if the account is loaded invisible, so it only affects transaction data size, and this is fine because simd83 consensus necessarily can only be activated by its own feature gate
svm/src/account_loader.rs
Outdated
if let Some(slot_history) = | ||
account_overrides.and_then(|overrides| overrides.get(&slot_history::id())) | ||
{ | ||
// We set `inspected_as_writable` to `true` because this never needs to be inspected. | ||
account_cache.insert( | ||
slot_history::id(), | ||
AccountCacheItem { | ||
account: slot_history.clone(), | ||
inspected_as_writable: true, | ||
}, | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after this pr lands, we can contemplate a more restricted interface for passing only this account through. maybe slot_history: Option<AccountSharedData>
on the transaction processing environment
bit of an aside, but the sol_get_sysvar
syscall we added in 2.0 for core bpf program conversion does not support SlotHistory
by design, we only needed StakeHistory
and SlotHashes
. so the good news is it doesnt introduce a mechanism to bypass this override
the possibly better news is... if neither me nor joe needed to add SlotHistory
, do any important bpf programs or builtins actually use it? i have no idea. maybe we dont need to let programs see this sysvar at all if its only used by the validator itself. if so, we could add some #[deprecated]
where needed and specialcase it like the instructions sysvar in load_transaction_account
to return an account with no data (would require a feature gate, perhaps we could avoid a simd)
account_data.set_rent_epoch(u64::MAX); | ||
account_data.set_lamports(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the test account is read-only, but this is now necessary because svm would put it in the account cache the first time it loads it in an entry and then assume it had been deallocated the second time it loads it. the svm behavior is correct, because there should be no way to bring an account to zero lamports without it being writable, therefore we should never be able to see one in a read-only context in the first place. our builtins and sysvars have dust lamports
svm/src/account_loader.rs
Outdated
slot_history::id(), | ||
AccountCacheItem { | ||
account: slot_history.clone(), | ||
inspected_as_writable: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought there was a discussion on previous PR about this inspected_as_writable
stuff.
The function that is called to inspect accounts already has guards against doing it twice.
Is this still necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we never got confirmation on that. ill check with brooks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alright i have clarity on this now, the thing i was missing is that the idea svm has of "what things to inspect" and the idea bank has of "what results of inspection do i actually care about" are disjoint concepts. we can get rid of the state tracking and inspect every time we get an account from accounts-db or account cache. bank has its own dedupe logic and other implementers of inspect can do whatever they want
1bf6a38
to
96ef843
Compare
removed the we can get rid of |
// Determine a capacity for the internal account cache. This | ||
// over-allocates but avoids ever reallocating, and spares us from | ||
// deduplicating the account keys lists. | ||
let account_keys_in_batch = sanitized_txs.iter().map(|tx| tx.account_keys().len()).sum(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid allocating every time, we can change to use thread local variable instead, wdyt.
example:
thread_local! {
static HAS_DUPLICATES_SET: RefCell<AHashSet<Pubkey>> = RefCell::new(AHashSet::with_capacity(MAX_TX_ACCOUNT_LOCKS));
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this code is just getting a reasonable upper bound on account cache size. the mechanism in your pr to prevent duplicate transactions in svm isnt required because svm has no concept of duplicate transactions. my understanding is theyre presently implicitly deduped at the account locks level by relying on the fee-payer lock and we just need to do this explicitly with message hashes instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not suggesting here to de-duplicate transactions within a batch. The purpose of this example is to point out a thread_local variable use case to avoid the allocation over-head for account cache in every batch.
Regarding the de-duplication of transactions, when conflicting batches will be allowed the account locking function won't implicitly de-duplicate such transactions. Then as you said we need to explicitly check a batch does not have any duplicate transactions using message hashes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, ty for the suggestion but we dont want to reuse the cache between batches
Just for clarification, according to the previous conversation the second PR containing the account cache was envisioned to be feature-gated. |
thats correct, the decision to feature gate svm was made because the necessary logic to first go to the account cache and then possibly fall back to the program cache was extremely brittle and had an enormous number of edge cases. but for unrelated reasons we decided to remove (via feature gate) the program cache from account loading entirely and backported it to 2.0, so we were able to simplify the logic here to be much more sound |
Sounds good, so if I understand correctly, the |
if let Some(slot_history) = | ||
account_overrides.and_then(|overrides| overrides.get(&slot_history::id())) | ||
{ | ||
account_cache.insert(slot_history::id(), slot_history.clone()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a debug assert that the size of account_overrides
is equal to 1
if sysvar is present and 0
otherwise? Or is it basically impossible now that we decreased the vis of the account overrides set_account
function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it isnt possible, the only exposed setter is set_slot_history
now (which accepts AccountSharedData
and hardcodes the Pubkey
as sysvar::slot_history::id()
. we can remove AccountOverrides
entirely now as a cleanup. im thinking ill look into it when i start on simd186 since id like to change something about simulation results, and this is related to simulation only
self.callbacks | ||
.inspect_account(account_key, AccountState::Alive(account), is_writable); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we check if lamports is 0 before saying the account is alive? Otherwise you could read a dead account into the cache and later write to it and even tho it's still dead, we say it's alive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
youre right, ill open a pr for this on monday. thankfully it doesnt affect bank because to get here we would have already had to inspect as writable (which is why this was missed; before removing AccountCacheItem
, inspecting a previously-alive now-dead account never happened at all)
edit: actually there is a slight edge case of taking a dead account read-only, leaving it dead, and making it alive, but this is only possible post-simd83 and the fix is the same
edit2: nevermind there actually is no way to get a bad first inspect because to bring the account to zero lamports in the first place it must necessarily be inspected as writable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm isn't there a bug post simd-83 where you read a dead account in one tx and then write to it in another? The inspect for the write tx would say it's alive when it's actually dead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, i made my first edit thinking of that, but made the second edit thinking of the reason why we didnt need to check for equality with AccountSharedData::default()
. but in this case the account doesnt need to exist
(i have a pr for this now but have been blocked by ci build issues, will add you once its green)
Problem
simd83 intends to relax entry-level constraints, namely that a transaction which takes a write lock on an account cannot be batched with any other transaction which takes a read or write lock on it
before the locking code can be changed, however, svm must be able to support such batches. presently, if one transaction were to write to an account, and a subsequent transaction tried to read from or write to that account, svm would not pass the updated account state to the second transaction; it gets them from accounts-db, which is not updated until after all transaction execution finishes
Summary of Changes
in #3404, we refactored transaction processing to perform all steps for each transaction in sequence, and added the
AccountLoader
type to enable the future addition of an intermediate account cache for changed states. this pr adds that cacheall code changes are in the first commit, and the commit is delightfully small
im still minorly displeased at how much code we have to copy from account saver
collect_accounts_to_store
because those functions also need the pre-execution transactions, the post-execution transactions, and accumulate multiple vectors of data. maybe we could make the logic inupdate_accounts_for_executed_tx
et al runtime utility functions and have account saver do two passes through everything, but i think this kind of refactor is out of scope hereall other commits are changes to tests. the vast majority of new lines of code in this pr are new integration tests, but they are basically unchanged from #3146. the only important change to the integration tests are making the framework emulate how zero lamport accounts are deallocated. we dont reuse the code on
AccountLoader
to guard against refactors introducing bugs in its own logic