clean scan optimization: scan disk index only for zero lamport #2879
Conversation
```diff
-    .accounts
-    .scan_pubkeys(|pubkey| self.insert_pubkey(&candidates, *pubkey));
+store.accounts.scan_index(|index| {
```
Ah, I missed this the first time. We're using scan_index() now, instead of scan_pubkeys(). I think this is fine. With AppendVecs this is basically no additional cost. With Tiered Storage, it shouldn't be much additional work to get the is-zero-lamports information (esp once tiered storage has the lamports optimizations).
Yeah, we need to read the account's meta, but it shouldn't be expensive.
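For readers skimming the diff, here's a minimal sketch of what the switch buys us; the type and field names below are simplified stand-ins inferred from the hunks quoted in this review, not the exact agave definitions:

```rust
// Simplified stand-ins for the pieces involved; not the real agave types,
// just enough shape to show the idea.
struct IndexInfoInner {
    pubkey: [u8; 32],
    lamports: u64,
}
struct IndexInfo {
    index_info: IndexInfoInner,
}

// scan_pubkeys only yielded pubkeys; scan_index also exposes the stored
// account's meta (including lamports), so clean can record up front whether
// a candidate might hold a zero-lamport entry.
fn collect_candidate(index: &IndexInfo, insert_candidate: &mut impl FnMut([u8; 32], bool)) {
    let is_zero_lamport = index.index_info.lamports == 0;
    insert_candidate(index.index_info.pubkey, is_zero_lamport);
}
```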
```rust
if candidate_info.might_contain_zero_lamport_entry {
    ScanFilter::All
} else {
    self.scan_filter_for_shrinking
```
Can you remind me where self.scan_filter_for_shrinking is set (and what the value is)?
It is set by the CLI: https://github.com/anza-xyz/agave/blob/master/validator/src/main.rs#L1274-L1285

The default is "all", so there is no impact if it isn't passed on the CLI.

Maybe we should rename the CLI arg in a future PR, since it is not just for shrinking but also for cleaning.
```rust
let is_zero_lamport = index.index_info.lamports == 0;
insert_candidate(pubkey, is_zero_lamport);
```
As we're scanning here, what happens in the case where this index_info is not zero lamport, but there is another index entry for this pubkey that is zero lamport? If we end up setting might_contain_zero_lamport_entry to false, will that prevent the later scan from looking on disk and finding the other zero-lamport entry?
No. If it was true, then it will still be true, I think. We are OR-ing is_zero_lamport:
```rust
candidates_bin
    .entry(pubkey)
    .or_default()
    .might_contain_zero_lamport_entry |= is_zero_lamport;
```
Yes, I understand that if a different entry for the same pubkey sets might_contain_zero_lamport_entry to true, then other falses will not reset the field to false.

My question is: when looking at the dirty stores, what if there's a pubkey A in dirty storage slot 100 with non-zero lamports, and pubkey A is also in non-dirty storage slot 7 with zero lamports? It looks to me like we'd only look at storage 100 and not storage 7, so the clean candidate would say "nope, no zero-lamport entries here" and then the scan filter would not look on disk for the other index entries.

I think after typing that out, I can see why it won't happen. If we have a dirty storage at slot 100, then the index entry for pubkey A must be in the in-memory index, because the slot list will have two entries in this example. So only looking in-mem is safe, and we don't need to look on disk.

Does that sound right?
Ah, I see what you mean. Yes, you are correct. If there are other instances of the account outside of the stores that we scan, then the disk index is irrelevant, because the entry will always be in memory.
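To make the scenario concrete, here is a tiny self-contained model of the example discussed above (purely illustrative, not agave code):

```rust
// Model of the scenario: pubkey A is in dirty slot 100 with non-zero lamports
// and in non-dirty slot 7 with zero lamports. Because the slot list has two
// entries, the index entry must be held in the in-memory index, so an
// in-memory-only scan still sees the zero-lamport entry at slot 7.
fn main() {
    let slot_list_for_a: Vec<(u64 /* slot */, u64 /* lamports */)> = vec![(7, 0), (100, 1)];

    // Only single-element slot lists can live solely in the disk index.
    let held_in_memory = slot_list_for_a.len() >= 2;
    let zero_lamport_visible_in_mem =
        held_in_memory && slot_list_for_a.iter().any(|(_, lamports)| *lamports == 0);

    assert!(zero_lamport_visible_in_mem);
}
```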
I think this is correct. I need to look at this again Monday. Do you have a machine monitoring this? Do we have a metric that shows us the difference in clean disk-index loads?

Yes, it has been running on dev12 (3gB7) since yesterday.
```diff
@@ -3226,7 +3249,7 @@ impl AccountsDb {
     let is_candidate_for_clean =
         max_slot_inclusive >= *slot && latest_full_snapshot_slot >= *slot;
     if is_candidate_for_clean {
-        self.insert_pubkey(&candidates, *pubkey);
+        insert_candidate(*pubkey, true);
```
why is this 'true'?
ah, because we are iterating zero_lamport_accounts_to_purge_after_full_snapshot
yes. exactly!
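For context, a rough sketch of the loop implied by the diff hunk above; the iteration shape and surrounding signature are assumptions, only the condition and the insert_candidate(*pubkey, true) call come from the diff:

```rust
use std::collections::HashSet;

// Rough sketch (assumed shape, not the exact agave code): every entry in
// zero_lamport_accounts_to_purge_after_full_snapshot is, by construction, a
// zero-lamport account, so when it qualifies as a clean candidate it is
// inserted with might_contain_zero_lamport_entry = true unconditionally.
fn add_purge_candidates(
    zero_lamport_accounts_to_purge_after_full_snapshot: &HashSet<(u64, [u8; 32])>,
    max_slot_inclusive: u64,
    latest_full_snapshot_slot: u64,
    insert_candidate: &mut impl FnMut([u8; 32], bool),
) {
    for (slot, pubkey) in zero_lamport_accounts_to_purge_after_full_snapshot {
        let is_candidate_for_clean =
            max_slot_inclusive >= *slot && latest_full_snapshot_slot >= *slot;
        if is_candidate_for_clean {
            insert_candidate(*pubkey, true);
        }
    }
}
```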
@HaoranYi can you please show a metric comparison for: …
Blue: this PR. Yes, "missing" is much higher with this PR. But on a normal machine, missing is small but not close to zero (between 5-6K per minute).
I think it is not zero because we are adding dead storages back to "dirty stores" (agave/accounts-db/src/accounts_db.rs, line 8113 at 489f483). And those dead storages are not dropped until the next clean starts (this is …)
OK, this graph shows the savings: this many disk lookups are avoided to get the same results.
lgtm
```rust
if candidate_info.might_contain_zero_lamport_entry {
    ScanFilter::All
} else {
    self.scan_filter_for_shrinking
```
Note that until we change the default of this CLI arg, we'll see no impact on clean (or shrink) from this scan filtering; the default is ScanFilter::All. So, once this goes in, we need long-term testing with the scan filter set to abnormal-only.
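In other words (a minimal sketch, assuming the CLI flag simply populates scan_filter_for_shrinking and defaults to ScanFilter::All; the non-default variant name below is a placeholder):

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum ScanFilter {
    All,
    OnlyAbnormal, // placeholder for the narrower, opt-in variant
}

// With the default, both branches collapse to ScanFilter::All, so behavior is
// unchanged until an operator opts into the narrower filter; once they do,
// non-zero-lamport candidates can skip the disk index.
fn effective_filter(
    might_contain_zero_lamport_entry: bool,
    scan_filter_for_shrinking: ScanFilter, // ScanFilter::All by default
) -> ScanFilter {
    if might_contain_zero_lamport_entry {
        ScanFilter::All
    } else {
        scan_filter_for_shrinking
    }
}
```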
This has been running fine on mainnet for a week, so let's merge it.
…xyz#2879)

* clean scan optimization
* fix rebase conflicts
* Update accounts-db/src/accounts_db.rs
  Co-authored-by: Brooks <[email protected]>
* Update accounts-db/src/accounts_db.rs
  Co-authored-by: Brooks <[email protected]>
* Update accounts-db/src/accounts_db.rs
  Co-authored-by: Brooks <[email protected]>
* Update accounts-db/src/accounts_db.rs
  Co-authored-by: Brooks <[email protected]>
* review update
* revert ZeroLamport trait for IndexInfoInner

---------

Co-authored-by: HaoranYi <[email protected]>
Co-authored-by: Brooks <[email protected]>
Problem
Generally, we don't need to scan the disk index for clean, because the disk index only contains single-ref, single-entry account index entries, which are nearly always "alive" and shouldn't be cleaned. The one exception is single-ref zero-lamport accounts, which do need to be cleaned.

Currently, we scan the disk index for every candidate. This can be optimized to scan the disk index only for candidates that might hold a zero-lamport entry.
Summary of Changes
Optimize the clean scan: track, per candidate, whether it might contain a zero-lamport entry, and only use the full (disk-reaching) index scan for those candidates; see the sketch below.
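Roughly, the change threads a might-contain-zero-lamport flag through the clean candidates; below is a simplified sketch of that bookkeeping (type and field names approximate the diff hunks quoted in the review, not the exact code):

```rust
use std::collections::HashMap;

type Pubkey = [u8; 32]; // stand-in for the real pubkey type

// Per-candidate bookkeeping collected while scanning dirty stores.
#[derive(Default)]
struct CleaningInfo {
    // True if any version of this account seen during the dirty-store scan was
    // zero lamport, or if the pubkey came from the post-full-snapshot
    // zero-lamport purge list.
    might_contain_zero_lamport_entry: bool,
}

// OR the flag in, so a later non-zero sighting can never clear it.
fn insert_candidate(
    candidates_bin: &mut HashMap<Pubkey, CleaningInfo>,
    pubkey: Pubkey,
    is_zero_lamport: bool,
) {
    candidates_bin
        .entry(pubkey)
        .or_default()
        .might_contain_zero_lamport_entry |= is_zero_lamport;
}

// Later, only flagged candidates use the full scan (ScanFilter::All), which is
// the one allowed to fall back to the disk index; everything else uses the
// operator-configured filter.
```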
Fixes #