Removes the storage recycler #118
Conversation
// NOTE: The recycler has been removed. Creating this many extra storages is no longer
// necessary, but also does no harm either.
const MAX_RECYCLE_STORES: usize = 1000;
The MAX_RECYCLE_STORES constant is now only used in this test. In an effort to keep this PR small, I have not updated the test here.
let (_, drop_or_recycle_stores_measure) = measure!(
    self.accounts_db()
        .drop_or_recycle_stores(dead_storages, &self.accounts_db().shrink_stats),
    "drop or recycle stores"
);
info!("{drop_or_recycle_stores_measure}");
The drop_or_recycle_stores() function has been removed, so we call drop explicitly now.
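For illustration, a minimal sketch of what that replacement might look like, reusing the measure! pattern from the removed lines above (the variable names here are assumptions, not the exact diff):

// Hypothetical sketch: the dead storages are now dropped explicitly, and the drop
// is still timed with measure! so a log line like the old one can be kept.
let (_, drop_storages_measure) = measure!(drop(dead_storages), "drop storages");
info!("{drop_storages_measure}");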
@@ -3957,9 +3854,15 @@ impl AccountsDb {
        );
    }

    self.drop_or_recycle_stores(dead_storages, stats);
This function was removed, so we call drop directly now.
pub fn drop_or_recycle_stores(
    &self,
    dead_storages: Vec<Arc<AccountStorageEntry>>,
    stats: &ShrinkStats,
) {
    let mut recycle_stores_write_elapsed = Measure::start("recycle_stores_write_time");
    let mut recycle_stores = self.recycle_stores.write().unwrap();
    recycle_stores_write_elapsed.stop();

    let mut drop_storage_entries_elapsed = Measure::start("drop_storage_entries_elapsed");
    if recycle_stores.entry_count() < MAX_RECYCLE_STORES {
        recycle_stores.add_entries(dead_storages);
        drop(recycle_stores);
    } else {
        self.stats
            .dropped_stores
            .fetch_add(dead_storages.len() as u64, Ordering::Relaxed);
        drop(recycle_stores);
        drop(dead_storages);
    }
    drop_storage_entries_elapsed.stop();
    stats
        .drop_storage_entries_elapsed
        .fetch_add(drop_storage_entries_elapsed.as_us(), Ordering::Relaxed);
    stats
        .recycle_stores_write_elapsed
        .fetch_add(recycle_stores_write_elapsed.as_us(), Ordering::Relaxed);
}
When not recycling, this function boils down to:
- drop the storages
- update stats
We inline this into the one real caller above.
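A rough sketch of that inlined form, reusing the names from the removed function body above (treat this as an assumption about the final shape, not the exact diff):

// Sketch: without the recycler there is no branch to take; just count the dropped
// stores, drop the storages outside of any locks, and record the elapsed time.
let mut drop_storage_entries_elapsed = Measure::start("drop_storage_entries_elapsed");
self.stats
    .dropped_stores
    .fetch_add(dead_storages.len() as u64, Ordering::Relaxed);
drop(dead_storages);
drop_storage_entries_elapsed.stop();
stats
    .drop_storage_entries_elapsed
    .fetch_add(drop_storage_entries_elapsed.as_us(), Ordering::Relaxed);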
@@ -5831,32 +5612,26 @@ impl AccountsDb {
        .safety_checks_elapsed
        .fetch_add(safety_checks_elapsed.as_us(), Ordering::Relaxed);

    let mut total_removed_storage_entries = 0;
This is a duplicate of num_stored_slots_removed, so we just use that one instead for the stats.
self.stats
    .dropped_stores
    .fetch_add(num_stored_slots_removed as u64, Ordering::Relaxed);
This stat update used to live in recycle_slot_stores() (which was removed).
Codecov Report

Attention: Patch coverage is …

Additional details and impacted files

@@            Coverage Diff            @@
##            master     #118    +/-  ##
=========================================
- Coverage     81.8%    81.8%    -0.1%
=========================================
  Files          838      838
  Lines       225947   225700    -247
=========================================
- Hits        184971   184717    -254
- Misses       40976    40983      +7
First-pass review looks good to me. I'm going to go over it again with a finer-toothed comb and think about this for a while.
lgtm
super excited to reduce complexity and get rid of this whole concept. It always attracts suspicion, and removing it will simplify the model of how to think about storages.
// Backing mmaps for removed storage entries explicitly dropped here outside
// of any locks
let mut drop_storage_entries_elapsed = Measure::start("drop_storage_entries_elapsed");
nit: use measure!(...) as we did in other places.
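For reference, the nit is asking for something along these lines (a sketch only; the timing label, variable names, and how the elapsed time feeds into the clean stats are assumptions):

// Sketch of the suggested form: wrap the explicit drop in measure! instead of the
// manual Measure::start / stop pair, then read the elapsed time from the measure.
let (_, drop_storage_entries_measure) =
    measure!(drop(dead_storages), "drop_storage_entries");
let drop_storage_entries_elapsed_us = drop_storage_entries_measure.as_us();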
Yep, I could do that. I was trying to keep any extraneous changes to a minimum. Since it's not a correctness issue, I'd prefer not to tackle that change in this PR though.
lgtm.
@sakridge Requesting your review to ensure there aren't any other gotchas/corner cases I've missed. Since you added the recycler, I figured your input would be valuable. I'm happy to look up any additional metrics too.
Ideally all slots are roots, or a very high percentage of them are, so I'm not sure how having the write cache only write roots should change much in terms of file creation, assuming we still create new files for each slot. I think we now write the whole slot to a single file instead of potentially multiple files as before, so maybe that would reduce the average number of files created.
My understanding was that before the write cache was added, append vecs were created for every slot, before rooting. Yes, most would get rooted, but not all. So some append vecs would get dropped sooner because their slot was pruned due to being on a minority fork.
Yes, now we write-all instead of append. If there is only a single account path, then I think the number of storages would be the same. With multiple account paths, then yes, before there could be multiple storages per slot. Combined with the part above, dropped slots on a minority fork could drop more storages if there were many account paths.
I've updated the PR's description and added a bunch of metrics under the "More Metrics" details dropdown.
Ok, from the metrics it seems the recycler isn't helping much, given the savings in complexity and the drop overhead. Thanks for adding those additional metrics.
I'll keep monitoring these metrics going forward to see if there are any regressions.
Problem
The storage recycler is no longer necessary. It is the cause of this fastboot issue [1], and prevents this fastboot usage [2].
Summary of Changes
Remove the storage recycler
More information
The PR that added the recycler [3] mentions two motivations:
Now that we have the accounts write cache, only roots are written to storages. In other words, we don't have storages for slots on a minority fork that got discarded. We also don't "append" to (non-ancient) append vecs anymore. I think this means we won't ever flush a non-root, and we won't have unflushed data in a storage.
I posted graphs of metrics over on Discord [4] that showed performance metrics w.r.t. dropping storages and creating storages. There are more metrics at the end of this description too. Tl;dr: occasionally this PR takes a while to create a new mmap file. Mmap deletion seems to be faster with this PR.
I think this is an overall win. The code gets simpler. We unlock a desirable use-case. And we can remove 'atomic' from the AccountStorageEntry slot and id fields. Creating an mmap did get slower, once out of an hour window. This is in the background, and the overall flush/clean/shrink metrics did not seem to be negatively impacted.

More Metrics
brux is this PR, aka recycler OFF
dev7 is master (the commit prior to this PR), aka recycler ON

The number of dropped storages, with the values summed. Both nodes appear the same.
Time to create storages, during shrink. Also the sum. (I have 'min' and 'max' graphs too, but they look similar.) Presumably this is when recycler ON should be a big win. We do see that this does spike when the recycler is OFF.
Time to drop storages, during shrink, summed. Basically zero.
Time to drop storages, during clean, summed. With recycler OFF, this is almost always zero. With recycler ON, this periodically takes 80-120 ms. Interesting...
Looking at clean and shrink overall, the total times for each appear to be mostly the same.

Footnotes
1. https://github.com/solana-labs/solana/issues/35376
2. https://github.com/solana-labs/solana/issues/35431
3. https://github.com/solana-labs/solana/pull/12885
4. https://discord.com/channels/428295358100013066/838890116386521088/1215020398647705630