Removes the storage recycler #118

Merged
merged 1 commit into anza-xyz:master on Mar 11, 2024

Conversation

@brooksprumo commented Mar 6, 2024

Problem

The storage recycler is no longer necessary. It is the cause of this fastboot issue [1], and prevents this fastboot usage [2].

Summary of Changes

Remove the storage recycler

More information

The PR that added the recycler [3] mentions two motivations:

  • file/mmap creation can be expensive - 100s of ms
  • mmap deletion causes a flush to disk, which is unnecessary if the data is not needed

Now that we have the accounts write cache, only roots are written to storages. In other words, we don't have storages for slots on a minority fork that got discarded. We also don't "append" to (non-ancient) append vecs anymore. I think this means we won't ever flush a non-root, and we won't have unflushed data in a storage.
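As a rough conceptual sketch of that flushing behavior (not the actual AccountsDb API; the types and the write_storage_for_slot helper below are hypothetical stand-ins):

use std::collections::{HashMap, HashSet};

type Slot = u64;
type Account = Vec<u8>; // stand-in for real account data

// Hypothetical stand-in for creating a disk-backed storage (append vec).
fn write_storage_for_slot(_slot: Slot, _accounts: &[Account]) {}

// Sketch: with the accounts write cache, only rooted slots are flushed to
// disk-backed storages; slots discarded with a minority fork never touch disk.
fn flush_write_cache(cache: &mut HashMap<Slot, Vec<Account>>, roots: &HashSet<Slot>) {
    cache.retain(|slot, accounts| {
        if roots.contains(slot) {
            // The whole slot is written at once; nothing is appended later.
            write_storage_for_slot(*slot, accounts);
            false // flushed; remove from the cache
        } else {
            true // not rooted (yet); stays in the cache
        }
    });
}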

I posted graphs of metrics on Discord [4] showing performance w.r.t. dropping storages and creating storages. There are more metrics at the end of this description too. Tl;dr: occasionally this PR takes a while to create a new mmap file; mmap deletion seems to be faster with this PR.

I think this is an overall win. The code gets simpler. We unlock a desirable use case. And we can remove 'atomic' from the AccountStorageEntry slot and id fields. Creating an mmap did get slower once within the hour-long window, but this happens in the background, and the overall flush/clean/shrink metrics did not seem to be negatively impacted.
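For illustration, the follow-up simplification this unlocks might look roughly like this (a sketch; the exact field and type names in AccountStorageEntry may differ):

- slot: AtomicU64,
- id: AtomicAppendVecId,
+ slot: Slot,
+ id: AppendVecId,

Without the recycler, a storage's slot and id never change after creation, so plain reads can replace atomic loads.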

More Metrics

  • There are two nodes in these metrics
    • brux is this PR, aka recycler OFF
    • dev7 is master (the commit prior to this PR), aka recycler ON
  • This is on mnb, over 1 hour

The number of dropped storages, with the values summed. Both nodes appear the same.

[graph: num dropped storages (sum)]


Time to create storages, during shrink. Also the sum. (I have 'min' and 'max' graphs too, but they look similar.) Presumably this is where recycler ON should be a big win. We do see that this spikes when the recycler is OFF.

[graph: shrink - create and insert storages - time (sum)]


Time to drop storages, during shrink, summed. Basically zero.

[graph: shrink - drop storages - time (sum)]


Time to drop storages, during clean, summed. With recycler OFF, this is almost always zero. With recycler ON, this periodically takes 80-120 ms. Interesting...

[graph: clean - drop storages - time (sum)]


Looking at clean and shrink overall, the total times for each appear to be mostly the same.

[graph: clean - overall time (max)]

[graph: shrink - overall time (max)]

Footnotes

  1. https://github.com/solana-labs/solana/issues/35376

  2. https://github.com/solana-labs/solana/issues/35431

  3. https://github.com/solana-labs/solana/pull/12885

  4. https://discord.com/channels/428295358100013066/838890116386521088/1215020398647705630

@brooksprumo self-assigned this Mar 6, 2024
Comment on lines +3026 to +3028
// NOTE: The recycler has been removed. Creating this many extra storages is no longer
// necessary, but also does no harm either.
const MAX_RECYCLE_STORES: usize = 1000;
brooksprumo (Author):

The MAX_RECYCLE_STORES constant is now only used in this test. In an effort to keep this PR small, I have not updated the test here.

Comment on lines -242 to -247
let (_, drop_or_recycle_stores_measure) = measure!(
self.accounts_db()
.drop_or_recycle_stores(dead_storages, &self.accounts_db().shrink_stats),
"drop or recycle stores"
);
info!("{drop_or_recycle_stores_measure}");
brooksprumo (Author):

The drop_or_recycle_stores() function has been removed, so we call drop explicitly now.
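A sketch of what the replacement could look like, assuming the same measure! macro (the variable name is illustrative):

// Time the explicit drop of the dead storages directly.
let (_, drop_storages_measure) = measure!(drop(dead_storages), "drop storages");
info!("{drop_storages_measure}");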

@@ -3957,9 +3854,15 @@ impl AccountsDb {
);
}

self.drop_or_recycle_stores(dead_storages, stats);
brooksprumo (Author):

This function was removed, so we call drop directly now.

Comment on lines -4151 to -4178
pub fn drop_or_recycle_stores(
&self,
dead_storages: Vec<Arc<AccountStorageEntry>>,
stats: &ShrinkStats,
) {
let mut recycle_stores_write_elapsed = Measure::start("recycle_stores_write_time");
let mut recycle_stores = self.recycle_stores.write().unwrap();
recycle_stores_write_elapsed.stop();

let mut drop_storage_entries_elapsed = Measure::start("drop_storage_entries_elapsed");
if recycle_stores.entry_count() < MAX_RECYCLE_STORES {
recycle_stores.add_entries(dead_storages);
drop(recycle_stores);
} else {
self.stats
.dropped_stores
.fetch_add(dead_storages.len() as u64, Ordering::Relaxed);
drop(recycle_stores);
drop(dead_storages);
}
drop_storage_entries_elapsed.stop();
stats
.drop_storage_entries_elapsed
.fetch_add(drop_storage_entries_elapsed.as_us(), Ordering::Relaxed);
stats
.recycle_stores_write_elapsed
.fetch_add(recycle_stores_write_elapsed.as_us(), Ordering::Relaxed);
}
brooksprumo (Author):

When not recycling, this function boils down to:

  • drop the storages
  • update stats

We inline this into the one real caller above.
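For reference, a sketch of the inlined form, reusing the timing code from the removed function above (the actual call site may differ slightly):

let mut drop_storage_entries_elapsed = Measure::start("drop_storage_entries_elapsed");
// Count the dropped storages before dropping them.
self.stats
    .dropped_stores
    .fetch_add(dead_storages.len() as u64, Ordering::Relaxed);
drop(dead_storages);
drop_storage_entries_elapsed.stop();
stats
    .drop_storage_entries_elapsed
    .fetch_add(drop_storage_entries_elapsed.as_us(), Ordering::Relaxed);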

Reply:
Yah. Probably we can save 200us.

@@ -5831,32 +5612,26 @@ impl AccountsDb {
.safety_checks_elapsed
.fetch_add(safety_checks_elapsed.as_us(), Ordering::Relaxed);

let mut total_removed_storage_entries = 0;
brooksprumo (Author):

This is a duplicate of num_stored_slots_removed, so we just use that one instead for the stats.

Comment on lines +5650 to +5652
self.stats
.dropped_stores
.fetch_add(num_stored_slots_removed as u64, Ordering::Relaxed);
brooksprumo (Author):

This stat update used to live in recycle_slot_stores() (which was removed).

@codecov-commenter

Codecov Report

Attention: Patch coverage is 90.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 81.8%. Comparing base (c161351) to head (e7fbb6c).
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master     #118     +/-   ##
=========================================
- Coverage    81.8%    81.8%   -0.1%     
=========================================
  Files         838      838             
  Lines      225947   225700    -247     
=========================================
- Hits       184971   184717    -254     
- Misses      40976    40983      +7     

@jeffwashington

First-pass review looks good to me. I'm going to go over it again with a fine-tooth comb and think about this for a while.

@HaoranYi

HaoranYi commented Mar 7, 2024

Overall looks good.

[screenshot: mainnet recycle-store stats]

From the stats on mainnet, it looks like the recycle store is almost always full with 1K entries. I guess this means that not many of the entries in the store got recycled?

Do we have any other metric that directly counts how many stores are recycled?

@jeffwashington left a comment:

lgtm
super excited to reduce complexity and get rid of this whole concept. It has always attracted suspicion, and removing it will simplify the model of how to think about storages.

@brooksprumo (Author)

Do we have any other metric that directly counts how many stores are recycled?

Here are metrics from a 15 minute interval for all of the canaries running on mnb:

[screenshot: 2024-03-07 at 4:11 PM]

The spikes are likely for each run of flush/clean/shrink, every 100 or so slots.

@brooksprumo requested a review from sakridge March 7, 2024 22:19
@HaoranYi

HaoranYi commented Mar 7, 2024

Do we have any other metric that directly counts how many stores are recycled?

Here are metrics from a 15 minute interval for all of the canaries running on mnb: [..]

Ok. Looks like we don't recycle much, i.e. ~10%.

// Backing mmaps for removed storage entries explicitly dropped here outside
// of any locks
let mut drop_storage_entries_elapsed = Measure::start("drop_storage_entries_elapsed");

nit: use measure!(...) as we do in other places.
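For example, the suggested form might look like this (not applied in this PR):

// Equivalent timing using the measure! macro instead of manual start/stop.
let (_, drop_storage_entries_time) = measure!(drop(dead_storages), "drop storage entries");
stats
    .drop_storage_entries_elapsed
    .fetch_add(drop_storage_entries_time.as_us(), Ordering::Relaxed);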

brooksprumo (Author):

Yep, I could do that. I was trying to keep extraneous changes to a minimum, and since it's not a correctness issue, I'd prefer not to tackle that change in this PR.

@HaoranYi left a comment:

lgtm.

@brooksprumo (Author)

@sakridge Requesting your review to ensure there aren't any other gotchas or corner cases I've missed. Since you added the recycler, I figured your input would be valuable. I'm happy to look up any additional metrics too.

@sakridge

Now that we have the accounts write cache, only roots are written to storages. IOW, we don't have storages for slots on a minority fork that got discarded.

Ideally all slots are roots, or a very high percentage are, so I'm not sure how having the write cache only write roots should change much in terms of file creation, assuming we still create new files for each slot.

I think we do write the whole slot to a single file now instead of potentially multiple files as before, so maybe that would reduce the average number of files created.

@brooksprumo (Author)

[..] I'm not sure how having the write cache only write roots should change much in terms of file creation assuming we still create new files for each slot.

My understanding was that before the write cache was added, append vecs were created for every slot, before rooting. Yes, most would get rooted, but not all. So some append vecs would get dropped sooner because their slot was pruned due to being on a minority fork.

I think we do write the whole slot to a single file now instead of potentially multiple files as before, so maybe that would reduce the average number of files created.

Yes, now we write-all instead of append. If there is only a single account path, then I think the number of storages would be the same. With multiple account paths, then yes, before there could be multiple storages per slot. Combined with the part above, dropped slots on a minority fork could drop more storages if there were many account paths.

@brooksprumo (Author)

I've updated the PR's description and added a bunch of metrics under the "More Metrics" details dropdown.

@sakridge left a comment:

Ok, from the metrics the recycler seems to not be helping much, given the savings in complexity and the drop overhead. Thanks for adding those additional metrics.

@brooksprumo (Author)

Ok, from the metrics the recycler seems to not be helping much, given the savings in complexity and the drop overhead.

I'll keep monitoring these metrics going forward to see if there are any regressions.

@brooksprumo merged commit 5c1df15 into anza-xyz:master Mar 11, 2024
35 checks passed
@brooksprumo deleted the recycle/remove branch March 11, 2024 19:38
codebender828 pushed a commit to codebender828/agave that referenced this pull request Oct 3, 2024