Increase the maximum of ancient slots allowed to be not combined #3205

Merged: 7 commits, Oct 23, 2024

dmakarov

Problem

Large files containing ancient slots are expensive to update when the contained account data is being overwritten in more recent slots.

Summary of Changes

We increase the value of the tuning parameter max_ancient_slots to allow more ancient slots to remain in their individual storages instead of being combined into larger files. The smaller files are faster to update at the expense of the validator using more file handles.
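
To make the trade-off concrete, here is a minimal sketch of the kind of decision this parameter controls. It is not the accounts-db implementation: `storages_to_combine` and its arguments are hypothetical names used only for illustration, and it assumes that ancient storages beyond the limit simply become candidates for combining.

```rust
/// Hypothetical helper (not from accounts-db): given how many ancient slots
/// currently have their own storage file and the `max_ancient_slots` limit,
/// return how many of them would have to be combined into larger files.
fn storages_to_combine(ancient_slot_storages: usize, max_ancient_slots: usize) -> usize {
    // Nothing needs combining while we are at or under the limit.
    ancient_slot_storages.saturating_sub(max_ancient_slots)
}

fn main() {
    // Illustrative numbers only: a small limit forces most storages to be combined...
    println!("{}", storages_to_combine(120_000, 10_000));
    // ...while a larger limit leaves more small, cheap-to-update files on disk.
    println!("{}", storages_to_combine(120_000, 100_000));
}
```

A higher limit means fewer storages are folded into large files, at the cost of more open file handles.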

@dmakarov
Author

I'm working on adding an option to set this parameter on the validator's command line.

@jeffwashington

I suggest coupling this change with:
ANCIENT_APPEND_VEC_DEFAULT_OFFSET = 100_000

This keeps the overall 'ancient' range at 100k max, and the overall storage range at 432k, so we are at status quo relative to today regarding # slots and # file handles.

HaoranYi
HaoranYi previously approved these changes Oct 17, 2024

@HaoranYi HaoranYi left a comment

max file handle = 430K + 100K + 100K = 630K.
not too bad.
lgtm!

@jeffwashington

> max file handle = 430K + 100K + 100K = 630K. not too bad. lgtm!

I think 100k means move it into the 432k. I think -10k moves it past the end (so a max of 432k - (-10k) = 442k).

I'm proposing 432k - 100k + 100k = 432k total.
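
The arithmetic in this exchange can be checked with a short sketch. It assumes the semantics stated in the doc comment on `ANCIENT_APPEND_VEC_DEFAULT_OFFSET` (the modern/ancient boundary sits at "# slots in epoch - offset") and an epoch of 432k slots as used in the thread. `modern_range` and `max_storages` are hypothetical helper names, not functions from the codebase.

```rust
/// Number of slots in an epoch, as used in the discussion (432k).
const SLOTS_PER_EPOCH: i64 = 432_000;

/// Hypothetical helper: how many of the newest slots are treated as modern
/// (non-ancient). The boundary is "# slots in epoch - offset".
fn modern_range(offset: i64) -> i64 {
    SLOTS_PER_EPOCH - offset
}

/// Hypothetical helper: upper bound on storages (and so file handles) if
/// every modern slot and up to `max_ancient_slots` ancient slots each keep
/// their own file.
fn max_storages(offset: i64, max_ancient_slots: i64) -> i64 {
    modern_range(offset) + max_ancient_slots
}

fn main() {
    // An offset of -10k pushes the boundary past the end of the epoch.
    assert_eq!(modern_range(-10_000), 442_000);
    // An offset of +100k moves the boundary inside the epoch...
    assert_eq!(modern_range(100_000), 332_000);
    // ...so 100k ancient slots bring the total back to 432k.
    assert_eq!(max_storages(100_000, 100_000), 432_000);
}
```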

@HaoranYi

Ah. I see.

So we will update this in the next PR?

const ANCIENT_APPEND_VEC_DEFAULT_OFFSET: Option<i64> = Some(-10_000);

@jeffwashington

> Ah. I see.
>
> So we will update this in the next PR?
>
> const ANCIENT_APPEND_VEC_DEFAULT_OFFSET: Option<i64> = Some(-10_000);

I wanted that done IN this pr:

#3205 (comment)

I guess I could've been more clear.

@dmakarov
Author

> Ah. I see.
>
> So we will update this in the next PR?
>
> const ANCIENT_APPEND_VEC_DEFAULT_OFFSET: Option<i64> = Some(-10_000);

in this one.

@@ -595,7 +595,7 @@ pub struct AccountsAddRootTiming {

 /// if negative, this many accounts older than # slots in epoch are still treated as modern (ie. non-ancient).
 /// Slots older than # slots in epoch - this # are then treated as ancient and subject to packing.
-const ANCIENT_APPEND_VEC_DEFAULT_OFFSET: Option<i64> = Some(-10_000);
+const ANCIENT_APPEND_VEC_DEFAULT_OFFSET: Option<i64> = Some(-100_000);

i think this is + 100_000

Author

shouldn't it be an offset to smaller slot numbers? if the offset is positive, the logic of several tests has to be adjusted at least, maybe some logic in the algorithms as well. let me figure this out.
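
One way to see the effect of the sign on slot numbers is the rough sketch below. It assumes the boundary age is "# slots in epoch - offset", as the doc comment in the diff describes. `oldest_modern_slot` is a hypothetical name, and the root value is made up for illustration.

```rust
/// Hypothetical helper: the oldest slot still treated as modern, given the
/// newest root, the epoch length, and the signed offset.
fn oldest_modern_slot(newest_root: u64, slots_per_epoch: u64, offset: i64) -> u64 {
    // Age boundary is "# slots in epoch - offset"; clamp at zero.
    let modern_age = (slots_per_epoch as i64 - offset).max(0) as u64;
    newest_root.saturating_sub(modern_age)
}

fn main() {
    let root = 1_000_000;
    // Negative offset: the boundary moves to a smaller (older) slot number.
    assert_eq!(oldest_modern_slot(root, 432_000, -10_000), 558_000);
    // Positive offset: the boundary moves to a larger (newer) slot number,
    // so more slots count as ancient.
    assert_eq!(oldest_modern_slot(root, 432_000, 100_000), 668_000);
}
```

Under this reading, a positive offset shifts the boundary toward newer slots, which is why tests written for a negative offset would need adjusting.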

@dmakarov dmakarov marked this pull request as ready for review October 18, 2024 15:15
@jeffwashington

we definitely need to start a skipping rewrites version of this and it needs to run more than an epoch. Skipping rewrites is still opt-in and the current master behavior is unacceptable, so even if this isn't completely right, the risk is low at the moment.

@dmakarov
Author

maybe I should merge #3206, rebase this on new master, and then start a skipping rewrites validator? so that all tweaks are included...

@jeffwashington

> maybe I should merge #3206, rebase this on new master, and then start a skipping rewrites validator? so that all tweaks are included...

@dmakarov yes, I just tracked these down. 3206 is approved. I say merge it.

accounts-db/src/accounts_db.rs (resolved)
accounts-db/src/accounts_db.rs (outdated, resolved)
@dmakarov dmakarov requested a review from brooksprumo October 18, 2024 19:44
@dmakarov
Author

Is there something we're waiting on for this PR? I think I addressed all the comments.

@HaoranYi HaoranYi left a comment

lgtm

@brooksprumo brooksprumo left a comment

Looks good to me. Happy to approve once #3266 is merged.

@dmakarov
Author

> Looks good to me. Happy to approve once #3266 is merged.

it's merged.

@dmakarov dmakarov requested a review from brooksprumo October 22, 2024 19:49
@jeffwashington

> Is there something we're waiting on for this PR? I think I addressed all the comments.

My testing found that we can get stuck with this change enabled. We are not reducing alive_bytes when we encounter multi-ref storages we can't pack yet.
We also aren't shrinking every ancient storage with dead bytes, just the ones that are < 90% alive.
I think both of those need to go in, or we may find ourselves in a looming OOM situation like I did, where we fail to make progress.
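
The difference between the two shrink policies mentioned here can be sketched as follows. This is an illustration under assumptions, not the accounts-db code: `StorageInfo`, its fields, and both function names are hypothetical, and "alive" is modeled simply as alive bytes relative to capacity.

```rust
/// Hypothetical summary of one ancient storage; field names are illustrative.
struct StorageInfo {
    alive_bytes: u64,
    capacity: u64,
}

/// Behavior described as current: shrink only storages that are less than
/// 90% alive.
fn shrink_if_below_threshold(s: &StorageInfo) -> bool {
    // Integer form of alive_bytes / capacity < 0.90, avoiding float rounding.
    (s.alive_bytes as u128) * 100 < (s.capacity as u128) * 90
}

/// Behavior described as needed: shrink every storage that has any dead bytes.
fn shrink_if_any_dead(s: &StorageInfo) -> bool {
    s.alive_bytes < s.capacity
}

fn main() {
    // 95% alive: has dead bytes, but the threshold policy skips it.
    let s = StorageInfo { alive_bytes: 95, capacity: 100 };
    assert!(!shrink_if_below_threshold(&s));
    assert!(shrink_if_any_dead(&s));
}
```

With many small storages sitting between 90% and 100% alive, the threshold policy would leave their dead bytes unreclaimed, which is consistent with the lack of progress described in the comment.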

@dmakarov
Author

> > Is there something we're waiting on for this PR? I think I addressed all the comments.
>
> My testing found that we can get stuck with this change enabled. We are not reducing alive_bytes when we encounter multi-ref storages we can't pack yet. We also aren't shrinking every ancient storage with dead bytes, just the ones that are < 90% alive. I think both of those need to go in, or we may find ourselves in a looming OOM situation like I did, where we fail to make progress.

so this depends on #3251 and #3252 then.

@jeffwashington jeffwashington left a comment

lgtm. dependencies are merged

@dmakarov dmakarov merged commit c3db8a4 into anza-xyz:master Oct 23, 2024
40 checks passed
@dmakarov dmakarov deleted the shrink2 branch October 23, 2024 19:48
ray-kast pushed a commit to abklabs/agave that referenced this pull request Nov 27, 2024
…a-xyz#3205)

* Increase the maximum of ancient slots allowed to be not combined

* OFFSET

* Offset

* Fix

* Fix

* Roots

* Comment