Improve unified scheduler pipelining by chunking #2882
Conversation
Force-pushed from d4f3246 to 3e5555e.
@@ -3501,6 +3501,66 @@ impl Blockstore {
        Ok((entries, num_shreds, slot_meta.is_full()))
    }

    pub fn get_chunked_slot_entries_in_block(
write tests?
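To make the chunking idea concrete, here is a minimal, hypothetical sketch of how an inclusive shred-index range might be split into fixed-size chunks inside a chunked entry-load API. The function name `chunk_ranges` and the chunk size are illustrative assumptions, not the PR's actual code.

```rust
// Hypothetical helper: split an inclusive index range into fixed-size chunks,
// as a chunked entry-load API might do before fetching shreds per chunk.
fn chunk_ranges(start: u32, end: u32, chunk_size: u32) -> Vec<(u32, u32)> {
    assert!(chunk_size > 0);
    let mut chunks = Vec::new();
    let mut lo = start;
    while lo <= end {
        // Saturate so the last chunk stops exactly at `end`.
        let hi = lo.saturating_add(chunk_size - 1).min(end);
        chunks.push((lo, hi));
        if hi == end {
            break;
        }
        lo = hi + 1;
    }
    chunks
}

fn main() {
    // Ten shred indices (0..=9) split into chunks of 4.
    let chunks = chunk_ranges(0, 9, 4);
    assert_eq!(chunks, vec![(0, 3), (4, 7), (8, 9)]);
    println!("{chunks:?}");
}
```

Each `(lo, hi)` pair could then be loaded and deshredded independently, letting downstream work start before the whole slot is read.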
ledger/src/blockstore_processor.rs
best viewed in hide-whitespace mode diff. :)
ledger/src/blockstore.rs
    let keys = (start..=end).map(|index| (slot, u64::from(index)));
    let range_shreds = self
        .data_shred_cf
        .multi_get_bytes(keys)
I wonder how much overhead this incurs compared to one giant invocation of .multi_get_bytes().
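To illustrate the trade-off raised above, here is a small sketch comparing one giant batched lookup against several per-chunk lookups. The `multi_get` function and the `HashMap` store are stand-ins I invented to mimic the shape of the column family's `.multi_get_bytes(keys)` call; they are not the Blockstore API.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a column family's batched read: fetch many keys
// in one call. Mimics the shape of `.multi_get_bytes(keys)` without RocksDB.
fn multi_get(
    store: &HashMap<u64, Vec<u8>>,
    keys: impl Iterator<Item = u64>,
) -> Vec<Option<Vec<u8>>> {
    keys.map(|k| store.get(&k).cloned()).collect()
}

fn main() {
    let store: HashMap<u64, Vec<u8>> = (0u64..10).map(|i| (i, vec![i as u8])).collect();

    // One giant invocation over the whole range...
    let all = multi_get(&store, 0..10);

    // ...versus several per-chunk invocations, concatenated in order.
    let chunked: Vec<Option<Vec<u8>>> = (0..10)
        .collect::<Vec<u64>>()
        .chunks(4)
        .flat_map(|chunk| multi_get(&store, chunk.iter().copied()))
        .collect();

    // Both strategies yield identical results; the difference is purely
    // per-call overhead versus the ability to pipeline between chunks.
    assert_eq!(all, chunked);
}
```

The results are identical either way; the question is whether the per-call overhead of repeated batched reads outweighs the pipelining gained between chunks.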
Force-pushed from c631208 to 6469bc9.
After #2881, the bench crossed the 13-second wall. For reference, here are recent blockstore-processor numbers:
Force-pushed from 6469bc9 to 36dce60.
Note to self: this PR is somewhat stale. It's currently known that this chunked behavior adversely affects the entry-verification Rayon thread group.
Problem
Unlike blockstore-processor, the unified scheduler can start processing transactions asynchronously as soon as it is fed transactions. However, entries are currently fed in one large sweep, both when catching up after a full repair and in the normal replay stage (the notorious 100 ms):
agave/core/src/replay_stage.rs
Line 1149 in 34e9932
Summary of Changes
Introduce a chunked entry-load API in Blockstore and use it only for the unified scheduler. Subsequent deshredding can then overlap in time with already-submitted unified-scheduler processing, improving pipeline efficiency.
Lastly, note that this optimization applies only to block verification; block production by the unified scheduler won't benefit at all. (That said, there is a measurable gain for block verification, hence this PR.)
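The overlap described above can be sketched with a loader thread feeding entry chunks through a channel while a consumer starts processing immediately, instead of waiting for one large sweep. This is a minimal illustration of the pipelining idea only; `pipelined_load`, the channel, and the chunk size are all my invented stand-ins, not the scheduler's actual machinery.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical sketch: the "deshredding" side sends entry chunks as soon as
// each one is ready, so the "scheduler" side can begin processing before
// loading finishes, rather than after one large sweep.
fn pipelined_load(total: u64, chunk_size: u64) -> Vec<u64> {
    let (tx, rx) = mpsc::channel::<Vec<u64>>();

    // Loader thread: produce entry chunks one at a time.
    let loader = thread::spawn(move || {
        let mut start = 0;
        while start < total {
            let end = (start + chunk_size).min(total);
            // Downstream consumption can begin before loading completes.
            tx.send((start..end).collect()).unwrap();
            start = end;
        }
        // Dropping `tx` here closes the channel and ends the consumer loop.
    });

    // Consumer: process each chunk as soon as it arrives.
    let mut processed = Vec::new();
    for chunk in rx {
        processed.extend(chunk);
    }
    loader.join().unwrap();
    processed
}

fn main() {
    assert_eq!(pipelined_load(12, 4), (0u64..12).collect::<Vec<_>>());
}
```

The end result is identical to a single sweep; the gain is that consumption overlaps with loading in wall-clock time.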
before
after
maybe 4-5% gain
Extracted from: #2325