
validator: Split PoH speed measurement from check #4185

Open · wants to merge 4 commits into base: master

Conversation


@steviez steviez commented Dec 19, 2024

Problem

Checking the PoH speed requires a Bank to derive the cluster hash rate as of #2447. By the time a Bank is available, many services have started up and are competing for CPU time.

Summary of Changes

So, split the PoH speed check: measure the speed very early in Validator::new(), and check the measured value later once a Bank is available.

Without the change:

[...] PoH speed check: computed hashes per second 12663376, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15573125, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15448481, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 11727495, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15355731, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15276260, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 12747872, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15319899, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15462806, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15666179, target hashes per second 10000000

With the change:

[...] PoH speed check: computed hashes per second 15842486, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15763624, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15841138, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15354805, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15850004, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15839109, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15842484, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15676336, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15790533, target hashes per second 10000000
[...] PoH speed check: computed hashes per second 15844216, target hashes per second 10000000

As you can see, there is a lot more variation without this change.
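The measure-early/check-late split described above can be sketched roughly as below. This is a stdlib-only illustration, not the validator's actual code: the real PoH loop hashes SHA-256 back to back, while this sketch substitutes `DefaultHasher` to stay dependency-free, and the target rate is a placeholder rather than a value derived from a Bank.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::Instant;

// Stand-in for the real PoH hash loop; measures hashes per second.
fn measure_poh_speed(num_hashes: u64) -> u64 {
    let start = Instant::now();
    let mut hasher = DefaultHasher::new();
    for i in 0..num_hashes {
        i.hash(&mut hasher);
        let _ = hasher.finish();
    }
    let elapsed = start.elapsed().as_secs_f64();
    (num_hashes as f64 / elapsed) as u64
}

// Runs later, once a Bank is available to supply the cluster target rate.
fn check_poh_speed(measured_hashes_per_second: u64, target_hashes_per_second: u64) -> bool {
    measured_hashes_per_second >= target_hashes_per_second
}

fn main() {
    // Measure very early in startup, before other services spawn threads.
    let measured = measure_poh_speed(1_000_000);
    // ... services start up, Bank becomes available ...
    let target = 10_000; // placeholder target for this sketch
    println!("measured {measured} hashes/second, passes: {}", check_poh_speed(measured, target));
}
```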

@steviez steviez force-pushed the val_split_poh_speed_check branch from e851f08 to be4ac16 Compare December 19, 2024 20:10
@steviez steviez force-pushed the val_split_poh_speed_check branch from be4ac16 to 3d1b6b2 Compare February 3, 2025 22:24

steviez commented Feb 3, 2025

The change shows clear improvement for the production case. However, it is problematic for test situations such as LocalCluster. In a LocalCluster test where we spin up several validators, doing 10M hashes for each of them is non-trivial. Not to mention that the test might not be built in --release mode, in which case it will crawl through 10M hashes per validator.

A few alternatives I see:

  • Try to skip the speed check in all test scenarios
    • This seems error prone / we are unlikely to catch them all
  • On top of this PR, unpack genesis and choose how many calculations we'll run depending on genesis.cluster_type
    • Do 10M for MNB/testnet/devnet; use the value from genesis otherwise
    • This will keep the split to measure early on in startup, and compare against Bank later on
  • Open to ideas ..?

I'm leaning towards the option to use 10M for our clusters and fall back to genesis for Development. I don't love the cluster-specific behavior, but I don't see a great way around it right now. I'll think on this a little more, but I am certainly open to ideas too.
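The cluster-specific fallback being leaned towards could look something like this. Type and function names here are assumed for illustration; `ClusterType` is modeled on the variants in `solana_sdk::genesis_config`, and the genesis-derived hash count is passed in as a plain parameter rather than unpacked from genesis.

```rust
// Hypothetical mirror of solana_sdk::genesis_config::ClusterType.
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum ClusterType {
    MainnetBeta,
    Testnet,
    Devnet,
    Development,
}

const POH_SPEED_CHECK_NUM_HASHES: u64 = 10_000_000;

// Fixed 10M hashes for clusters whose hash rate we know; for Development
// (tests and unknown clusters) fall back to a genesis-derived count.
fn num_hashes_to_measure(cluster_type: ClusterType, genesis_num_hashes: u64) -> u64 {
    match cluster_type {
        ClusterType::Development => genesis_num_hashes,
        _ => POH_SPEED_CHECK_NUM_HASHES,
    }
}

fn main() {
    assert_eq!(num_hashes_to_measure(ClusterType::Development, 500_000), 500_000);
    assert_eq!(num_hashes_to_measure(ClusterType::MainnetBeta, 500_000), 10_000_000);
    println!("ok");
}
```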

@steviez steviez requested review from t-nelson and bw-solana February 3, 2025 22:34
core/src/validator.rs (outdated; resolved)
// If the hash rate on these clusters changes, we might consider updating this
// constant. However, the PoH speed check compares hashes / second so this
// constant does not have to be updated
const POH_SPEED_CHECK_NUM_HASHES: u64 = 10_000_000;
Reviewer:

keeping this in sync seems a little fraught. if we don't really care about the cluster value(s) and we're out of the critical path, would it make sense to just choose a much larger number so that we get a more accurate measure?

might make sense to eventually stuff this into a separate snapshot field so we can peek it out instead of waiting for a bank

steviez (Author):

A much larger number would hypothetically get us a more accurate number at the cost of time spent performing the calculation. Ie, a node that just meets spec would spend 1s hashing; 10x'ing this number 10x's that time spent (10s, or 25 slots).

I think the main point of my comment was that since we normalize to hashes per second, keeping the number up to date isn't critical. Maybe the consideration would be that if we hit 100M hashes/second, then 10M hashes (0.1s) is too short of a period and we would see significant jitter.
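The trade-off above is just arithmetic on a fixed hash count: the measurement window shrinks as hardware speeds up. A tiny sketch (hypothetical helper name) makes the two numbers in the comment concrete:

```rust
// A fixed hash count implies a measurement window that shrinks as hardware
// gets faster: 10M hashes is 1.0s on a spec-minimum node (10M h/s), but
// only 0.1s at a hypothetical 100M h/s, where jitter starts to matter.
fn measurement_seconds(num_hashes: u64, hashes_per_second: u64) -> f64 {
    num_hashes as f64 / hashes_per_second as f64
}

fn main() {
    assert!((measurement_seconds(10_000_000, 10_000_000) - 1.0).abs() < 1e-9);
    assert!((measurement_seconds(10_000_000, 100_000_000) - 0.1).abs() < 1e-9);
    println!("ok");
}
```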

steviez (Author):

Any thoughts on the issue I called out in #4185 (comment)

Reviewer:

should we reference UPDATED_HASHES_PER_SECOND_6? https://github.com/anza-xyz/agave/blob/master/sdk/clock/src/lib.rs#L61

steviez (Author):

Ohh yeah, and that is a public constant; I'll use that directly

Reviewer:

UPDATED_HASHES_PER_SECOND_6 is a cluster configuration parameter and should by no means be public in the first place despite our rampant abuse of similar patterns. please do not add more cases that we need to fix

Any thoughts on the issue I called out in #4185 (comment)

yeah. i think local cluster is a scourge and needs to die as soon as possible. milady

A much larger number would hypothetically get us a more accurate number at the cost of time spent performing the calculation. Ie, a node that just meets spec would spend 1s hashing; 10x'ing this number 10x's that time spent (10s, or 25 slots).

I think the main point of my comment was that since we normalize to hashes per second, keeping the number up to date isn't critical. Maybe the consideration would be that if we hit 100M hashes/second, then 10M hashes (0.1s) is too short of a period and we would see significant jitter.

i guess what i was getting at is that the comment is kinda ambivalent and like a "oh yeah hey maybe keep an eye on this" whereas ideally it'd either be direct and enforced with a static assertion or we choose a value that's sufficiently large to get us a good hashes/second measurement, regardless the cluster config

steviez (Author):

ideally it'd either be direct and enforced with a static assertion

Agreed, although in our case, the "correct" value is pulled from a Bank at runtime. Nothing we can check at compile time so not much we can do here

or we choose a value that's sufficiently large to get us a good hashes/second measurement, regardless the cluster config

Anecdotally, I think the max value I've seen in Discord is something like 20M-25M hashes per second. Assuming 25M hashes/second became the new cluster hash rate, 10M hashes is still 1 slot's worth, which is what the test previously ran with. So, I think 10M hashes is a reasonable balance of not spending too much time and being somewhat future-proof (on the order of years given the recent trajectory).

@bw-solana

I'm fine just skipping the PoH check for Development cluster type. Half-assing it for these cluster types seems like we will either have to set the bar low enough that it doesn't catch anything or it will be flaky. Better to go full-ass or no-ass

I have mixed thoughts on the overall decoupling...

  • On the one hand, you get much more consistent results measuring PoH before the other services start up.
  • On the other hand, you'll have to actually complete the hashes while other services are running, so your performance in that environment is more important.
  • Isn't PoH supposed to pin a core? So why is there so much variability?


steviez commented Feb 4, 2025

  • On the other hand, you'll have to actually complete the hashes while other services are running, so your performance in that environment is more important.
  • Isn't PoH supposed to pin a core? So why is there so much variability?

PohService does run when everything is turned on AND does pin to a core; the speed check does not. Moving the speed check to happen inside PohService would butcher the API a little, since the pinning occurs in a separate thread. We'd need a channel or something to get the hash rate out.

We could potentially adjust the speed check to create a new thread, set affinity, run speed check, and immediately join on that thread. This would happen after the Bank has been unpacked, so we could use the "correct" value derived from a Bank again.

I don't believe that pinning this short-lived PoH speed check thread to the same core as PohService would have any persisting effects on PohService. That is, pinning a thread that exits early shouldn't inhibit the other thread that we pin in the same way
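The alternative floated above, spawning a dedicated thread, running the check, and joining on it immediately, could be sketched as follows. This is stdlib-only: the real validator would also set core affinity on that thread (e.g. via the core_affinity crate) before hashing, which is elided here to a comment, and `DefaultHasher` stands in for the SHA-256 loop.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::thread;
use std::time::Instant;

// Run the speed check on a short-lived dedicated thread and join on it
// immediately, so the measurement can be pinned the same way PohService is.
fn speed_check_on_dedicated_thread(num_hashes: u64) -> u64 {
    thread::spawn(move || {
        // (assumed) pin this thread to the PohService core here, e.g.
        // core_affinity::set_for_current(...), before measuring
        let start = Instant::now();
        let mut hasher = DefaultHasher::new();
        for i in 0..num_hashes {
            i.hash(&mut hasher);
            let _ = hasher.finish();
        }
        (num_hashes as f64 / start.elapsed().as_secs_f64()) as u64
    })
    .join()
    .expect("speed check thread panicked")
}

fn main() {
    let rate = speed_check_on_dedicated_thread(100_000);
    assert!(rate > 0);
    println!("measured {rate} hashes/second");
}
```

Since the thread exits before PohService starts, the pinning would have no lasting effect on the core, matching the reasoning above.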

@bw-solana

  • On the other hand, you'll have to actually complete the hashes while other services are running, so your performance in that environment is more important.
  • Isn't PoH supposed to pin a core? So why is there so much variability?

PohService does run when everything is turned on AND does pin to a core; the speed check does not. Moving the speed check to happen inside PohService would butcher the API a little, since the pinning occurs in a separate thread. We'd need a channel or something to get the hash rate out.

We could potentially adjust the speed check to create a new thread, set affinity, run speed check, and immediately join on that thread. This would happen after the Bank has been unpacked, so we could use the "correct" value derived from a Bank again.

I don't believe that pinning this short-lived PoH speed check thread to the same core as PohService would have any persisting effects on PohService. That is, pinning a thread that exits early shouldn't inhibit the other thread that we pin in the same way

I see, makes sense. In this way, the change to run before other services start probably provides a hash rate more reflective of reality (when PoH core is pinned), right?

A comment by the PoH speed check summarizing this reasoning would probably be helpful.


steviez commented Feb 4, 2025

I see, makes sense. In this way, the change to run before other services start probably provides a hash rate more reflective of reality (when PoH core is pinned), right?

Correct; there should be little to no "competition" for CPU time since there shouldn't be a glob of threads created yet.

A comment by the PoH speed check summarizing this reasoning would probably be helpful.

I had a comment with the intent to say this, but I apparently didn't finish writing it 😅 :

// Run the speed check as early as possible to

Assuming I finish the comment and given what you know now, are you still in favor of the split approach with the noop for cluster_type == Development ?

@bw-solana

Assuming I finish the comment and given what you know now, are you still in favor of the split approach with the noop for cluster_type == Development ?

Yes 👍

@steviez steviez left a comment

There is a comment about why the measure/check is split in the comments for check_poh_speed(), but I decided to also include a short note where we actually call measure() in Validator::new(). The extra visibility will hopefully prevent others from shuffling it around.

core/src/validator.rs (outdated; resolved)
@bw-solana bw-solana left a comment

Called out a few minor things


//
// Skip the check for Development clusters as this will be the type for
// tests or for clusters that we don't know what the rate will be
// without a rebuilt Bank


isn't the real reason that we don't want to pay the time cost for local cluster tests and such?
Because we end up pulling the rate from the root bank later

steviez (Author):

Kind of; a couple of things:

  • Unit tests are run in debug mode ... doing 10M hashes in debug mode on local machine, potentially for multiple tests in parallel is not a good time
  • You're correct that we pull the value from the Bank later, and because of that, we don't necessarily want to spend the time to hash 10M hashes (in hopefully a second) if the genesis config dictates we only need to be able to hit 100k.

I can look at tweaking the comment a bit more

// tests or for clusters that we don't know what the rate will be
// without a rebuilt Bank
let my_poh_hashes_per_second =
match (genesis_config.cluster_type, !config.no_poh_speed_test) {

Reviewer:

matching !no with false arm breaks my brain

steviez (Author):

Ha, I initially did the following but cleared it

let do_speed_test = !config.no_speed_test;
let my_poh_hashes_per_second = match ...

Naming a separate variable makes the negation more visible, so I can switch it back if that's agreeable.

Reviewer:

ha, I'm cool with whatever
