Sets write version to 0 when storing accounts #476

brooksprumo · 2024-03-28T19:24:30Z

Problem

We no longer allow more than one append vec per slot, which obviates the need for a write version. A higher slot will always have the latest version of an account.

We do not want to store write version in the storage itself, but we cannot remove the field entirely in append vecs. And we'd like to make snapshots a bit more compressible by using 0 for all write versions.

Summary of Changes

When storing accounts, always set the write version to zero.

NOTE: In commit two we still create a write version producer for geyser that increments write version.

Subsequent PRs will remove more pieces of the write version concept.

accounts-db/src/accounts_db.rs

brooksprumo · 2024-03-28T20:04:45Z

One additional thing to consider, how will this impact geyser?

agave/geyser-plugin-interface/src/geyser_plugin_interface.rs

Lines 103 to 108 in f36a45a

    
               /// A global monotonically increasing atomic number, which can be used 
        
               /// to tell the order of the account update. For example, when an 
        
               /// account is updated in the same slot multiple times, the update 
        
               /// with higher write_version should supersede the one with lower 
        
               /// write_version. 
        
               pub write_version: u64,

codecov-commenter · 2024-03-28T21:00:37Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.8%. Comparing base (182d27f) to head (5a86917).
Report is 5 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #476   +/-   ##
=======================================
  Coverage    81.8%    81.8%           
=======================================
  Files         842      842           
  Lines      228492   228481   -11     
=======================================
- Hits       187104   187098    -6     
+ Misses      41388    41383    -5

jeffwashington · 2024-03-29T21:40:59Z

One additional thing to consider, how will this impact geyser?

@lijunwangs you helped me test a similar change against geyser a few years ago :( Can you please help again? I think if we just keep passing an increasing # to geyser as we load, geyser will be happy.

lijunwangs · 2024-03-29T22:00:35Z

One additional thing to consider, how will this impact geyser?

@lijunwangs you helped me test a similar change against geyser a few years ago :( Can you please help again? I think if we just keep passing an increasing # to geyser as we load, geyser will be happy.

Yes, I think to keep geyser interface consistent, we need to increase the write_version number on the fly. I would be happy to test this out if it works. Do we already have the change? @jeffwashington

HaoranYi · 2024-04-01T13:24:17Z

I believe that write_version, unlike account_hash, is not used in consensus. Therefore, we don't need to feature gate it.
Is that right?

brooksprumo · 2024-04-01T13:55:18Z

I believe that write_version, unlike account_hash, is not used in consensus. Therefore, we don't need to feature gate it. Is that right?

Correct, write version is not used in consensus, so no feature gate is required.

Also, writing the account hash to disk in an append vec does not impact consensus either. Obviously we still need the account hashes as part of the accounts delta hash in each block, but we don't need to persist the account hashes as we currently do.

HaoranYi · 2024-04-01T14:13:44Z

Also, writing the account hash to disk in an append vec does not impact consensus either. Obviously we still need the account hashes as part of the accounts delta hash in each block, but we don't need to persist the account hashes as we currently do.

For writing default account hash to storage, I think we need a feature gate. For the reading part (i.e. recompute if missing), we don't need feature gate. But for writing part (i.e. write default hash), we need a feature gate. The writing part can only be enabled when the majority of the nodes in the network have the code of the reading part. Otherwise, there is a risk that nodes without the reading changes, which downloaded a snapshot produced by the node that writes default hash, will crash. Is that right?

brooksprumo · 2024-04-01T14:24:54Z

For writing default account hash to storage, I think we need a feature gate. For the reading part (i.e. recompute if missing), we don't need feature gate. But for writing part (i.e. write default hash), we need a feature gate. The writing part can only be enabled when the majority of the nodes in the network have the code of the reading part. Otherwise, there is a risk that nodes without the reading changes, which downloaded a snapshot produced by the node that writes default hash, will crash. Is that right?

Ah, I see what you're getting at. I don't think a feature gate is the right solution here, since we really need some forwards/backwards compatibility between Agave versions. This will be similar to how we handle adding new fields to snapshots. Some version x.y (e.g. v2.0.0) will likely need to keep writing account hashes to append vecs, and at the same time, will stop reading account hashes and instead recompute the account hash each time it is needed. Then in version x.y+1 (e.g. v2.1.0) it will be able to stop writing account hashes to append vecs (well, really writing the default account hash value). This should allow a beta and stable version to coexist and reuse eachothers' snapshots.

brooksprumo · 2024-04-01T14:33:27Z

Yes, I think to keep geyser interface consistent, we need to increase the write_version number on the fly. I would be happy to test this out if it works.

@lijunwangs I've pushed another commit that intends to retain incrementing the write version for geyser. Can you test this PR now, or let me know how to test it? Thanks!

HaoranYi · 2024-04-01T14:33:41Z

Also, writing the account hash to disk in an append vec does not impact consensus either. Obviously we still need the account hashes as part of the accounts delta hash in each block, but we don't need to persist the account hashes as we currently do.

For writing default account hash to storage, I think we need a feature gate for the writing part. For the reading part, we don't need feature gate. But for writing part, we

For writing default account hash to storage, I think we need a feature gate. For the reading part (i.e. recompute if missing), we don't need feature gate. But for writing part (i.e. write default hash), we need a feature gate. The writing part can only be enabled when the majority of the nodes in the network have the code of the reading part. Otherwise, there is a risk that nodes without the reading changes, which downloaded a snapshot produced by the node that writes default hash, will crash. Is that right?

Ah, I see what you're getting at. I don't think a feature gate is the right solution here, since we really need some forwards/backwards compatibility between Agave versions. This will be similar to how we handle adding new fields to snapshots. Some version x.y (e.g. v2.0.0) will likely need to keep writing account hashes to append vecs, and at the same time, will stop reading account hashes and instead recompute the account hash each time it is needed. Then in version x.y+1 (e.g. v2.1.0) it will be able to stop writing account hashes to append vecs (well, really writing the default account hash value). This should allow a beta and stable version to coexist and reuse eachothers' snapshots.

Ok. That will work. I will split my PR into two. One for reading and one for writing. The reading PR can be merged now. And the witing PR will wait till one more release cycle.

brooksprumo · 2024-04-01T14:39:58Z

accounts-db/src/accounts_db.rs

+        let mut write_version_producer: Box<dyn Iterator<Item = u64>> =
+            if self.accounts_update_notifier.is_some() {
+                let mut current_version = self
+                    .write_version
+                    .fetch_add(accounts_and_meta_to_store.len() as u64, Ordering::AcqRel);
+                Box::new(std::iter::from_fn(move || {
+                    let ret = current_version;
+                    current_version += 1;
+                    Some(ret)
+                }))
+            } else {
+                Box::new(std::iter::empty())
+            };


The write_verison_producer is only used by notify_account_at_accounts_update(), below. This function only does anything when geyser is enabled (i.e. accounts_update_notifier is SOME). If geyser is disabled, then the write version producer is not used. Hence why the else case here does nothing.

And in the if case, we continue to increment write version as before, since geyser needs that behavior.

A nice fallout is that we can remove the write version producer from all function signatures now.

brooksprumo · 2024-04-01T14:42:33Z

accounts-db/src/accounts_db.rs

-                    let write_versions = (0..accounts.len())
-                        .map(|_| write_version_producer.next().unwrap())
-                        .collect::<Vec<_>>();
+                    let write_versions = vec![0; accounts.len()];


For now, we set the write versions here to a vector of zeroes. Subsequent PRs will remove the AndWriteVersion part from StorableAccountsWithHashesAndWriteVersions, which will allow removing this vector.

jeffwashington

lgtm

HaoranYi

lgtm

brooksprumo · 2024-04-02T12:36:06Z

Darn. The renames from #447 caused a merge conflict on this PR. I've rebased and force-pushed to resolve the conflict. No other changes were made.

@HaoranYi @jeffwashington Can you please re-review?

jeffwashington

lgtm

HaoranYi

lgtm.
looks like there is a conflict.

brooksprumo · 2024-04-02T14:12:08Z

looks like there is a conflict.

Ugh, again. You're right 😢

brooksprumo · 2024-04-02T14:15:08Z

I've rebased and force-pushed to resolve conflicts from #344. No other changes were made.

@HaoranYi @jeffwashington Can you please re-review?

brooksprumo self-assigned this Mar 28, 2024

brooksprumo commented Mar 28, 2024

View reviewed changes

accounts-db/src/accounts_db.rs Show resolved Hide resolved

accounts-db/src/accounts_db.rs Outdated Show resolved Hide resolved

brooksprumo marked this pull request as ready for review March 28, 2024 20:01

brooksprumo requested review from HaoranYi and jeffwashington March 28, 2024 20:05

brooksprumo commented Apr 1, 2024

View reviewed changes

HaoranYi mentioned this pull request Apr 1, 2024

Write default hash to account storage #469

Merged

jeffwashington previously approved these changes Apr 1, 2024

View reviewed changes

HaoranYi previously approved these changes Apr 1, 2024

View reviewed changes

brooksprumo dismissed stale reviews from HaoranYi and jeffwashington via 72db69c April 2, 2024 12:20

brooksprumo force-pushed the write-version/remove-producer branch from ad35199 to 72db69c Compare April 2, 2024 12:20

brooksprumo requested review from HaoranYi and jeffwashington April 2, 2024 12:35

jeffwashington previously approved these changes Apr 2, 2024

View reviewed changes

HaoranYi previously approved these changes Apr 2, 2024

View reviewed changes

brooksprumo added 2 commits April 2, 2024 10:12

Sets write version to 0 when storing accounts

429c876

retain producing correct write versions for geyser

90e134d

brooksprumo dismissed stale reviews from HaoranYi and jeffwashington via 90e134d April 2, 2024 14:14

brooksprumo force-pushed the write-version/remove-producer branch from 72db69c to 90e134d Compare April 2, 2024 14:14

brooksprumo requested review from jeffwashington and HaoranYi April 2, 2024 14:15

jeffwashington approved these changes Apr 2, 2024

View reviewed changes

HaoranYi approved these changes Apr 2, 2024

View reviewed changes

brooksprumo merged commit 01855ed into anza-xyz:master Apr 2, 2024
38 checks passed

brooksprumo deleted the write-version/remove-producer branch April 2, 2024 15:27

This was referenced Apr 2, 2024

Removes write version from StorableAccounts #542

Merged

Removes write version from StorableAccountsWithHashesAndWriteVersions #561

Merged

brooksprumo mentioned this pull request Apr 11, 2024

Removes setting write version for geyser when restoring from a snapshot #702

Closed

HaoranYi mentioned this pull request Jun 5, 2024

v1.18: Archives storages directly (backport of #503) #1430

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sets write version to 0 when storing accounts #476

Sets write version to 0 when storing accounts #476

brooksprumo commented Mar 28, 2024 •

edited

Loading

brooksprumo commented Mar 28, 2024

codecov-commenter commented Mar 28, 2024

jeffwashington commented Mar 29, 2024

lijunwangs commented Mar 29, 2024 •

edited

Loading

HaoranYi commented Apr 1, 2024

brooksprumo commented Apr 1, 2024

HaoranYi commented Apr 1, 2024

brooksprumo commented Apr 1, 2024

brooksprumo commented Apr 1, 2024

HaoranYi commented Apr 1, 2024

brooksprumo Apr 1, 2024

brooksprumo Apr 1, 2024

jeffwashington left a comment

HaoranYi left a comment

brooksprumo commented Apr 2, 2024

jeffwashington left a comment

HaoranYi left a comment

brooksprumo commented Apr 2, 2024

brooksprumo commented Apr 2, 2024

Sets write version to 0 when storing accounts #476

Sets write version to 0 when storing accounts #476

Conversation

brooksprumo commented Mar 28, 2024 • edited Loading

Problem

Summary of Changes

brooksprumo commented Mar 28, 2024

codecov-commenter commented Mar 28, 2024

Codecov Report

jeffwashington commented Mar 29, 2024

lijunwangs commented Mar 29, 2024 • edited Loading

HaoranYi commented Apr 1, 2024

brooksprumo commented Apr 1, 2024

HaoranYi commented Apr 1, 2024

brooksprumo commented Apr 1, 2024

brooksprumo commented Apr 1, 2024

HaoranYi commented Apr 1, 2024

brooksprumo Apr 1, 2024

Choose a reason for hiding this comment

brooksprumo Apr 1, 2024

Choose a reason for hiding this comment

jeffwashington left a comment

Choose a reason for hiding this comment

HaoranYi left a comment

Choose a reason for hiding this comment

brooksprumo commented Apr 2, 2024

jeffwashington left a comment

Choose a reason for hiding this comment

HaoranYi left a comment

Choose a reason for hiding this comment

brooksprumo commented Apr 2, 2024

brooksprumo commented Apr 2, 2024

brooksprumo commented Mar 28, 2024 •

edited

Loading

lijunwangs commented Mar 29, 2024 •

edited

Loading