Change read-only-cache to be keyed on pubkey only #576
Conversation
I don't understand the 4 log lines. What is the important data, and what is the difference?
I'm concerned about getting the wrong data under various race conditions I can imagine. Using the slot enforces that we are getting the right version of each account. We don't clear the read cache when we drop slots or forks. That's an issue for this scheme, I think. The index is updated correctly in those cases (I have to assume).
In this PR, we are still checking the slot. The slot is stored alongside the 'AccountSharedData' in the value. When loading, we check both the pubkey (from the map) and the slot (from the value).
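For illustration, here is a minimal sketch of that load path (not the actual accounts-db code; the 'ReadOnlyCacheSketch' and 'CacheEntrySketch' names and the plain HashMap are assumptions made for the example): the map is keyed on the pubkey alone, the slot is carried in the value, and a load is a hit only when both match.

use std::collections::HashMap;

use solana_sdk::{account::AccountSharedData, clock::Slot, pubkey::Pubkey};

// Hypothetical entry type: the slot lives in the value, not in the key.
struct CacheEntrySketch {
    slot: Slot,
    account: AccountSharedData,
}

// Hypothetical cache keyed on pubkey only.
struct ReadOnlyCacheSketch {
    map: HashMap<Pubkey, CacheEntrySketch>,
}

impl ReadOnlyCacheSketch {
    // A load is a hit only if the pubkey is present and the cached slot
    // matches the requested slot.
    fn load(&self, pubkey: &Pubkey, slot: Slot) -> Option<AccountSharedData> {
        let entry = self.map.get(pubkey)?;
        (entry.slot == slot).then(|| entry.account.clone())
    }
}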
read_only_accounts_cache_entries records the number of entries in the cache, i.e. distinct (pubkey, slot) pairs.
Force-pushed from 8365e34 to 181a853.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #576 +/- ##
=======================================
Coverage 81.9% 81.9%
=======================================
Files 851 851
Lines 230475 230514 +39
=======================================
+ Hits 188785 188818 +33
- Misses 41690 41696 +6
So, in theory, writing an account would evict the pubkey from the read cache completely. The theory must be that there are accounts that are flushed to append vecs, read and put in the read cache, then written 100+ slots later, then flushed, then read from the newer slot and put in the read cache, so that the account is now in the read cache twice (or more)?
Sure. I will run this experiment.
Why doesn't this "written 100+ slots later" cause the account to be removed from the read cache, though? Is it because when we store to the write cache and remove from the read cache, we use the slot being stored to as a check into the read cache? Presumably this check in the read cache would always fail, and thus never remove old read cache entries. Maybe that's the crux of the change: that we can now correctly remove entries in the read cache that have been superseded by entries in the write cache?
Yes. Writing an account at slot 'x' won't remove the cached entry for slot 'y'. So the stale cache entry at slot 'y' will only be removed once it becomes the LRU victim. The single-pubkey key ensures that we have at most one entry per pubkey, and it also helps reduce the number of evictions. The stale element for the same pubkey is replaced when a new update is inserted, which saves the eviction cost incurred in the current code base (as shown in the 2nd graph on the right). Of course, evictions matter less now thanks to the background evicts 😉
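A toy, self-contained illustration of that replacement behavior (a sketch using a plain HashMap rather than the real cache type; the slot numbers are made up): the second store for the same pubkey overwrites the first, so the stale version never sits in the cache waiting to become the LRU victim.

use std::collections::HashMap;

use solana_sdk::{account::AccountSharedData, clock::Slot, pubkey::Pubkey};

fn main() {
    // With pubkey-only keying, a later store for the same pubkey replaces the
    // earlier slot's entry instead of accumulating next to it.
    let mut cache: HashMap<Pubkey, (Slot, AccountSharedData)> = HashMap::new();
    let key = Pubkey::new_unique();
    let account = AccountSharedData::default();

    cache.insert(key, (10, account.clone()));  // cached when flushed at slot 10
    cache.insert(key, (120, account.clone())); // stored again 110 slots later

    // Only the newest version remains; no stale entry lingers until eviction.
    assert_eq!(cache.len(), 1);
    assert_eq!(cache.get(&key).unwrap().0, 120);
}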
I like the idea of this change!
Should there be new tests added that ensure
- storing in a new slot "removes" the old one?
- calling remove on a pubkey works regardless of its slot?
- calling load correctly checks slot?
Some (all?) of these may be covered by existing tests too.
Another aspect of this change is that it helps when we scale up the cache size.
Force-pushed from 5f1395d to 21952ae.
@@ -439,6 +450,10 @@ mod tests {
cache.evict_in_foreground();
assert_eq!(100 + per_account_size, cache.data_size());
assert!(accounts_equal(&cache.load(key1, slot).unwrap(), &account1));
// pass a wrong slot and check that load fails
assert!(cache.load(key1, slot + 1).is_none());
- calling load correctly checks slot?
Added coverage.
// pass a wrong slot and check that load fails
assert!(cache.load(key1, slot + 1).is_none());
// insert another entry for slot+1, and assert only one entry for key1 is in the cache
cache.store(key1, slot + 1, account1.clone());
- storing in a new slot "removes" the old one?
Added coverage.
Added.
Already covered
Added.
Lgtm! Makes the read cache more efficient, and thus improves the time to load accounts.
Got @jeffwashington's approval from Slack.
Problem
The way that the readonly cache is keyed on (pubkey, slot) can lead to caching
stale entries and inefficient use of the cache.
By design, the readonly cache is shadowed by the write_cache for the fork tips. Because
of this, it is very unlikely that we need more than one entry per pubkey in the
cache. An extra entry in the cache is a waste, since it is from a stale slot and will
eventually be evicted over time.
Also, the extra 'slot' in the key increases the FIFO size for the LRU.
Summary of Changes
Rework readonly cache to be keyed on pubkey only.
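Roughly, the shape of the change is the following (the type aliases are illustrative, not the exact accounts-db types):

use solana_sdk::{clock::Slot, pubkey::Pubkey};

// Before: the cache map was keyed by (pubkey, slot), so two versions of the
// same account could occupy two entries at once.
type OldCacheKey = (Pubkey, Slot);

// After: the map is keyed by pubkey only; the slot moves into the cached
// value and is checked on load, so each pubkey has at most one entry.
type NewCacheKey = Pubkey;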
Fixes #