program-runtime: double program cache size #3481

Merged: 1 commit, Nov 6, 2024

alessandrod

The cache is currently getting thrashed, and programs are getting reloaded at pretty much every single slot. Double the cache size, so that reloading happens only when random eviction occasionally picks a popular entry.

The JIT code size with the new cache size is about 800MB.

This change reduces JIT time 15x.

Left is with PR, right is master.

[Screenshot, 2024-11-05: comparison with this PR (left) vs. master (right)]
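To illustrate the thrashing described above, here is a minimal toy model, not the actual program cache code. It assumes uniformly random eviction and a fixed working set that is touched once per slot. The names `XorShift64`, `ToyCache`, and `count_reloads`, and the numbers used in the note below it, are hypothetical and chosen only for illustration.

```rust
/// Tiny deterministic PRNG (xorshift64) so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Pseudo-random index in `0..n` (slight modulo bias is fine for a toy model).
    fn below(&mut self, n: usize) -> usize {
        (self.next_u64() % n as u64) as usize
    }
}

/// Toy cache (hypothetical): holds at most `capacity` program ids and
/// evicts a uniformly random resident entry when it is full.
struct ToyCache {
    capacity: usize,
    entries: Vec<u32>,
}

impl ToyCache {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        Self { capacity, entries: Vec::with_capacity(capacity) }
    }

    /// Touches `program`; returns true if it had to be (re)loaded.
    fn access(&mut self, program: u32, rng: &mut XorShift64) -> bool {
        if self.entries.contains(&program) {
            return false; // cache hit, nothing to reload
        }
        if self.entries.len() >= self.capacity {
            // Random eviction: the victim may well be a popular program.
            let victim = rng.below(self.entries.len());
            self.entries.swap_remove(victim);
        }
        self.entries.push(program);
        true
    }
}

/// Replays `slots` slots in which every program of the working set
/// (ids `0..working_set`) is touched once, and counts the (re)loads.
fn count_reloads(capacity: usize, working_set: u32, slots: usize) -> usize {
    let mut rng = XorShift64(0x9E37_79B9_7F4A_7C15);
    let mut cache = ToyCache::new(capacity);
    let mut reloads = 0;
    for _ in 0..slots {
        for program in 0..working_set {
            if cache.access(program, &mut rng) {
                reloads += 1;
            }
        }
    }
    reloads
}
```

In this model, `count_reloads(512, 300, 10)` returns 300, which is only the initial loads. `count_reloads(256, 300, 10)` is strictly larger, because at least 44 of the 300 programs are missing at the start of every later slot. A real workload also has a long tail of rarely used programs, so some reloads remain even after doubling, as the description notes.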

@alessandrod alessandrod added the v2.1 Backport to v2.1 branch label Nov 5, 2024
mergify bot commented Nov 5, 2024

Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis.

@bw-solana bw-solana left a comment

LGTM.

800MB seems like a small price to pay. I'm seeing ~100ms per slot spent filtering/replenishing program cache over the last 7 days on mainnet.

@alessandrod

> I'm seeing ~100ms per slot spent filtering/replenishing program cache over the last 7 days on mainnet.

Yep, one odd thing I noticed is that our devboxes and canaries have about 2x worse mean program cache perf than mnb. I don't know if it's because we've got potatoes compared to mnb nodes, or if we have an actual regression in master.

@Lichtso
Lichtso commented Nov 5, 2024

> or if we have an actual regression in master

Can you run a v2.0 node on your devbox for comparison?

@bw-solana
bw-solana commented Nov 5, 2024

> I don't know if it's because we've got potatoes compared to mnb nodes, or if we have an actual regression in master

Sample size of 1, but mds3Df1ieBonG2qS8ZoKTqshq5MgTUNfZgc78cjiCdq is running v2.1.1 on mainnet and looks significantly worse than cluster mean
[Image attachment]

@alessandrod alessandrod merged commit fb4adda into anza-xyz:master Nov 6, 2024
41 checks passed
mergify bot pushed a commit that referenced this pull request Nov 6, 2024
(cherry picked from commit fb4adda)
@alessandrod alessandrod added the v2.0 Backport to v2.0 branch label Nov 6, 2024
mergify bot commented Nov 6, 2024

Backports to the stable branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule.

mergify bot pushed a commit that referenced this pull request Nov 6, 2024
(cherry picked from commit fb4adda)
alessandrod added a commit that referenced this pull request Nov 8, 2024
…3492)

program-runtime: double program cache size (#3481)

(cherry picked from commit fb4adda)

Co-authored-by: Alessandro Decina <[email protected]>
alessandrod added a commit that referenced this pull request Nov 8, 2024
(cherry picked from commit fb4adda)
alessandrod added a commit that referenced this pull request Nov 8, 2024
(cherry picked from commit fb4adda)
alessandrod added a commit that referenced this pull request Nov 9, 2024
…3494)

program-runtime: double program cache size (#3481)

(cherry picked from commit fb4adda)

Co-authored-by: Alessandro Decina <[email protected]>
vovkman pushed a commit to helius-labs/agave that referenced this pull request Nov 13, 2024
Labels
v2.0 Backport to v2.0 branch v2.1 Backport to v2.1 branch
3 participants