Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize schedule.md #2349

Closed
wants to merge 31 commits into from
Closed

Optimize schedule.md #2349

wants to merge 31 commits into from

Conversation

dwightguth
Copy link
Collaborator

@dwightguth dwightguth commented Mar 12, 2024

This is a remake of #2335. That PR optimized the schedule lookup functions in schedule.md and deleted the DEFAULT schedule.

This PR takes a slightly different approach to get the same goal. We create a new sort, ScheduleTuple, which contains all of the values of all the schedule constants and schedule flags for a given schedule. We create an operator which computes a ScheduleTuple for a given Schedule via the existing code in schedule.md. Then we create getters for each individual member of the tuple (implicitly). Code that needs access to schedule data gets it by means of calling the getters on a value of sort ScheduleTuple from throughout the semantics, thus considerably simplifying the hot path while maintaining the same code to compute the actual values for the schedule.

Performance data:

On the master branch, we spend 35m21s executing the GeneralStateTests, of which 13% is spent in _<_>. On this branch, we spend 26m46s executing the GeneralStateTests, of which 2.5% is spent in ScheduleTuple getters.

@dwightguth dwightguth force-pushed the schedule3 branch 13 times, most recently from f21c767 to 6c18cc7 Compare March 19, 2024 22:07
dwightguth pushed a commit to runtimeverification/k that referenced this pull request Mar 21, 2024
…ich are total (#4118)

As part of
runtimeverification/evm-semantics#2349 we are
trying to change how we implement the gas schedule in KEVM in order to
improve performance. In order to do this, we introduce a new sort,
ScheduleTuple, which has a single constructor with many arguments. We
rely on projection functions for named nonterminals in order to extract
those arguments. However, this confuses the booster because it is unable
to determine that these functions are total. In general, they are not
total, however, for the case where the sort they are defined over has a
single constructor, they are total. Thus, we introduce logic into
GenerateSortProjections which tags these productions as total in the
cases where they are in fact total.

This has been tested and shown to fix the performance regression in the
booster introduced by the above change. It should be easy to see why
this change is sound since it applies the `total` attribute only in the
case where the production has a single constructor, and the rule matches
unconditionally on any instance of that constructor.
@dwightguth dwightguth force-pushed the schedule3 branch 4 times, most recently from 52b5541 to 77a75e6 Compare March 27, 2024 16:51
@dwightguth dwightguth marked this pull request as ready for review March 28, 2024 16:14
@dwightguth dwightguth requested review from anvacaru and ehildenb March 28, 2024 16:14
@ehildenb
Copy link
Member

@dwightguth please post performance information about this vs master if it's supposed to be an optimization.

@geo2a
Copy link
Contributor

geo2a commented Apr 2, 2024

I'd be great to ensure this change does not hinder the performance of symbolic execution. We run the test-prove-pyk Makefile target on every signification change to HB/Booster, producing tables that look like this one. @dwightguth could you run the test-prove-pyk tests on master and this branch to compare the performance? We have a Nix-based script that makes the two runs varying the Booster: https://github.com/runtimeverification/hs-backend-booster/blob/main/scripts/performance-tests-kevm.sh. Is could be easily changed to vary the KEVM commit, pinning the Booster to the version from KEVM master.

It's worth noting that the test should be ideally run with 1 worker (i.e. passing PYTEST_PARALLEL=1) to exclude the noise from the K frontend compiling the specs and spawning a bunch of Z3s.

@ehildenb
Copy link
Member

ehildenb commented Apr 2, 2024

Yes please, we sohuld check that it does not cause a regression in symbolic execution performance, using the same tables/techniques taht the HBB team uses.

@dwightguth
Copy link
Collaborator Author

Test schedule3 time master time (schedule3/master) time
erc20/ds/transferFrom-failure-1-d-spec.k 84.85 89.64 0.9465640339134315
erc20/ds/transferFrom-failure-1-b-spec.k 184.32 191.11 0.9644707236669979
mcd/cat-exhaustiveness-spec.k 158.8 153.42 1.0350671359666277
erc20/ds/transfer-success-2-spec.k 84.43 81.51 1.0358238252975096
mcd/dsvalue-peek-pass-rough-spec.k 70.24 67.81 1.0358354225040554
bihu/forwardToHotWallet-success-2-spec.k 65.59 63.23 1.037324055037166
benchmarks/dynamicarray00-spec.k 55.28 53.28 1.0375375375375375
kontrol/test-emitcontracttest-testexpectemitdonotcheckdata-0-spec.k 62.21 59.95 1.037698081734779
erc20/ds/balanceOf-spec.k 57.71 55.61 1.0377629922675777
bihu/forwardToHotWallet-failure-4-spec.k 132.72 127.83 1.0382539310021122
erc20/hkg/approve-spec.k 55.62 53.52 1.0392376681614348
benchmarks/ecrecover00-sigvalid-spec.k 81.98 78.87 1.0394319766704705
benchmarks/storagevar02-overflow-spec.k 63.13 60.72 1.0396903820816865
mcd/flipper-ttl-pass-spec.k 73.3 70.43 1.0407496805338632
erc20/ds/transfer-success-1-spec.k 91.64 87.98 1.041600363719027
mcd/vat-addui-pass-spec.k 178.47 171.33 1.0416739625284537
benchmarks/address00-spec.k 51.97 49.88 1.0419005613472332
mcd/vat-subui-fail-rough-spec.k 95.48 91.63 1.0420168067226891
benchmarks/keccak00-spec.k 59.62 57.19 1.0424899457947194
kontrol/test-emitcontracttest-testexpectemitcheckemitter-0-spec.k 61.98 59.43 1.0429076224129228
erc20/ds/allowance-spec.k 62.33 59.76 1.0430053547523428
kontrol/test-allowchangestest-testallow-0-spec.k 64.16 61.46 1.0439310120403513
erc20/hkg/totalSupply-spec.k 49.45 47.36 1.0441300675675675
benchmarks/storagevar02-nooverflow-spec.k 63.9 61.18 1.0444589735207583
kontrol/test-expectcalltest-testexpectregularcall-0-spec.k 62.21 59.56 1.0444929482874412
kontrol/test-arithmetictest-test_wdiv_rounding-uint256-uint256-0-spec.k 73.6 70.45 1.0447125621007807
mcd/dstoken-approve-fail-rough-spec.k 129.91 124.33 1.0448805598005309
mcd/vat-subui-pass-rough-spec.k 172.57 165.15 1.0449288525582803
mcd/dai-symbol-pass-spec.k 66.37 63.51 1.0450322783813573
kontrol/test-expectreverttest-test_expectrevert_message-0-spec.k 77.21 73.87 1.0452145661296872
kontrol/test-expectreverttest-test_expectrevert_bytes4-0-spec.k 163.84 156.67 1.0457649837237506
benchmarks/ecrecover00-siginvalid-spec.k 78.03 74.61 1.04583835946924
kontrol/test-expectcalltest-testexpectstaticcall-0-spec.k 62.21 59.47 1.0460736505801245
benchmarks/requires01-a0le0-spec.k 112.49 107.47 1.0467107099655717
mcd/dsvalue-read-pass-summarize-spec.k 65.84 62.9 1.0467408585055644
benchmarks/storagevar03-spec.k 56.19 53.65 1.0473438956197576
erc20/hkg/transfer-failure-1-spec.k 74.85 71.44 1.0477323628219484
erc20/hkg/allowance-spec.k 54.28 51.79 1.0480787796871984
mcd/dsvalue-read-pass-spec.k 73.14 69.76 1.0484518348623852
benchmarks/bytes00-spec.k 63.4 60.46 1.048627191531591
kontrol/test-allowchangestest-testfailallowcallstoaddress-0-spec.k 64.29 61.27 1.0492900277460422
kontrol/test-expectreverttest-testfail_expectrevert_false-0-spec.k 106.94 101.91 1.0493572760278678
kontrol/test-storetest-teststoreload-0-spec.k 64.14 61.07 1.0502701817586377
examples/erc20-spec.md 301.55 287.0 1.0506968641114982
bihu/forwardToHotWallet-failure-3-spec.k 100.14 95.13 1.0526647745190791
benchmarks/structarg01-spec.k 70.03 66.51 1.0529243722748458
kontrol/test-arithmetictest-test_max1_broken-uint256-uint256-0-spec.k 59.46 56.42 1.053881602268699
examples/solidity-code-spec.md 218.88 207.6 1.0543352601156069
kontrol/test-allowchangestest-testfailallowchangestostorage-0-spec.k 96.2 90.94 1.0578403342863427
mcd/vat-deny-diff-fail-rough-spec.k 127.16 120.11 1.0586961951544418
kontrol/test-storetest-testaccesses-0-spec.k 64.67 61.0 1.0601639344262295
erc20/hkg/transfer-success-2-spec.k 62.25 58.67 1.061019260269303
benchmarks/storagevar00-spec.k 54.16 50.97 1.0625858348047872
benchmarks/encode-keccak00-spec.k 62.58 58.89 1.0626591951095261
kontrol/test-expectreverttest-testfail_expectrevert_failandsuccess-0-spec.k 76.15 71.59 1.0636960469339294
kontrol/test-allowchangestest-testallow_fail-0-spec.k 64.7 60.81 1.06396974181878
mcd/cat-file-addr-pass-rough-spec.k 71.68 67.27 1.065556711758585
benchmarks/structarg00-spec.k 61.68 57.84 1.066390041493776
erc20/hkg/transfer-failure-2-spec.k 72.53 67.91 1.0680312177882492
kontrol/test-arithmetictest-test_wmul_weakly_increasing_positive-uint256-uint256-0-spec.k 96.7 90.5 1.0685082872928178
benchmarks/storagevar01-spec.k 60.2 56.34 1.068512602058928
benchmarks/requires01-a0gt0-spec.k 55.99 52.05 1.0756964457252642
kontrol/test-owneruponlytest-testincrementasowner-0-spec.k 99.86 91.96 1.0859069160504569
kontrol/test-countertest-testincrement-0-spec.k 129.25 118.67 1.0891547990224995
TOTAL 5636.509999999999 5391.650000000002 1.0454146689788835

@dwightguth
Copy link
Collaborator Author

You requested the performance data regarding the impact this change has on the Haskell Backend. I ran the data, and it is attached. As you can see, we see on average about a 5% slowdown in the performance of the Haskell backend on these examples.

Performance is a critical concern for Pi Squared because we need to execute EVM transactions with a level of performance within an order of magnitude of the performance of competitive EVM clients like Geth and Nethermind. Right now, 13% of the time spent executing KEVM transactions is spent looking up the gas costs of instructions, as compared to 2.5% with this change. It is critical that some version of this performance improvement be included in the EVM semantics so Pi Squared can move forward with development.

We spent a significant amount of time iterating on this pull request. This is the third version of the pull request because the first two were rejected. We’ve spent the last two weeks attempting to construct a version of these changes that minimizes the disruption tor RV. This version of the pull request exists as a result of those iterations. It is a compromise between performance of concrete execution, performance on the Haskell backend, readability, and the degree to which it disrupts verification. I’m happy to go back to the original version of these changes, which optimizes for performance and lack of disruption at the expense of readability, but we do need some form of optimization to this cost center to be included in the change.

We don’t know why this change reduces the performance of the Haskell backend. It doesn’t entirely make sense that it should, because from a purely theoretic standpoint, the semantics is now doing less work than it did before. We think it may be related to the implementation of the pattern matching algorithm but a definitive answer will require investigation. We suggest that we merge this change and, if you remain concerned about the performance impact, you can commence an investigation into why this change makes the Haskell backend slower.

We welcome constructive suggestions on how to proceed, but some form of this change does need to be made in a timely manner.

@dwightguth dwightguth requested a review from anvacaru April 5, 2024 18:38
Copy link
Contributor

@anvacaru anvacaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm approving this PR. Is it possible that the slowdown is caused by the increased size of the configuration?

@@ -35,6 +35,7 @@ In the comments next to each cell, we've marked which component of the YellowPap
<exit-code exit=""> 1 </exit-code>
<mode> $MODE:Mode </mode>
<schedule> $SCHEDULE:Schedule </schedule>
<scheduleTuple> getSchedule($SCHEDULE:Schedule) </scheduleTuple>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we drop the <schedule> cell and have only the <scheduleTuple>?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Schedule is still used in some places, for example when calling precompiled contracts.

@dwightguth dwightguth closed this Apr 8, 2024
@dwightguth
Copy link
Collaborator Author

Closing this. Pi^2 is pursuing a different approach for right now. We will sync back up with you in the future about this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants