Torus-acceleration for multiexponentiation on GT #485
…2 over Fp4 doesn't (without Torus)
Why not use Ristretto or similar for Whisk? We have a similar issue in the ring VRFs used by Sassafras on Polkadot, since my optimized design has VUF outputs on G1, but then you could move them onto a "sister curve", à la cfrg/draft-irtf-cfrg-bls-signature#30. Alistair later noticed that a Plonk-ish ring VRF is plenty fast enough for our validator set size, and it uses VUF outputs on Bandersnatch, so there is no such requirement. I suppose a Ristretto Bulletproofs ring VRF may be even faster for us too. Afaik none of that helps you, not without replacing Whisk with Sassafras.
Because, as highlighted in https://ethresear.ch/t/the-return-of-torus-based-cryptography-whisk-and-curdleproof-in-the-target-group/16678, we want to take the easy way and use the validator’s secret signing key k and its associated public key kG1 for bootstrapping.
This project was sponsored by the Ethereum Foundation under the name "Implementing & Accelerating Torus-based cryptography for SSLE, FY24-1672" and is a collaboration with Robert Granger (https://www.surrey.ac.uk/people/robert-granger) and Antonio Sanso (@asanso).
Overview
Currently, Ethereum validators are known up to 6.4 minutes in advance (1 Epoch = 32 slots = 32*12 seconds).
In theory, a malicious actor could launch denial-of-service attacks on upcoming validators to prevent block production.
To prevent this, a Single Secret Leader Election (SSLE) protocol may be used to keep the identity of a block producer under wraps.
The protocol identified for Ethereum is Whisk.
However, it was initially instantiated on elliptic curve groups (G1 or G2), and pairings could unravel the whole scheme (similar, I guess, to the MOV attack https://crypto.stanford.edu/pbc/notes/elliptic/movattack.html ).
To prevent this attack vector, the scheme can be instantiated on the pairing group GT.
This means computing in Fp12 (12 Fp coordinates) instead of on G1 (2 Fp coordinates), so we should naively expect computations to be about 6x more expensive.
Full overview: https://crypto.stanford.edu/pbc/notes/elliptic/movattack.html
The goal of this PR is to reduce the overhead of GT multiexponentiation from the initial 5x (https://ethresear.ch/t/the-return-of-torus-based-cryptography-whisk-and-curdleproof-in-the-target-group/16678/3), ideally to 3x, to make Whisk viable latency-wise.
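For readers unfamiliar with multiexponentiation, here is a minimal sketch of one common technique, the bucket (Pippenger-style) method, which computes a product of many powers with far fewer multiplications than exponentiating each base separately. This is an illustration only: the description above does not say which algorithm the PR uses, the group is the toy group of integers modulo a prime (standing in for GT), and every name (`multiexp_buckets`, `multiexp_naive`, the window size `c`) is hypothetical.

```python
import random

# Toy modulus: a Mersenne prime. This is a stand-in group, NOT a pairing target group.
P = 2**61 - 1


def multiexp_naive(bases, exps, p=P):
    """Reference: exponentiate every base on its own, then multiply."""
    acc = 1
    for b, e in zip(bases, exps):
        acc = acc * pow(b, e, p) % p
    return acc


def multiexp_buckets(bases, exps, c=4, p=P):
    """Compute prod(bases[i] ** exps[i]) mod p with c-bit windows and buckets."""
    max_bits = max((e.bit_length() for e in exps), default=0)
    num_windows = (max_bits + c - 1) // c
    mask = (1 << c) - 1
    result = 1
    for w in reversed(range(num_windows)):
        # Move the accumulated result up by one window: c squarings.
        for _ in range(c):
            result = result * result % p
        # Put each base into the bucket indexed by its c-bit digit in this window.
        buckets = [1] * (1 << c)
        for b, e in zip(bases, exps):
            digit = (e >> (w * c)) & mask
            if digit:
                buckets[digit] = buckets[digit] * b % p
        # prod_j buckets[j] ** j, computed with a running product (no exponentiation).
        running = 1
        window_prod = 1
        for j in range(mask, 0, -1):
            running = running * buckets[j] % p
            window_prod = window_prod * running % p
        result = result * window_prod % p
    return result


# Sanity check on 128 random inputs, one of the sizes mentioned for Whisk.
rng = random.Random(0)
bases = [rng.randrange(1, P) for _ in range(128)]
exps = [rng.randrange(1 << 255) for _ in range(128)]
assert multiexp_buckets(bases, exps) == multiexp_naive(bases, exps)
```

The squarings are shared across all bases, and each window costs roughly one multiplication per base plus two per bucket, which is why the cost of a single group multiplication (in G1 or in GT) dominates the comparison.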
Implementation
Future work
Note: the Torus acceleration as implemented seems to be valid only for curves with 1+𝑖 as sextic non-residue, with 𝑖 = √-1. This is the case for BLS12-381 and BN254-Nogami but not for BN254-Snarks (Zcash, Ethereum) or BLS12-377 (Aleo).
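To give some intuition for what torus-based arithmetic buys, here is a toy sketch of the standard T2 compression: a norm-1 element a0 + a1·w of a quadratic extension can be written as (g + w)/(g − w) with a single base-field element g, and products can be computed directly on the compressed values. This is not the PR's implementation. It uses Fp2 over Fp with w² = −1 as a stand-in for Fp12 over Fp6, it does not model the sextic non-residue condition mentioned above, and all function names are hypothetical.

```python
# Toy field: p is a Mersenne prime with p % 4 == 3, so -1 is a quadratic non-residue.
P = 2**31 - 1
GAMMA = P - 1  # w^2 = GAMMA = -1; elements of Fp2 are pairs (a0, a1) = a0 + a1*w


def inv(x):
    """Inverse in Fp via Fermat's little theorem."""
    return pow(x, P - 2, P)


def mul(a, b):
    """Multiplication in Fp2 = Fp[w]/(w^2 - GAMMA)."""
    return ((a[0] * b[0] + GAMMA * a[1] * b[1]) % P,
            (a[0] * b[1] + a[1] * b[0]) % P)


def norm(a):
    """Norm of a0 + a1*w down to Fp: a0^2 - GAMMA*a1^2."""
    return (a[0] * a[0] - GAMMA * a[1] * a[1]) % P


def to_norm_one(a):
    """Map nonzero a to conj(a)/a = conj(a)^2 / norm(a), which has norm 1."""
    c = (a[0], (-a[1]) % P)
    c2 = mul(c, c)
    n_inv = inv(norm(a))
    return (c2[0] * n_inv % P, c2[1] * n_inv % P)


def compress(a):
    """Norm-1 element with a1 != 0  ->  g = (1 + a0) / a1 in Fp."""
    if a[1] == 0:
        raise ValueError("+1 and -1 have no affine torus representation")
    return (1 + a[0]) * inv(a[1]) % P


def decompress(g):
    """g  ->  (g + w)/(g - w) = ((g^2 + GAMMA) + 2*g*w) / (g^2 - GAMMA)."""
    d_inv = inv((g * g - GAMMA) % P)
    return ((g * g + GAMMA) * d_inv % P, 2 * g * d_inv % P)


def torus_mul(g1, g2):
    """Product on compressed values: (g1*g2 + GAMMA) / (g1 + g2)."""
    if (g1 + g2) % P == 0:
        raise ValueError("product is the identity, which is not representable")
    return (g1 * g2 + GAMMA) * inv((g1 + g2) % P) % P


x = to_norm_one((3, 4))
y = to_norm_one((1, 2))
assert norm(x) == 1 and norm(y) == 1
assert decompress(compress(x)) == x
assert decompress(torus_mul(compress(x), compress(y))) == mul(x, y)
```

In this toy setting the compressed form stores one Fp element instead of two; the analogous statement for GT would be one Fp6 element instead of a full Fp12 element.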
Benchmarks
We compare against BLST's G1 MSM, since BLST is used by all consensus clients today; the BLST benchmarks are from status-im/nim-blscurve#183.
The multiexp size is expected to be 128 or 256. We implement it using 2 towering schemes. Note that Torus acceleration is only valid for Fp12 over Fp6 over Fp2.
Machine is a Ryzen 9 9950XE, overclocked at 5.9GHz single-threaded
BLST MSM G1
Constantine MEXP GT
For 128 points, GT multiexponentiation is only 3x slower than G1 MSM.
For 256 points, it is 3.28x slower.
Future work
Over the summer I investigated some performance bugs where Constantine starts with a 1.7x perf advantage in field arithmetic that gets reduced to 1x at the elliptic curve level, or even worse after Fp2 -> Fp6 -> Fp12 towering.
Solving this might reduce the gap to 2.4x (a 20% improvement).
Another line of work is to use SIMD for multiexp, computing on 4x 64-bit integers (AVX2) or 8x 64-bit integers (AVX512) per instruction, which should conservatively bring at least a 2x or 4x perf improvement respectively.