This PoC is a joint collaboration by:
Two bugs were identified in the sp1-sdk
crate at version 3.4.0. We were able to exploit them to generate valid SP1 proofs of incorrect
execution of an arbitrary program, which results in universal forgeries of
proofs for arbitrary statements, even incorrect ones.
This report describes two bugs:
- SP1-2.1: Unconstrained
committed_value_digest
withoutCOMMIT
syscalls - SP1-2.2: Unchecked
next_pc
Condition in First Layer of Recursion Tree
When an SP1 executor executes a guest program, the COMMIT
syscalls
(0x00_00_00_10
) resulting from calls to sp1_zkvm::io::commit()
are delayed
to the end of the execution and emitted only when the main
function returns.
Since these syscalls are the only events that generate constraints on the
committed_value_digest
of the ExecutionRecord
, This implies that if the
execution of the program is halted before the return point of main()
, then the
COMMIT
syscalls are never issued. As a result, the committed_value_digest
remains initialised to the all-zero value and is not constrained during proof
generation.
This absence of constraints on the digest of committed values during the
execution of the program raises further questions on the compatibility of the
verifier code of sp1-sdk
with proofs of "arbitrary" program, but exploring
this falls outside of the scope of this report.
In the verifying code for the first layer of the recursion tree, the condition
next_pc == 0
is not checked if a shard is indicated as containing a
"complete" execution. However, this check is present in other recursion
constraints (e.g., two-to-one proof compression, field-switch wrapping,
proof-system switching).
Other checks exist that are performed on other code paths when is_complete
is
asserted but are missing from the first layer of the recursion tree; these may
be critical to proof soundness, but were not used in our exploit. These can be
found in sp1_recursion_circuit::machine::complete::assert_complete
(for
the recursive constraints) and sp1_prover::verify::verify
(for the plain,
uncompressed verification). These checks include whether:
- at least one execution shard is present
- (execution) shard numbering is consecutive, and starts at one,
- the leaf challenger and the final reconstruct challenger match
- the deferred proof digests are consistent, and
- the cumulative sum is consistent.
Initially, an attempt to exploit
SP1-2.1
was made by making an explicit HALT
(0x00_00_00_00
) syscall within main()
.
While this is not the methodology that was ultimately used, we note that making
such explicit HALT
syscalls within main might reasonably be passed off as an
optimization within the guest program, e.g., by arguing that an early halt
within main is a way to shorten a program's execution trace and therefore reduce
its proof computation time. This would seem innocent enough, but
SP1-2.1
would then imply that the digest of public values produced before the syscall
would remain unconstrained.
However, it suffices to instead create a malicious SP1 executor which stops
executing the guest program at an arbitrary pc
value. As long as the chosen
pc
value happens before the return point of main()
, no COMMIT
syscall will
have been produced by the virtual machine. In the exploit presented here, the
proof forgery was produced by stopping execution as soon as the program reached
the start of the main
function.
Before returning the resulting ExecutionRecord
, the malicious executor is then
free to replace the committed_value_digest
of the public_values: PublicValues
field with the digest of an arbitrary value. This makes it seem as
if the guest program had committed to it during its execution.
Then, an honest CoreProver
(sp1_prover::components::SP1ProverComponents::CoreProver
) is run to generate
an SP1CoreProof
(sp1_prover::types::SP1CoreProof
) with the maliciously
crafted ExecutionRecord
. Since there are no COMMIT
syscalls contained within
the record, the altered committed_value_digest
, which does not match the
digest of the values committed to by the program, does not cause the proof
generation to fail. The CoreProver
therefore successfully creates an
SP1CoreProof
containing two shards s1
and s2
.
Finally, a malicious SP1Prover
(sp1_prover::SP1Prover
) is created with the
following two modifications:
- The second shard
s2
of theSP1CoreProof
is discarded. - The
is_complete
flag is set totrue
in theSP1RecursionWitnessValues
created from the remaining first shards1
. This recursion witness is then used when generating the recursion program for compressing theSP1CoreProof
with aCompressProver
.
Because the malicious executor stopped the guest program with a next_pc
value
pointing to the start of main, the first shard s1
has a non-zero next_pc
value. However, since the is_complete
flag is true
,
SP1-2.2
implies that the CompressProver
does not constrain the equality next_pc == 0
, resulting in an SP1ReduceProof
proof generated without errors. When
de-serialized by an honest prover and submitted for verification, this
SP1ReduceProof
then passes honest verification for the guest program.
This bug report is accompanied by two artefacts that demonstrate how the two bugs can be exploited to provide valid proofs of invalid execution of arbitrary programs. You need to change the forger.md to forger.diff and use it to change the prover client.
The is-prime directory contains the source files and compiled ELF
version of a program which checks primality of a number
read from sp1_zkvm::io
, together with a script which
demonstrates that 42-is-prime.proof is a
valid proof that executing the program results in 42 being verified as a prime
number.
The i-am-satoshi directory contains a example of the
transferability of this technique. Here, the guest program uses the independent
bitcoin
crate to compute the Bitcoin address corresponding to the secret key
given as input, and then commits to the resulting address with
sp1_zkvm::io::commit()
. Running the corresponding verifier
program reveals that the proof demonstrates
knowledge of the secret key behind the "genesis Satoshi address"
1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
.
While the first program presented in this exploit is innocent enough (42 is obviously not prime), the second one exemplifies more serious consequences. Proof of ownership of any address on any blockchain can be forged and would be accepted by a naïve verifier. We draw the attention to the fact that the program itself was not modified. All of the modifications to create the proof forgery were performed locally to the proving client which implies that this forgery methodology is generalisable to arbitrary guest programs.
We note that
SP1-2.2
is a bug that is restricted to the first layer of the recursion tree, and that
subsequent recursive proving of the proof would fail, because the ShrinkProver
properly constrains the next_pc == 0
check when is_complete == true
.
Nonetheless, the honest verifier code
is agnostic to the kind of proof that it deserializes which implies that the
malicious prover is not forced to further reduce or wrap the forged proof before
submitting it for verification to a system that runs the Rust verifier from the
sp1-sdk
crate.
Mitigating SP1-2.1 requires making sure that the committed_value_digest
is
constrained within the proof system, even if no COMMIT
syscalls are made.
Any implementation of this mitigation would conflict with the current
implementation which assumes that committed_value_digest
is written to only
once. A concrete proposal requires further exploration of the possibilities
which is outside of the scope of this report.
Ultimately, the code for the recursion program should be patched so that the
next_pc == 0
constraint (and the other related constraints for complete
programs) is applied by the CompressProver
in the first layer of the recursion
tree, just like it is in the other recursion programs.
As a hot fix which would not be a breaking change to currently accepted
verification keys, since the next_pc
value is accessible as part of a proof's
public values, the Rust code of the verifier should check that it is 0
, even
if this isn't constrained within the proof system. This hot fix would further
enable existing verifiers to check whether this bug was triggered by proofs that
are still in storage.