diff --git a/crates/l2/docs/README.md b/crates/l2/docs/README.md
index d92f09df69..d39a88343b 100644
--- a/crates/l2/docs/README.md
+++ b/crates/l2/docs/README.md
@@ -6,9 +6,10 @@ For a high level overview of the L2:
 
 For more detailed documentation on each part of the system:
 
+- [Contracts](./contracts.md)
+- [Execution program](./program.md)
 - [Proposer](./proposer.md)
 - [Prover](./prover.md)
-- [Contracts](./contracts.md)
 
 ## Configuration
 
diff --git a/crates/l2/docs/program.md b/crates/l2/docs/program.md
new file mode 100644
index 0000000000..35a92ba9fb
--- /dev/null
+++ b/crates/l2/docs/program.md
@@ -0,0 +1,43 @@
+# Prover's block execution program
+
+The zkVM block execution program will:
+1. Take as input:
+   - the block to verify and its parent's header
+   - the L2 initial state, stored in an `ExecutionDB` struct, including the nodes for state and storage [pruned tries](#pruned-tries)
+1. Build the initial state tries. This includes:
+   - verifying that the initial state values stored in the `ExecutionDB` are included in the tries.
+   - checking that the state trie root hash is the same as the one in the parent's header
+   - building the trie structures
+1. Execute the block
+1. Perform validations before and after execution
+1. Apply account updates to the tries and compute the new state root
+1. Check that the final state root is the same as the one stored in the block's header
+1. Commit the program's output
+
+## Public and private inputs
+The program interface defines the `ProgramInput` and `ProgramOutput` structures.
+
+`ProgramInput` contains:
+- the block to verify and its parent's header
+- an `ExecutionDB` which only holds the relevant initial state data for executing the block. This is built by pre-executing the block outside the zkVM to get the resulting account updates and retrieving the accounts and storage values touched by the execution.
+- the `ExecutionDB` will also include all the (encoded) nodes necessary to build [pruned tries](#pruned-tries) for the stored accounts and storage values.
+
+`ProgramOutput` contains:
+- the initial state hash
+- the final state hash
+These outputs will be committed as part of the proof. Both hashes are verified by the program: the initial hash is checked when building the initial tries (equivalent to verifying inclusion proofs), and the final hash is checked by applying the account updates (that resulted from the block's execution) to the tries and recomputing the state root.
+
+## Pruned Tries
+The EVM state is stored in Merkle Patricia Tries, which work differently than standard Merkle binary trees. In particular we have a *state trie* for each block, which contains all account states, and then for each account we have a *storage trie* that contains every storage value if the account in question corresponds to a deployed smart contract.
+
+We need a way to check the integrity of the account and storage values we pass as input to the block execution program. The "Merkle" in Merkle Patricia Tries means that we can cryptographically check inclusion of any value in a trie, and then use the trie's root to check the integrity of the whole data at once.
+
+In particular, the root node points to its child nodes by storing their hashes, and these also contain the hashes of *their* child nodes, and so on, until arriving at nodes that contain the values themselves. This means that the root contains the information of the whole trie (which can be compressed into a single 32-byte word by hashing the root), and by traversing down the trie we are checking nodes with more specific information until arriving at some value.
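The hash-linking idea described in the paragraph above can be sketched with a toy two-leaf tree. This is only an illustration under loud assumptions: `Node`, `toy_hash`, `root_hash` and `verify_left` are hypothetical names that do not exist in the codebase, and `DefaultHasher` stands in for keccak256 (it is not cryptographically secure). A real Merkle Patricia Trie also has branch, extension and leaf nodes that are RLP-encoded, which the sketch ignores.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy node (hypothetical): a leaf holds a value, a branch holds only its children's hashes.
#[derive(Hash)]
enum Node {
    Leaf(Vec<u8>),
    Branch(Vec<u64>),
}

/// Stand-in for keccak256. NOT cryptographically secure, used only to keep the sketch std-only.
fn toy_hash(node: &Node) -> u64 {
    let mut hasher = DefaultHasher::new();
    node.hash(&mut hasher);
    hasher.finish()
}

/// Root hash of a two-leaf tree: the root commits to both leaves through their hashes.
fn root_hash(left: &[u8], right: &[u8]) -> u64 {
    let left = toy_hash(&Node::Leaf(left.to_vec()));
    let right = toy_hash(&Node::Leaf(right.to_vec()));
    toy_hash(&Node::Branch(vec![left, right]))
}

/// Checks that `value` is the left leaf, knowing only the sibling's hash and the expected root.
/// The sibling's contents are never needed, which is what makes pruning possible.
fn verify_left(value: &[u8], sibling_hash: u64, expected_root: u64) -> bool {
    let leaf = toy_hash(&Node::Leaf(value.to_vec()));
    toy_hash(&Node::Branch(vec![leaf, sibling_hash])) == expected_root
}

fn main() {
    let root = root_hash(b"alice", b"bob");
    let sibling = toy_hash(&Node::Leaf(b"bob".to_vec()));
    // The genuine value reproduces the root; a tampered value does not.
    assert!(verify_left(b"alice", sibling, root));
    assert!(!verify_left(b"mallory", sibling, root));
    println!("inclusion checks passed");
}
```

The verifier only needs the value of interest, the sibling's hash and the trusted root, so everything else in the tree can be dropped without weakening the check.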
+
+So if we store only the necessary nodes that make up a path from the root to a particular value of interest (including both the root and the node holding the value), then:
+- we know the root hash of this trie
+- we know that this trie includes the value we're interested in
+Thereby, **we're storing a proof of inclusion of the value in a trie with some root hash we can check**, which is equivalent to having a "pruned trie" that only contains the path of interest, but still carries information about all the other, non-included nodes and paths (subtries), because nodes store their children's hashes as mentioned earlier. This way we can verify the inclusion of values in some state, and thus the validity of the initial state values in the `ExecutionDB`, because we know the correct root hash.
+
+We can mutate this pruned trie by modifying/removing some value or inserting a new one, and then recalculate all the hashes from the node we inserted/modified up to the root, finally computing the new root hash. Because we know the correct final state root hash, this way we can make sure that the execution led to the correct final state values.
+
diff --git a/crates/l2/proposer/prover_server.rs b/crates/l2/proposer/prover_server.rs
index f69b76ee26..21a2937fb0 100644
--- a/crates/l2/proposer/prover_server.rs
+++ b/crates/l2/proposer/prover_server.rs
@@ -25,9 +25,9 @@ use risc0_zkvm::sha::{Digest, Digestible};
 
 #[derive(Debug, Serialize, Deserialize, Default)]
 pub struct ProverInputData {
-    pub db: ExecutionDB,
     pub block: Block,
-    pub parent_header: BlockHeader,
+    pub parent_block_header: BlockHeader,
+    pub db: ExecutionDB,
 }
 
 use crate::utils::{
@@ -408,7 +408,7 @@ impl ProverServer {
 
         let db = ExecutionDB::from_exec(&block, &self.store).map_err(|err| err.to_string())?;
 
-        let parent_header = self
+        let parent_block_header = self
             .store
             .get_block_header_by_hash(block.header.parent_hash)
             .map_err(|err| err.to_string())?
@@ -419,7 +419,7 @@ impl ProverServer { Ok(ProverInputData { db, block, - parent_header, + parent_block_header, }) } diff --git a/crates/l2/prover/Makefile b/crates/l2/prover/Makefile index 468e14c7a1..280d07e1f2 100644 --- a/crates/l2/prover/Makefile +++ b/crates/l2/prover/Makefile @@ -1,5 +1,5 @@ RISC0_DEV_MODE?=1 -RUST_LOG?="debug" +RUST_LOG?="info" perf_test_proving: @echo "Using RISC0_DEV_MODE: ${RISC0_DEV_MODE}" RISC0_DEV_MODE=${RISC0_DEV_MODE} RUST_LOG=${RUST_LOG} cargo test --release --test perf_zkvm --features build_zkvm -- --show-output diff --git a/crates/l2/prover/src/prover.rs b/crates/l2/prover/src/prover.rs index 45328188d8..24ff09a259 100644 --- a/crates/l2/prover/src/prover.rs +++ b/crates/l2/prover/src/prover.rs @@ -1,34 +1,19 @@ -use serde::Deserialize; use tracing::info; // risc0 -use zkvm_interface::methods::{ZKVM_PROGRAM_ELF, ZKVM_PROGRAM_ID}; - -use risc0_zkvm::{default_prover, ExecutorEnv, ExecutorEnvBuilder, ProverOpts}; - -use ethrex_core::types::Receipt; -use ethrex_l2::{ - proposer::prover_server::ProverInputData, utils::config::prover_client::ProverClientConfig, +use zkvm_interface::{ + io::{ProgramInput, ProgramOutput}, + methods::{ZKVM_PROGRAM_ELF, ZKVM_PROGRAM_ID}, }; -use ethrex_rlp::encode::RLPEncode; -use ethrex_vm::execution_db::ExecutionDB; -// The order of variables in this structure should match the order in which they were -// committed in the zkVM, with each variable represented by a field. -#[derive(Debug, Deserialize)] -pub struct ProverOutputData { - /// It is rlp encoded, it has to be decoded. 
- /// Block::decode(&prover_output_data.block).unwrap()); - pub _block: Vec, - pub _execution_db: ExecutionDB, - pub _parent_block_header: Vec, - pub block_receipts: Vec, -} +use risc0_zkvm::{default_prover, ExecutorEnv, ProverOpts}; + +use ethrex_l2::utils::config::prover_client::ProverClientConfig; pub struct Prover<'a> { - env_builder: ExecutorEnvBuilder<'a>, elf: &'a [u8], pub id: [u32; 8], + pub stdout: Vec, } impl<'a> Default for Prover<'a> { @@ -41,29 +26,20 @@ impl<'a> Default for Prover<'a> { impl<'a> Prover<'a> { pub fn new() -> Self { Self { - env_builder: ExecutorEnv::builder(), elf: ZKVM_PROGRAM_ELF, id: ZKVM_PROGRAM_ID, + stdout: Vec::new(), } } - pub fn set_input(&mut self, input: ProverInputData) -> &mut Self { - let head_block_rlp = input.block.encode_to_vec(); - let parent_header_rlp = input.parent_header.encode_to_vec(); - - // We should pass the inputs as a whole struct - self.env_builder.write(&head_block_rlp).unwrap(); - self.env_builder.write(&input.db).unwrap(); - self.env_builder.write(&parent_header_rlp).unwrap(); - - self - } - - /// Example: - /// let prover = Prover::new(); - /// let proof = prover.set_input(inputs).prove().unwrap(); - pub fn prove(&mut self) -> Result> { - let env = self.env_builder.build()?; + pub fn prove( + &mut self, + input: ProgramInput, + ) -> Result> { + let env = ExecutorEnv::builder() + .stdout(&mut self.stdout) + .write(&input)? + .build()?; // Generate the Receipt let prover = default_prover(); @@ -72,7 +48,7 @@ impl<'a> Prover<'a> { // This struct contains the receipt along with statistics about execution of the guest let prove_info = prover.prove_with_opts(env, self.elf, &ProverOpts::groth16())?; - // extract the receipt. + // Extract the receipt. 
let receipt = prove_info.receipt; info!("Successfully generated execution receipt."); @@ -87,8 +63,7 @@ impl<'a> Prover<'a> { pub fn get_commitment( receipt: &risc0_zkvm::Receipt, - ) -> Result> { - let commitment: ProverOutputData = receipt.journal.decode()?; - Ok(commitment) + ) -> Result> { + Ok(receipt.journal.decode()?) } } diff --git a/crates/l2/prover/src/prover_client.rs b/crates/l2/prover/src/prover_client.rs index 8c297109cf..be1e0c6e75 100644 --- a/crates/l2/prover/src/prover_client.rs +++ b/crates/l2/prover/src/prover_client.rs @@ -7,9 +7,10 @@ use std::{ use tokio::time::sleep; use tracing::{debug, error, info, warn}; +use zkvm_interface::io::ProgramInput; + use ethrex_l2::{ - proposer::prover_server::{ProofData, ProverInputData}, - utils::config::prover_client::ProverClientConfig, + proposer::prover_server::ProofData, utils::config::prover_client::ProverClientConfig, }; use super::prover::Prover; @@ -38,7 +39,7 @@ impl ProverClient { loop { match self.request_new_input() { Ok((block_number, input)) => { - match prover.set_input(input).prove() { + match prover.prove(input) { Ok(proof) => { if let Err(e) = self.submit_proof(block_number, proof, prover.id.to_vec()) @@ -58,7 +59,7 @@ impl ProverClient { } } - fn request_new_input(&self) -> Result<(u64, ProverInputData), String> { + fn request_new_input(&self) -> Result<(u64, ProgramInput), String> { // Request the input with the correct block_number let request = ProofData::Request; let response = connect_to_prover_server_wr(&self.prover_server_endpoint, &request) @@ -71,7 +72,11 @@ impl ProverClient { } => match (block_number, input) { (Some(n), Some(i)) => { info!("Received Response for block_number: {n}"); - Ok((n, i)) + Ok((n, ProgramInput { + block: i.block, + parent_block_header: i.parent_block_header, + db: i.db + })) } _ => Err( "Received Empty Response, meaning that the ProverServer doesn't have blocks to prove.\nThe Prover may be advancing faster than the Proposer." 
diff --git a/crates/l2/prover/tests/perf_zkvm.rs b/crates/l2/prover/tests/perf_zkvm.rs index 27299a842b..558a1c0f88 100644 --- a/crates/l2/prover/tests/perf_zkvm.rs +++ b/crates/l2/prover/tests/perf_zkvm.rs @@ -2,16 +2,16 @@ use std::path::Path; use tracing::info; use ethrex_blockchain::add_block; -use ethrex_l2::proposer::prover_server::ProverInputData; use ethrex_prover_lib::prover::Prover; use ethrex_storage::{EngineType, Store}; use ethrex_vm::execution_db::ExecutionDB; +use zkvm_interface::io::ProgramInput; #[tokio::test] async fn test_performance_zkvm() { tracing_subscriber::fmt::init(); - let mut path = Path::new(concat!(env!("CARGO_MANIFEST_DIR"), "/../../../test_data")); + let path = Path::new(concat!(env!("CARGO_MANIFEST_DIR"), "/../../../test_data")); // Another use is genesis-execution-api.json in conjunction with chain.rlp(20 blocks not too loaded). let genesis_file_path = path.join("genesis-l2-old.json"); @@ -34,23 +34,22 @@ async fn test_performance_zkvm() { let db = ExecutionDB::from_exec(block_to_prove, &store).unwrap(); - let parent_header = store + let parent_block_header = store .get_block_header_by_hash(block_to_prove.header.parent_hash) .unwrap() .unwrap(); - let input = ProverInputData { - db, + let input = ProgramInput { block: block_to_prove.clone(), - parent_header, + parent_block_header, + db, }; let mut prover = Prover::new(); - prover.set_input(input); let start = std::time::Instant::now(); - let receipt = prover.prove().unwrap(); + let receipt = prover.prove(input).unwrap(); let duration = start.elapsed(); info!( @@ -62,12 +61,5 @@ async fn test_performance_zkvm() { prover.verify(&receipt).unwrap(); - let output = Prover::get_commitment(&receipt).unwrap(); - - let execution_cumulative_gas_used = output.block_receipts.last().unwrap().cumulative_gas_used; - info!("Cumulative Gas Used {execution_cumulative_gas_used}"); - - let gas_per_second = execution_cumulative_gas_used as f64 / duration.as_secs_f64(); - - info!("Gas per Second: {}", 
gas_per_second); + let _program_output = Prover::get_commitment(&receipt).unwrap(); } diff --git a/crates/l2/prover/zkvm/interface/Cargo.toml b/crates/l2/prover/zkvm/interface/Cargo.toml index 3e89bdf335..165de49966 100644 --- a/crates/l2/prover/zkvm/interface/Cargo.toml +++ b/crates/l2/prover/zkvm/interface/Cargo.toml @@ -4,17 +4,15 @@ version = "0.1.0" edition = "2021" [dependencies] -serde = { version = "1.0", default-features = false, features = ["derive"] } -thiserror = "1.0.64" +serde = { version = "1.0.203", features = ["derive"] } +serde_with = "3.11.0" +thiserror = "1.0.61" -ethrex-storage = { path = "../../../../storage/store" } - -# revm -revm = { version = "14.0.3", features = [ - "std", - "serde", - "kzg-rs", -], default-features = false } +ethrex-core = { path = "../../../../common/", default-features = false } +ethrex-vm = { path = "../../../../vm", default-features = false } +ethrex-rlp = { path = "../../../../common/rlp", default-features = false } +ethrex-storage = { path = "../../../../storage/store", default-features = false } +ethrex-trie = { path = "../../../../storage/trie", default-features = false } [build-dependencies] risc0-build = { version = "1.1.2" } diff --git a/crates/l2/prover/zkvm/interface/guest/Cargo.toml b/crates/l2/prover/zkvm/interface/guest/Cargo.toml index 19a0c440ab..86d1c2c8f6 100644 --- a/crates/l2/prover/zkvm/interface/guest/Cargo.toml +++ b/crates/l2/prover/zkvm/interface/guest/Cargo.toml @@ -7,10 +7,13 @@ edition = "2021" [dependencies] risc0-zkvm = { version = "1.1.2", default-features = false, features = ["std"] } +zkvm_interface = { path = "../" } ethrex-core = { path = "../../../../../common", default-features = false } ethrex-rlp = { path = "../../../../../common/rlp" } -ethrex-vm = { path = "../../../../../vm", default-features = false } +ethrex-vm = { path = "../../../../../vm", default-features = false, features = [ + "l2", +] } ethrex-blockchain = { path = "../../../../../blockchain", default-features = false 
} [build-dependencies] diff --git a/crates/l2/prover/zkvm/interface/guest/src/main.rs b/crates/l2/prover/zkvm/interface/guest/src/main.rs index 425729ff7f..cad02e40ee 100644 --- a/crates/l2/prover/zkvm/interface/guest/src/main.rs +++ b/crates/l2/prover/zkvm/interface/guest/src/main.rs @@ -1,49 +1,49 @@ -use ethrex_rlp::{decode::RLPDecode, encode::RLPEncode, error::RLPDecodeError}; use risc0_zkvm::guest::env; use ethrex_blockchain::{validate_block, validate_gas_used}; -use ethrex_core::types::{Block, BlockHeader}; -use ethrex_vm::{execute_block, execution_db::ExecutionDB, get_state_transitions, EvmState}; +use ethrex_vm::{execute_block, get_state_transitions, EvmState}; +use zkvm_interface::{ + io::{ProgramInput, ProgramOutput}, + trie::update_tries, +}; fn main() { - let (block, execution_db, parent_header) = read_inputs().expect("failed to read inputs"); - let mut state = EvmState::from(execution_db.clone()); + let ProgramInput { + block, + parent_block_header, + db, + } = env::read(); + let mut state = EvmState::from(db.clone()); // Validate the block pre-execution - validate_block(&block, &parent_header, &state).expect("invalid block"); + validate_block(&block, &parent_block_header, &state).expect("invalid block"); // Validate the initial state - if !execution_db - .verify_initial_state(parent_header.state_root) - .expect("failed to verify initial state") - { - panic!("initial state is not valid"); - }; + let (mut state_trie, mut storage_tries) = db + .build_tries() + .expect("failed to build state and storage tries or state is not valid"); - let receipts = execute_block(&block, &mut state).unwrap(); - - env::commit(&receipts); + let initial_state_hash = state_trie.hash_no_commit(); + if initial_state_hash != parent_block_header.state_root { + panic!("invalid initial state trie"); + } + let receipts = execute_block(&block, &mut state).expect("failed to execute block"); validate_gas_used(&receipts, &block.header).expect("invalid gas used"); - let _account_updates 
= get_state_transitions(&mut state); - - // TODO: compute new state root from account updates and check it matches with the block's - // header one. -} - -fn read_inputs() -> Result<(Block, ExecutionDB, BlockHeader), RLPDecodeError> { - let head_block_bytes = env::read::>(); - let execution_db = env::read::(); - let parent_header_bytes = env::read::>(); + let account_updates = get_state_transitions(&mut state); - let block = Block::decode(&head_block_bytes)?; - let parent_header = BlockHeader::decode(&parent_header_bytes)?; + // Update tries and calculate final state root hash + update_tries(&mut state_trie, &mut storage_tries, &account_updates) + .expect("failed to update state and storage tries"); + let final_state_hash = state_trie.hash_no_commit(); - // make inputs public - env::commit(&block.encode_to_vec()); - env::commit(&execution_db); - env::commit(&parent_header.encode_to_vec()); + if final_state_hash != block.header.state_root { + panic!("invalid final state trie"); + } - Ok((block, execution_db, parent_header)) + env::commit(&ProgramOutput { + initial_state_hash, + final_state_hash, + }); } diff --git a/crates/l2/prover/zkvm/interface/src/lib.rs b/crates/l2/prover/zkvm/interface/src/lib.rs index ddec54513f..9e5b0a5ffb 100644 --- a/crates/l2/prover/zkvm/interface/src/lib.rs +++ b/crates/l2/prover/zkvm/interface/src/lib.rs @@ -7,3 +7,127 @@ pub mod methods { #[cfg(all(not(clippy), feature = "build_zkvm"))] include!(concat!(env!("OUT_DIR"), "/methods.rs")); } + +pub mod io { + use ethrex_core::{ + types::{Block, BlockHeader}, + H256, + }; + use ethrex_rlp::{decode::RLPDecode, encode::RLPEncode}; + use ethrex_vm::execution_db::ExecutionDB; + use serde::{Deserialize, Serialize}; + use serde_with::{serde_as, DeserializeAs, SerializeAs}; + + /// Private input variables passed into the zkVM execution program. 
+ #[serde_as] + #[derive(Serialize, Deserialize)] + pub struct ProgramInput { + /// block to execute + #[serde_as(as = "RLPBlock")] + pub block: Block, + /// header of the previous block + pub parent_block_header: BlockHeader, + /// database containing only the data necessary to execute + pub db: ExecutionDB, + } + + /// Public output variables exposed by the zkVM execution program. Some of these are part of + /// the program input. + #[derive(Serialize, Deserialize)] + pub struct ProgramOutput { + /// initial state trie root hash + pub initial_state_hash: H256, + /// final state trie root hash + pub final_state_hash: H256, + } + + /// Used with [serde_with] to encode a Block into RLP before serializing its bytes. This is + /// necessary because the [ethrex_core::types::Transaction] type doesn't serializes into any + /// format other than JSON. + pub struct RLPBlock; + + impl SerializeAs for RLPBlock { + fn serialize_as(val: &Block, serializer: S) -> Result + where + S: serde::Serializer, + { + let mut encoded = Vec::new(); + val.encode(&mut encoded); + serde_with::Bytes::serialize_as(&encoded, serializer) + } + } + + impl<'de> DeserializeAs<'de, Block> for RLPBlock { + fn deserialize_as(deserializer: D) -> Result + where + D: serde::Deserializer<'de>, + { + let encoded: Vec = serde_with::Bytes::deserialize_as(deserializer)?; + Block::decode(&encoded).map_err(serde::de::Error::custom) + } + } +} + +pub mod trie { + use std::collections::HashMap; + + use ethrex_core::{types::AccountState, H160}; + use ethrex_rlp::{decode::RLPDecode, encode::RLPEncode, error::RLPDecodeError}; + use ethrex_storage::{hash_address, hash_key, AccountUpdate}; + use ethrex_trie::{Trie, TrieError}; + use thiserror::Error; + + #[derive(Debug, Error)] + pub enum Error { + #[error(transparent)] + TrieError(#[from] TrieError), + #[error(transparent)] + RLPDecode(#[from] RLPDecodeError), + #[error("Missing storage trie for address {0}")] + StorageNotFound(H160), + } + + pub fn update_tries( + 
state_trie: &mut Trie, + storage_tries: &mut HashMap, + account_updates: &[AccountUpdate], + ) -> Result<(), Error> { + for update in account_updates.iter() { + let hashed_address = hash_address(&update.address); + if update.removed { + // Remove account from trie + state_trie.remove(hashed_address)?; + } else { + // Add or update AccountState in the trie + // Fetch current state or create a new state to be inserted + let mut account_state = match state_trie.get(&hashed_address)? { + Some(encoded_state) => AccountState::decode(&encoded_state)?, + None => AccountState::default(), + }; + if let Some(info) = &update.info { + account_state.nonce = info.nonce; + account_state.balance = info.balance; + account_state.code_hash = info.code_hash; + } + // Store the added storage in the account's storage trie and compute its new root + if !update.added_storage.is_empty() { + let storage_trie = storage_tries + .get_mut(&update.address) + .ok_or(Error::StorageNotFound(update.address))?; + for (storage_key, storage_value) in &update.added_storage { + let hashed_key = hash_key(storage_key); + if storage_value.is_zero() { + storage_trie.remove(hashed_key)?; + } else { + storage_trie.insert(hashed_key, storage_value.encode_to_vec())?; + } + } + account_state.storage_root = storage_trie.hash_no_commit(); + } + state_trie.insert(hashed_address, account_state.encode_to_vec())?; + println!("inserted new state"); + } + } + Ok(()) + } +} diff --git a/crates/storage/trie/db.rs b/crates/storage/trie/db.rs index e2f5249de9..c124ff8484 100644 --- a/crates/storage/trie/db.rs +++ b/crates/storage/trie/db.rs @@ -3,6 +3,7 @@ pub mod in_memory; pub mod libmdbx; #[cfg(feature = "libmdbx")] pub mod libmdbx_dupsort; +pub mod null; use crate::error::TrieError; diff --git a/crates/storage/trie/db/null.rs b/crates/storage/trie/db/null.rs new file mode 100644 index 0000000000..69df1a52dd --- /dev/null +++ b/crates/storage/trie/db/null.rs @@ -0,0 +1,15 @@ +use super::TrieDB; +use 
crate::error::TrieError; + +/// Used for small/pruned tries that don't have a database and just cache their nodes. +pub struct NullTrieDB; + +impl TrieDB for NullTrieDB { + fn get(&self, _key: Vec) -> Result>, TrieError> { + Ok(None) + } + + fn put(&self, _key: Vec, _value: Vec) -> Result<(), TrieError> { + Ok(()) + } +} diff --git a/crates/storage/trie/trie.rs b/crates/storage/trie/trie.rs index 389bb5e7a5..af1e610e87 100644 --- a/crates/storage/trie/trie.rs +++ b/crates/storage/trie/trie.rs @@ -8,6 +8,9 @@ mod state; #[cfg(test)] mod test_utils; +use std::collections::HashSet; + +use db::null::NullTrieDB; mod trie_iter; mod verify_range; use ethereum_types::H256; @@ -40,8 +43,10 @@ lazy_static! { /// RLP-encoded trie path pub type PathRLP = Vec; -// RLP-encoded trie value +/// RLP-encoded trie value pub type ValueRLP = Vec; +/// RLP-encoded trie node +pub type NodeRLP = Vec; /// Libmdx-based Ethereum Compatible Merkle Patricia Trie pub struct Trie { @@ -136,10 +141,19 @@ impl Trie { .unwrap_or(*EMPTY_TRIE_HASH)) } + /// Return the hash of the trie's root node. + /// Returns keccak(RLP_NULL) if the trie is empty + pub fn hash_no_commit(&self) -> H256 { + self.root + .as_ref() + .map(|root| root.clone().finalize()) + .unwrap_or(*EMPTY_TRIE_HASH) + } + /// Obtain a merkle proof for the given path. /// The proof will contain all the encoded nodes traversed until reaching the node where the path is stored (including this last node). /// The proof will still be constructed even if the path is not stored in the trie, proving its absence. 
- pub fn get_proof(&self, path: &PathRLP) -> Result>, TrieError> { + pub fn get_proof(&self, path: &PathRLP) -> Result, TrieError> { // Will store all the encoded nodes traversed until reaching the node containing the path let mut node_path = Vec::new(); let Some(root) = &self.root else { @@ -155,48 +169,56 @@ impl Trie { Ok(node_path) } - pub fn verify_proof( - _proof: &[Vec], - _root_hash: NodeHash, - _path: &PathRLP, - _value: &ValueRLP, - ) -> Result { - // This is a mockup function for verifying proof of inclusions. This function will be - // possible to implement after refactoring the current Trie implementation. + /// Obtains all encoded nodes traversed until reaching the node where every path is stored. + /// The list doesn't include the root node, this is returned separately. + /// Will still be constructed even if some path is not stored in the trie. + pub fn get_proofs( + &self, + paths: &[PathRLP], + ) -> Result<(Option, Vec), TrieError> { + let Some(root_node) = self + .root + .as_ref() + .map(|root| self.state.get_node(root.clone())) + .transpose()? + .flatten() + else { + return Ok((None, Vec::new())); + }; - // We'll build a trie from the proof nodes and check whether: - // 1. the trie root hash is the one we expect - // 2. the trie contains the (key, value) pair to verify + let mut node_path = Vec::new(); + for path in paths { + let mut nodes = self.get_proof(path)?; + nodes.swap_remove(0); + node_path.extend(nodes); // skip root node + } - // We will only be using the trie's cache so we don't need a working DB + // dedup + // TODO: really inefficient, by making the traversing smarter we can avoid having + // duplicates + let node_path: HashSet<_> = node_path.drain(..).collect(); + let node_path = Vec::from_iter(node_path); + Ok((Some(root_node.encode_raw()), node_path)) + } + + /// Creates a cached Trie (with [NullTrieDB]) from a list of encoded nodes. + /// Generally used in conjuction with [Trie::get_proofs]. 
+ pub fn from_nodes( + root_node: Option<&NodeRLP>, + other_nodes: &[NodeRLP], + ) -> Result { + let mut trie = Trie::new(Box::new(NullTrieDB)); + + if let Some(root_node) = root_node { + let root_node = Node::decode_raw(root_node)?; + trie.root = Some(root_node.insert_self(&mut trie.state)?); + } + + for node in other_nodes.iter().map(|node| Node::decode_raw(node)) { + node?.insert_self(&mut trie.state)?; + } - // let mut trie = Trie::stateless(); - - // Insert root into trie - // let mut proof = proof.into_iter(); - // let root_node = proof.next(); - // trie.root = Some(root_node.insert_self(path_offset, &mut trie.state)?); - - // Insert rest of nodes - // for node in proof { - // node.insert_self(path_offset, &mut trie.state)?; - // } - // let expected_root_hash = trie.hash_no_commit()?.into(); - - // Check key exists - // let Some(retrieved_value) = trie.get(path)? else { - // return Ok(false); - // }; - // // Check value is correct - // if retrieved_value != *value { - // return Ok(false); - // } - // // Check root hash - // if root_hash != expected_root_hash { - // return Ok(false); - // } - - Ok(true) + Ok(trie) } /// Builds an in-memory trie from the given elements and returns its hash @@ -1108,20 +1130,4 @@ mod test { let trie_proof = trie.get_proof(&a).unwrap(); assert_eq!(cita_proof, trie_proof); } - - #[test] - fn verify_proof_one_leaf() { - let mut trie = Trie::new_temp(); - trie.insert(b"duck".to_vec(), b"duckling".to_vec()).unwrap(); - - let root_hash = trie.hash().unwrap().into(); - let trie_proof = trie.get_proof(&b"duck".to_vec()).unwrap(); - assert!(Trie::verify_proof( - &trie_proof, - root_hash, - &b"duck".to_vec(), - &b"duckling".to_vec(), - ) - .unwrap()); - } } diff --git a/crates/vm/errors.rs b/crates/vm/errors.rs index a2635b5f6b..4fbd723edb 100644 --- a/crates/vm/errors.rs +++ b/crates/vm/errors.rs @@ -1,4 +1,4 @@ -use ethereum_types::H160; +use ethereum_types::{H160, H256}; use ethrex_core::types::BlockHash; use 
ethrex_storage::error::StoreError; use ethrex_trie::TrieError; @@ -37,8 +37,10 @@ pub enum ExecutionDBError { AccountNotFound(RevmAddress), #[error("Code by hash {0} not found")] CodeNotFound(RevmB256), - #[error("Storage value for address {0} and slot {1} not found")] - StorageNotFound(RevmAddress, RevmU256), + #[error("Storage for address {0} not found")] + StorageNotFound(RevmAddress), + #[error("Storage value for address {0} and key {1} not found")] + StorageValueNotFound(RevmAddress, RevmU256), #[error("Hash of block with number {0} not found")] BlockHashNotFound(u64), #[error("Missing account {0} info while trying to create ExecutionDB")] @@ -48,7 +50,17 @@ pub enum ExecutionDBError { #[error( "Missing storage trie of block {0} and address {1} while trying to create ExecutionDB" )] - NewMissingStorageTrie(BlockHash, RevmAddress), + NewMissingStorageTrie(BlockHash, H160), + #[error("The account {0} is not included in the stored pruned state trie")] + MissingAccountInStateTrie(H160), + #[error("Missing storage trie of account {0}")] + MissingStorageTrie(H160), + #[error("Storage trie root for account {0} does not match account storage root")] + InvalidStorageTrieRoot(H160), + #[error("The pruned storage trie of account {0} is missing the storage key {1}")] + MissingKeyInStorageTrie(H160, H256), + #[error("Storage trie value for account {0} and key {1} does not match value stored in db")] + InvalidStorageTrieValue(H160, H256), #[error("{0}")] Custom(String), } diff --git a/crates/vm/execution_db.rs b/crates/vm/execution_db.rs index af859d73eb..68c1acc63f 100644 --- a/crates/vm/execution_db.rs +++ b/crates/vm/execution_db.rs @@ -1,13 +1,13 @@ use std::collections::HashMap; -use ethereum_types::{Address, H160, U256}; +use ethereum_types::H160; use ethrex_core::{ types::{AccountState, Block, ChainConfig}, H256, }; use ethrex_rlp::encode::RLPEncode; use ethrex_storage::{hash_address, hash_key, Store}; -use ethrex_trie::Trie; +use ethrex_trie::{NodeRLP, Trie}; use 
revm::{ primitives::{ AccountInfo as RevmAccountInfo, Address as RevmAddress, Bytecode as RevmBytecode, @@ -17,10 +17,7 @@ use revm::{ }; use serde::{Deserialize, Serialize}; -use crate::{ - errors::{ExecutionDBError, StateProofsError}, - evm_state, execute_block, get_state_transitions, -}; +use crate::{errors::ExecutionDBError, evm_state, execute_block, get_state_transitions}; /// In-memory EVM database for caching execution data. /// @@ -38,17 +35,12 @@ pub struct ExecutionDB { pub block_hashes: HashMap, /// stored chain config pub chain_config: ChainConfig, - /// proofs of inclusion of account and storage values of the initial state - pub initial_proofs: StateProofs, -} - -/// Merkle proofs of inclusion of state values. -/// -/// Contains Merkle proofs to verfy the inclusion of values in the state and storage tries. -#[derive(Debug, Clone, Serialize, Deserialize, Default)] -pub struct StateProofs { - account: HashMap>>, - storage: HashMap>>>, + /// encoded nodes to reconstruct a state trie, but only including relevant data (pruned). + /// root node is stored separately from the rest. + pub pruned_state_trie: (Option, Vec), + /// encoded nodes to reconstruct every storage trie, but only including relevant data (pruned) + /// root nodes are stored separately from the rest. + pub pruned_storage_tries: HashMap, Vec)>, } impl ExecutionDB { @@ -103,33 +95,31 @@ impl ExecutionDB { ); } - // Compute Merkle proofs for the initial state values - let initial_state_trie = store.state_trie(block.header.parent_hash)?.ok_or( + // Get pruned state and storage tries. For this we get the "state" (all relevant nodes) of every trie. + // "Pruned" because we're only getting the nodes that make paths to the relevant + // key-values. 
+ let state_trie = store.state_trie(block.header.parent_hash)?.ok_or( ExecutionDBError::NewMissingStateTrie(block.header.parent_hash), )?; - let initial_storage_tries = accounts - .keys() - .map(|address| { - Ok(( - H160::from_slice(address.as_slice()), - store - .storage_trie( - block.header.parent_hash, - H160::from_slice(address.as_slice()), - )? - .ok_or(ExecutionDBError::NewMissingStorageTrie( - block.header.parent_hash, - *address, - ))?, - )) - }) - .collect::, ExecutionDBError>>()?; - let initial_proofs = StateProofs::new( - &initial_state_trie, - &initial_storage_tries, - &address_storage_keys, - )?; + // Get pruned state trie + let state_paths: Vec<_> = address_storage_keys.keys().map(hash_address).collect(); + let pruned_state_trie = state_trie.get_proofs(&state_paths)?; + + // Get pruned storage tries for every account + let mut pruned_storage_tries = HashMap::new(); + for (address, keys) in address_storage_keys { + let storage_trie = store + .storage_trie(block.header.parent_hash, address)? + .ok_or(ExecutionDBError::NewMissingStorageTrie( + block.header.parent_hash, + address, + ))?; + let storage_paths: Vec<_> = keys.iter().map(hash_key).collect(); + let (storage_trie_root, storage_trie_nodes) = + storage_trie.get_proofs(&storage_paths)?; + pruned_storage_tries.insert(address, (storage_trie_root, storage_trie_nodes)); + } Ok(Self { accounts, @@ -137,7 +127,8 @@ impl ExecutionDB { storage, block_hashes, chain_config, - initial_proofs, + pruned_state_trie, + pruned_storage_tries, }) } @@ -145,101 +136,52 @@ impl ExecutionDB { self.chain_config } - /// Verifies that [self] holds the initial state (prior to block execution) with some root - /// hash. 
-    pub fn verify_initial_state(&self, state_root: H256) -> Result {
-        self.verify_state_proofs(state_root, &self.initial_proofs)
-    }
-
-    fn verify_state_proofs(
-        &self,
-        state_root: H256,
-        proofs: &StateProofs,
-    ) -> Result {
-        proofs.verify(state_root, &self.accounts, &self.storage)
-    }
-}
-
-impl StateProofs {
-    fn new(
-        state_trie: &Trie,
-        storage_tries: &HashMap,
-        address_storage_keys: &HashMap>,
-    ) -> Result {
-        let mut account = HashMap::default();
-        let mut storage = HashMap::default();
+    /// Verifies that all data in [self] is included in the stored tries, and then builds the
+    /// pruned tries from the stored nodes.
+    pub fn build_tries(&self) -> Result<(Trie, HashMap), ExecutionDBError> {
+        let (state_trie_root, state_trie_nodes) = &self.pruned_state_trie;
+        let state_trie = Trie::from_nodes(state_trie_root.as_ref(), state_trie_nodes)?;
+        let mut storage_tries = HashMap::new();

-        for (address, storage_keys) in address_storage_keys {
-            let storage_trie = storage_tries
-                .get(address)
-                .ok_or(StateProofsError::StorageTrieNotFound(*address))?;
+        for (revm_address, account) in &self.accounts {
+            let address = H160::from_slice(revm_address.as_slice());

-            let proof = state_trie.get_proof(&hash_address(address))?;
-            let address = RevmAddress::from_slice(address.as_bytes());
-            account.insert(address, proof);
-
-            let mut storage_proofs = HashMap::new();
-            for key in storage_keys {
-                let proof = storage_trie.get_proof(&hash_key(key))?;
-                let key = RevmU256::from_be_bytes(key.to_fixed_bytes());
-                storage_proofs.insert(key, proof);
+            // check account is in state trie
+            if state_trie.get(&hash_address(&address))?.is_none() {
+                return Err(ExecutionDBError::MissingAccountInStateTrie(address));
             }
-            storage.insert(address, storage_proofs);
-        }
-
-        Ok(Self { account, storage })
-    }
-
-    fn verify(
-        &self,
-        state_root: H256,
-        accounts: &HashMap,
-        storages: &HashMap>,
-    ) -> Result {
-        // Check accounts inclusion in the state trie
-        for (address, account) in accounts {
-            let proof = self
-                .account
-                .get(address)
-                .ok_or(StateProofsError::AccountProofNotFound(*address))?;
-            let hashed_address = hash_address(&H160::from_slice(address.as_slice()));
-            let mut encoded_account = Vec::new();
-            account.encode(&mut encoded_account);
+            let (storage_trie_root, storage_trie_nodes) =
+                self.pruned_storage_tries
+                    .get(&address)
+                    .ok_or(ExecutionDBError::MissingStorageTrie(address))?;

-            if !Trie::verify_proof(proof, state_root.into(), &hashed_address, &encoded_account)? {
-                return Ok(false);
+            // compare account storage root with storage trie root
+            let storage_trie = Trie::from_nodes(storage_trie_root.as_ref(), storage_trie_nodes)?;
+            if storage_trie.hash_no_commit() != account.storage_root {
+                return Err(ExecutionDBError::InvalidStorageTrieRoot(address));
             }
-        }
-        // so all account storage roots are valid at this point
-
-        // Check storage values inclusion in storage tries
-        for (address, storage) in storages {
-            let storage_root = accounts
-                .get(address)
-                .map(|account| account.storage_root)
-                .ok_or(StateProofsError::StorageNotFound(*address))?;
-            let storage_proofs = self
+            // check all storage keys are in storage trie and compare values
+            let storage = self
                 .storage
-                .get(address)
-                .ok_or(StateProofsError::StorageProofsNotFound(*address))?;
-
+                .get(revm_address)
+                .ok_or(ExecutionDBError::StorageNotFound(*revm_address))?;
             for (key, value) in storage {
-                let proof = storage_proofs
-                    .get(key)
-                    .ok_or(StateProofsError::StorageProofNotFound(*address, *key))?;
-
-                let hashed_key = hash_key(&H256::from_slice(&key.to_be_bytes_vec()));
-                let encoded_value = U256::from_big_endian(&value.to_be_bytes_vec()).encode_to_vec();
-
-                if !Trie::verify_proof(proof, storage_root.into(), &hashed_key, &encoded_value)? {
-                    return Ok(false);
+                let key = H256::from_slice(&key.to_be_bytes_vec());
+                let value = H256::from_slice(&value.to_be_bytes_vec());
+                let retrieved_value = storage_trie
+                    .get(&hash_key(&key))?
+                    .ok_or(ExecutionDBError::MissingKeyInStorageTrie(address, key))?;
+                if value.encode_to_vec() != retrieved_value {
+                    return Err(ExecutionDBError::InvalidStorageTrieValue(address, key));
                 }
             }
+
+            storage_tries.insert(address, storage_trie);
         }

-        Ok(true)
+        Ok((state_trie, storage_tries))
     }
 }

@@ -280,7 +222,7 @@ impl DatabaseRef for ExecutionDB {
             .ok_or(ExecutionDBError::AccountNotFound(address))?
             .get(&index)
             .cloned()
-            .ok_or(ExecutionDBError::StorageNotFound(address, index))
+            .ok_or(ExecutionDBError::StorageValueNotFound(address, index))
     }

     /// Get block hash by block number.
diff --git a/crates/vm/vm.rs b/crates/vm/vm.rs
index 89d9e24fe5..f17af79ce3 100644
--- a/crates/vm/vm.rs
+++ b/crates/vm/vm.rs
@@ -94,9 +94,10 @@ cfg_if::cfg_if! {
             state: &mut EvmState,
         ) -> Result<(Vec, Vec), EvmError> {
             let block_header = &block.header;
-            let spec_id = spec_id(&state.chain_config()?, block_header.timestamp);
             //eip 4788: execute beacon_root_contract_call before block transactions
+            #[cfg(not(feature = "l2"))]
             if block_header.parent_beacon_block_root.is_some() && spec_id == SpecId::CANCUN {
+                let spec_id = spec_id(&state.chain_config()?, block_header.timestamp);
                 beacon_root_contract_call(state, block_header, spec_id)?;
             }
             let mut receipts = Vec::new();
@@ -198,8 +199,13 @@ cfg_if::cfg_if! {
             let block_header = &block.header;
             let spec_id = spec_id(&state.chain_config()?, block_header.timestamp);
             //eip 4788: execute beacon_root_contract_call before block transactions
-            if block_header.parent_beacon_block_root.is_some() && spec_id == SpecId::CANCUN {
-                beacon_root_contract_call(state, block_header, spec_id)?;
+            cfg_if::cfg_if! {
+                if #[cfg(not(feature = "l2"))] {
+                    //eip 4788: execute beacon_root_contract_call before block transactions
+                    if block_header.parent_beacon_block_root.is_some() && spec_id == SpecId::CANCUN {
+                        beacon_root_contract_call(state, block_header, spec_id)?;
+                    }
+                }
             }
             let mut receipts = Vec::new();
             let mut cumulative_gas_used = 0;