Skip to content
This repository has been archived by the owner on Jul 5, 2024. It is now read-only.

MPT initial #560

Closed
wants to merge 18 commits into from
Closed

MPT initial #560

wants to merge 18 commits into from

Conversation

miha-stopar
Copy link
Collaborator

@miha-stopar miha-stopar commented Jun 7, 2022

MPT (from mpt2 branch) reduced to only have two chips (to simplify the reviewing process):

  • branch hash in parent (checking hash of the branch is at the proper position in the parent branch)
  • branch RLC (checking the branch RLC is computed properly - used for a keccak lookup for the first point)

Tests are included also.

@adria0
Copy link
Member

adria0 commented Jun 7, 2022

@miha-stopar WoW!! 🌟

Since this is a really huge PR would you like to provide some kind of document to help and guide the reviewers?

@miha-stopar
Copy link
Collaborator Author

@miha-stopar WoW!! 🌟

Since this is a really huge PR would you like to provide some kind of document to help and guide the reviewers?

Oh no, that's the small one :). The purpose of this PR is to make mpt circuit easier to review. We reduced yesterday with @ChihChengLiang the original PR to only have two simple chips, so that the review process can start. But it will be reduced further and I agree - I will add more comments to the code to explain the internals. Otherwise, some initial docs are here.

@adria0
Copy link
Member

adria0 commented Jun 8, 2022

Oh no, that's the small one :)

LoL ;-)

@ChihChengLiang
Copy link
Collaborator

I think it'll improve more after another Clippy round and a coding style review. Then it would be easier to do a review on the content part.

Copy link
Collaborator

@ChihChengLiang ChihChengLiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some quick comments

mpt/src/mpt.rs Outdated
q_not_first,
account_leaf.is_account_leaf_in_added_branch,
is_last_branch_child,
s_advices[IS_BRANCH_C_PLACEHOLDER_POS - LAYOUT_OFFSET],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be c_advices??

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s_advices is ok. s_advices and c_advices in branch children rows are used for children hashes (that's why 32 columns). However, there is a special row before branch children rows, named branch_init. This row contains some RLP specific data, but also some other information, for example which of the children is being changed (going from s to c) and whether the branch is a regular branch or extension node. This last is used in the code above. All this additional info is stored in s_advices simply because there is enough space there and it is something that is the same for s and c.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Thanks!

let three_rlp_bytes_c = meta.query_advice(s_main.bytes[1], Rotation::cur());

let one = Expression::Constant(F::one());
constraints.push((
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use helpers::get_bool_constraint here?

mpt/src/param.rs Outdated
@@ -0,0 +1,58 @@
// Currently using 32 - each hash byte goes into its own cell, this might be
// compressed for optimization purposes in the future.
pub const HASH_WIDTH: usize = 32;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing documentation for these constants

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in 9c0c058.

}

#[derive(Default)]
struct ProofVariables<F> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Undocumented constants

@adria0
Copy link
Member

adria0 commented Jul 15, 2022

@miha-stopar, added a bunch of comments for the initial review :)

);
let c_root_rlc = bytes_into_rlc(
&row[l
- 3 * HASH_WIDTH
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that these constants should be hidden and kept defined or enumerated.

As an excercise I tryed to create a new struct called MptWitness and I think that makes code easier to read

#[derive(Eq, PartialEq, IntoPrimitive, TryFromPrimitive)]
#[repr(u8)]
pub enum MptWitnessRowType {
    InitBranch = 0,
    BranchChild = 1,
    StorageLeafSKey = 2,
    StorageLeafCKey = 3,
    HashToBeComputed = 5,
    AccountLeafKeyS = 6,
    AccountLeafKeyC = 4,
    AccountLeafNonceBalanceS = 7,
    AccountLeafNonceBalanceC = 8,
    AccountLeafRootCodehashS = 9,
    AccountLeafNeighbouringLeaf = 10,
    AccountLeafRootCodehashC = 11,
    StorageLeafSValue = 13,
    StorateLeafCValue = 14,
    NeighbouringStorageLeaf = 15, 
    ExtensionNodeS = 16,
    ExtensionNodeC = 17,
    NonExistingProof = 18
}

pub struct MptWitnessRow(pub Vec<u8>);
impl MptWitnessRow {
    pub fn get_type(&self) -> MptWitnessRowType {
        MptWitnessRowType::try_from(self.row(1)).unwrap()
    }
    fn row(&self, rev_index: usize) -> u8 {
        self.0[self.0.len() - rev_index]
    }
    pub fn not_first_level(&self) -> u8 {
        self.row(NOT_FIRST_LEVEL_POS)
    }

    pub fn is_storage_mod(&self) -> u8 {
        self.row(IS_STORAGE_MOD_POS)
    }
    ...
    pub fn s_root_bytes(&self) -> &[u8] { 
                            &self.0[self.0.len()
                                - 4 * HASH_WIDTH
                                - COUNTER_MptWitness_LEN
                                - IS_NON_EXISTING_ACCOUNT_POS
                                .. self.0.len() - 4 * HASH_WIDTH
                                    - COUNTER_MptWitness_LEN
                                    - IS_NON_EXISTING_ACCOUNT_POS
                                    + HASH_WIDTH]
    }
}

impl Index<usize> for MptWitnessRow {
    type Output = u8;
    fn index<'a>(&'a self, i: usize) -> &'a u8 {
        &self.0[i]
    }
}

pub struct MptWitness(pub Vec<MptWitnessRow>);
impl MptWitness {
    pub fn from(raw : &[Vec<u8>]) -> Self {
        let rows = raw.iter().map(|row| MptWitnessRow(row.clone())).collect();
        MptWitness(rows)
    }
}

impl Index<usize> for MptWitness {
    type Output = MptWitnessRow;
    fn index<'a>(&'a self, i: usize) -> &'a MptWitnessRow {
        &self.0[i]
    }
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!! Implemented in the main PR.

// Note: filter out rows that are just to be hashed. We cannot simply use ind instead of offset
// because hashing rows appear in the middle of other rows.
for (ind, row) in witness.iter().filter(|r| r[r.len() - 1] != 5).enumerate() {
if offset > 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we check here that offset > 0 , but, since the operation to be done is clearing pv, and in offset==0 is already cleared, this condition seems irrelevant

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed because there is no previous row when offset = 0.

)?;

pv.node_index += 1;
} else if row[row.len() - 1] != 5 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previously you filter the rows with witness.iter().filter(|r| r[r.len() - 1] != 5).enumerate() , so this condition it will be always true?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, removed. 👍

// Branch (length 340) with three bytes of RLP meta data
// [249,1,81,128,16,...

pv.acc_s = F::from(row[BRANCH_0_S_START] as u64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that all those row[BRANCH_0_S_start + x] can be removed if your create some variables over them like

let s_len = [0, 1, 2].map(|i| row[BRANCH_0_S_START + i] as u64);, then you can use s_len[0], s_len[1], s_len[2]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, added in db1a700

Ok(())
}

fn assign_branch_init(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO is better to put all the specific code that relates to branch_init here, it's easy to pass here a pv: &mut ProofVariables

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree :), I planned to refactor this when adding branch assignment in branch related files, but this the refactor you proposed now already in bd867f5

Comment on lines +1309 to +1310
self.load_keccak_table(_layouter, to_be_hashed).ok();
self.load_fixed_table(_layouter).ok();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.load_keccak_table(_layouter, to_be_hashed).ok();
self.load_fixed_table(_layouter).ok();
self.load_keccak_table(_layouter, to_be_hashed)?;
self.load_fixed_table(_layouter)?;


// BranchRLCChip verifies the random linear combination for the branch which is
// then used to check the hash of a branch.
impl<F: Field> BranchRLCConfig<F> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Config or Chip? IMO makes more sense if is Chip

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it suffices to have Config. Or do you think it's better to have Chips?

_marker: PhantomData<F>,
}

// BranchRLCChip verifies the random linear combination for the branch which is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that branch_rlc_init works for S and C, but branch_rlc only works for one (S or C). Why this? I think that is going to be more clear to have a unique branch_rlc ( that are instantiated twice, one for S and other for C) that also includes init branch row.

meta: &mut ConstraintSystem<F>,
q_enable: impl FnOnce(&mut VirtualCells<'_, F>) -> Expression<F>,
main: MainCols,
branch_acc: Column<Advice>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is called acc_s in branch_rlc_init.rs, what about renaming it to acc? (or renaming it in branch_rlc_init.rs to branch_acc_s ?)


// Short RLP, meta data contains two bytes: 248, 81
// [1,0,1,0,248,81,0,248,81,0,3,0,0,0,...
// The length of RLP stream is: 81.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not to encode the C RLP length in the "C" section of the circuit (the second half of rows)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All such information is stored in the S part in the branch init row (also for example whether the branch S or branch C are placeholders ...). Agree, it might be better to store C related info into C part, but then there is also information like drifted_pos which applies to both parts, so there would be some asymmetry in storing these fields in any case. But might be changed later, it should be easy.

@miha-stopar
Copy link
Collaborator Author

@miha-stopar, added a bunch of comments for the initial review :)

Thanks @adria0 ! I realised it's better to apply the modifications you propose in the original PR (otherwise the two PRs will go too far apart), will see later whether to proceed with this PR or not. I will do these changes in parallel with improving the specs - there will be quite some new structs introduced and I would like to properly document them in the specs.

@miha-stopar
Copy link
Collaborator Author

Closed in favour of #398.

0x8f701 pushed a commit to dompute/zkevm-circuits that referenced this pull request Sep 13, 2023
)

* refactor: update the default sign_data

* clean keccak_inputs_tx_circuit

* chore: clippy fix

* add sanity check

* fix

---------

Co-authored-by: Rohit Narurkar <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants