Skip to content

Chainweb Merkle Tree

Lars Kuhtz edited this page Nov 23, 2023 · 30 revisions

Merke Log Trees

Chainweb uses the merkle-log package for the chain Merkle tree.

Merkle trees are balanced binary trees which can be nested by including the root of a sub-tree as leaf in a Merkle tree.

For a list of leaf nodes ls a Merkle tree is constructed as follows:

let
  go :: [Node] -> [(Int, Node)] -> IO ()

  -- Create new inner node
  go t ((h0, v0) : (h1, v1) : s) | h0 == h1 = go t ((h0 + 1, mkNode(v1, v0)) : s)

  -- Add new leaf to stack
  go (l : t) !s = go t ((0, l) : s)

  -- When all leafs are consumed, process remaining nodes on stack
  go [] ((h0, v0) : (_, v1) : s) = go [] ((a + 1, mkNode(v1, v0)) : s)

  -- done
  go _ [] [root] = return root
  go _ [] [] = error "impossible"
in
  go ls []

There are two types of leafs in a Merkle tree:

  1. Tree nodes: the value is the root of a sub-Merkle tree without additional hashing.
  2. Input nodes: the value is the tagged hash of the indexed data.

Node values are computed as follows:

  • leaf node:
    • for input node with data d : hash(0x00 <> tag(type(d)) <> bytes(d))
    • for tree node with sub-Merkle root r: r
  • inner node: hash(0x01 <> child_0 <> child_1)

The tag function is a domain specific function that disambiguates serialized input data based on type.

Chainweb Merkle Hash Function

The chainweb Merkle tree uses SHA512_256t (truncated SHA512) as hash function.

The domain specific function tag assigns each type a unique a 16-bit value in big endian encoding. For chainweb it is defined as follows:

    type MerkleTagVal ChainwebHashTag 'VoidTag = 0x0000
    type MerkleTagVal ChainwebHashTag 'MerkleRootTag = 0x0001
    type MerkleTagVal ChainwebHashTag 'ChainIdTag = 0x0002
    type MerkleTagVal ChainwebHashTag 'BlockHeightTag = 0x0003
    type MerkleTagVal ChainwebHashTag 'BlockWeightTag = 0x0004
    type MerkleTagVal ChainwebHashTag 'BlockPayloadHashTag = 0x0005
    type MerkleTagVal ChainwebHashTag 'FeatureFlagsTag = 0x0006
    type MerkleTagVal ChainwebHashTag 'BlockCreationTimeTag = 0x0007
    type MerkleTagVal ChainwebHashTag 'ChainwebVersionTag = 0x0008
    type MerkleTagVal ChainwebHashTag 'PowHashTag = 0x0009
    type MerkleTagVal ChainwebHashTag 'BlockHashTag = 0x0010
    type MerkleTagVal ChainwebHashTag 'HashTargetTag = 0x0011
    type MerkleTagVal ChainwebHashTag 'TransactionTag = 0x0013
    type MerkleTagVal ChainwebHashTag 'TransactionOutputTag = 0x0014
    type MerkleTagVal ChainwebHashTag 'BlockTransactionsHashTag = 0x0015
    type MerkleTagVal ChainwebHashTag 'BlockOutputsHashTag = 0x0016
    type MerkleTagVal ChainwebHashTag 'MinerDataTag = 0x0017
    type MerkleTagVal ChainwebHashTag 'CoinbaseOutputTag = 0x0018
    type MerkleTagVal ChainwebHashTag 'EpochStartTimeTag = 0x0019
    type MerkleTagVal ChainwebHashTag 'BlockNonceTag = 0x0020

    -- Event Proofs
    type MerkleTagVal ChainwebHashTag 'OutputEventsTag = 0x0030
    type MerkleTagVal ChainwebHashTag 'BlockEventsHashTag = 0x0031
    type MerkleTagVal ChainwebHashTag 'RequestKeyTag = 0x0032
    type MerkleTagVal ChainwebHashTag 'PactEventTag = 0x0034

The binary encoding for block headers is described on the Block Header Binary Encoding page.

Encodings for the payload Merkle trees are defined by the Pact smart contract language and Pact service in chainweb-node. The content of the payload is not relevant for chainweb consensus.

Block Header Merkle Tree

The depth of the tree and thus of Merkle proofs is logarithmic in the size of the block (header + payload).

The following diagram shows the structure of a Block Header Merkle tree for a twenty chain graph.

Legend:

  • diamond nodes: Merkle tree roots of nested Merkle trees
  • rounded nodes: data leafs
  • round nodes: internal Merkle tree nodes
  • trapezoid: sub-Merkle trees with the listed leaf nodes
  • sub-graphs: nested Merkle trees
graph TD
  subgraph bht [BlockHeader]
    direction TB
    
    %% root
    bhash{{BlockHash}}

    %% leafs
    bf([BlockFlags])
    bc([BlockCreationTime])
    bp{{BlockParentHash}}
    bt([BlockTarget])
    %% block payloads
    bi([BlockChainId])
    bw([BlockWeight])
    bh([BlockHeight])
    bv([BlockChainwebVersion])
    be([BlockEpochStart])
    bn([BlockNonce])
    ba1{{BlockAdjacentParentHash_1}}
    ba2{{BlockAdjacentParentHash_2}}
    ba3{{BlockAdjacentParentHash_3}}

    %% inner nodes
    n0( ) --> bf & bc
    n1( ) --> bp & bt
    n2( ) --> n0 & n1
    n3( ) --> bpt & bi
    n4( ) --> bw & bh
    n5( ) --> n3 & n4
    n6( ) --> n2 & n5
    n7( ) --> bv & be
    n8( ) --> bn & ba1
    n9( ) --> n7 & n8
    n10( ) --> ba2 & ba3
    n11( ) --> n9 & n10
    bhash --> n6 & n11
    
  subgraph bpt [BlockPayload]
    direction TB
    bph{{BlockPayloadHash}}
    bph --> btt
    bph --> bot

    subgraph btt [BlockTransactions]
      direction TB
      bth{{BlockTransactionsHash}}
      bth --> txs[/MinerData, Transactions \]
    end

    subgraph bot [BlockOutputs]
      direction TB
      boh{{BlockOutputsHash}}
      boh --> outs[/Coinbase, Outputs\]
    end
  end
  end
Loading

Chainweb Merkle Tree

Block header Merkle trees for a single Chainweb chain are linked linearly. The depth of the tree is thus linear in the block height. Hence the size of Merkle proofs for a range of blocks are linear in the size of the range.

Merkle trees for the chainweb chains are linked together in a DAG according to the Chainweb graph. Therefor with increasing depth there is an increasingly (exponentially) large number of different path between any two blocks.

A benefit of a linear tree is that a block at height n can be created and verified by using only blocks at height n - 1, which enables shallow nodes both for mining and validation.

The following diagram shows the overall structure of the chain Merkle tree:

graph TB
  subgraph b0[Block_0]
    direction TB
    r0{{BlockHash_0}}
    h0[/BlockHeader_0\]
    subgraph p0[Payload_0]
      x0[/ \]
    end
    r0 --> h0
    h0 --> p0
    %% h0 --> r0
  end
  subgraph b1[Block_1]
    direction TB
    r1{{BlockHash_1}}
    subgraph p1[Payload_1]
       x1[/ \]
    end
    h1[/BlockHeader_1\]
    r1 --> h1
    h1 --> p1
  end
  subgraph bn[Block_n]
    direction TB
    rn{{BlockHash_n}}
    subgraph pn[Payload_n]
       xn[/ \]
    end
    hn[/BlockHeader_n\]
    rn --> hn
    hn --> pn
  end
  subgraph bn1[Block_n-1]
    direction TB
    rn1{{BlockHash_n-1}}
    subgraph pn1[Payload_n-1]
       xn1[/ \]
    end
    hn1[/BlockHeader_n-1\]
    rn1 --> hn1
    hn1 --> pn1
  end
  subgraph bn2[Block_n-2]
    direction TB
    rn2{{BlockHash_n-2}}
    subgraph pn2[Payload_n-2]
       xn2[/ \]
    end
    hn2[/BlockHeader_n-2\]
    rn2 --> hn2
    hn2 --> pn2
  end

  an((OtherChains_n))
  an1((OtherChains_n-1))
  an2((OtherChains_n-2))
  a1((OtherChains_1))
  a0((OtherChains_0))

  hn ---> rn1
  hn1 ---> rn2
  hn2 -.....->|....| r1
  h1 ---> r0

  an ----> rn1
  hn ----> an1
  an1 ----> rn2
  hn1 ----> an2
  an2 -......->|....| r1
  hn2 -......->|....| a1
  a1 ----> r0
  h1 ----> a0

  an ~~~~~ an1
  a1 ~~~~~ a0
Loading

Chainweb Graphs

At genesis the Kadena mainnet was using the Petersen graph as Chainweb graph:

petersen-graph

The most recent Chainweb graph is the twenty chain graph:

twenty-chain-graph