-
Notifications
You must be signed in to change notification settings - Fork 95
Chainweb Merkle Tree
Chainweb uses the merkle-log package for the chain Merkle tree (see. merkle-log-rs for a Rust version).
Merkle trees are balanced binary trees which can be nested by including the root of a sub-tree as leaf in a Merkle tree.
A node in a Merkle tree is represented as the hash of its subtrees.
data Node = Node Digest
There are two types of leafs in a Merkle tree:
- Tree nodes: the value is the root of a nested sub-Merkle tree. The value is used without additional hashing or tagging.
- Input nodes: the value is the tagged hash of the indexed data.
The Tag
values are domain specific values that disambiguate serialized
input data based on the type (cf. below).
data Input = SubTree Node | Leaf Tag ByteString
mkLeafNode :: Input -> Node
mkLeafNode (Leaf t d) hash ([0x00] <> t <> d)
mkLeafNode (SubTree n) = n
mkInnerNode :: Node -> Node -> Node
mkInnerNode l r = hash ([0x01] <> l <> r)
For a list of leaf nodes ls
a Merkle tree is constructed as follows:
merkleRoot :: [Input] -> Node
merkleRoot ls = go ls []
where
go :: [Input] -> [(Int, Node)] -> Node
-- empty tree
go [] [] = hash ""
-- Create new inner node
go t ((h0, v0) : (h1, v1) : s) | h0 == h1 = go t ((h0 + 1, mkInnerNode(v1, v0)) : s)
-- Add new leaf to stack
go (l : t) !s = go t ((0, mkLeafNode l) : s)
-- When all leafs are consumed, process remaining nodes on stack
go [] ((h0, v0) : (_, v1) : s) = go [] ((a + 1, mkInnerNode(v1, v0)) : s)
-- done
go [] [root] = root
The chainweb Merkle tree uses NIST SHA-512/256 from the SHA2 family as hash function.
The domain specific function tag
assigns each type a unique a 16-bit value in
big endian encoding before the value is hashed (in addition to the Merkle node
tag as described above). The following tag values are used in Chainweb:
ChainIdTag = 0x0002
BlockHeightTag = 0x0003
BlockWeightTag = 0x0004
FeatureFlagsTag = 0x0006
BlockCreationTimeTag = 0x0007
ChainwebVersionTag = 0x0008
HashTargetTag = 0x0011
TransactionTag = 0x0013
TransactionOutputTag = 0x0014
MinerDataTag = 0x0017
CoinbaseOutputTag = 0x0018
EpochStartTimeTag = 0x0019
BlockNonceTag = 0x0020
-- Event Proofs (currently not indexed in the header tree)
OutputEventsTag = 0x0030
BlockEventsHashTag = 0x0031
RequestKeyTag = 0x0032
PactEventTag = 0x0034
- diamond nodes: Merkle tree roots of nested Merkle trees
- rounded nodes: data leafs
- round nodes: internal Merkle tree nodes
- trapezoid: sub-Merkle trees with the listed leaf nodes
- sub-graphs: nested Merkle trees
Transactions and transaction outputs are indexed in separate Merkle trees. Usually, SPV proofs only use transaction outputs.
The list of transactions and transaction outputs are of the same size.
The order of transactions and outputs is the order in which they are processed
within the block, which is also order in which they are returned in API queries.
MinerData
is the first entry in the transaction tree and Coinbase
the first
entry in the outputs tree.
MinerData
and Coinbase
are always present. The
lists of tranactions and outputs can be empty for some blocks.
graph TD
subgraph bpt [BlockPayload]
direction TB
bph{{BlockPayloadHash}}
bph --> btt
bph --> bot
subgraph btt [BlockTransactions]
direction TB
bth{{BlockTransactionsHash}}
bth --> txs[/MinerData, Transactions \]
end
subgraph bot [BlockOutputs]
direction TB
boh{{BlockOutputsHash}}
boh --> outs[/Coinbase, Outputs\]
end
end
The REST API of chainweb-node returns the payload structures in base64Url encoding. For the Merkle tree the unencoded binary content (usually some unicode text) is hashed (subject to proper tagging as described above).
The format of payload data is defined by the Pact smart contract language and Pact service in chainweb-node. The content of the payload is not relevant for chainweb consensus. Details of the payload structure can be found in the wikipage about the Chain database.
The binary encoding for block headers is described on the Block Header Binary Encoding page.
The order of the fields block header fields in the Merkle tree is as follows (cf. implementation):
(flags, time, parent, target, payload, chain, weight, height, version, epoch_start, nonce) + adjacent_hashes
The adjacent_hashes
are sorted numerically by chain id.
The depth of the tree and thus of Merkle proofs is logarithmic in the size of the block (header + payload).
The following diagram shows the structure of a Block Header Merkle tree for a twenty chain graph.
graph TD
subgraph bht [BlockHeader]
direction TB
%% root
bhash{{BlockHash}}
%% leafs
bf([BlockFlags])
bc([BlockCreationTime])
bp{{BlockParentHash}}
bt([BlockTarget])
bpt{{BlockPayloadHash}}
bi([BlockChainId])
bw([BlockWeight])
bh([BlockHeight])
bv([BlockChainwebVersion])
be([BlockEpochStart])
bn([BlockNonce])
ba1{{BlockAdjacentParentHash_1}}
ba2{{BlockAdjacentParentHash_2}}
ba3{{BlockAdjacentParentHash_3}}
%% inner nodes
n0( ) --> bf & bc
n1( ) --> bp & bt
n2( ) --> n0 & n1
n3( ) --> bpt
n3( ) --> bi
n4( ) --> bw & bh
n5( ) --> n3 & n4
n6( ) --> n2 & n5
n7( ) --> bv & be
n8( ) --> bn & ba1
n9( ) --> n7 & n8
n10( ) --> ba2 & ba3
n11( ) --> n9 & n10
bhash --> n6 & n11
end
Block header Merkle trees for a single Chainweb chain are linked linearly. The depth of the tree is thus linear in the block height. Hence the size of Merkle proofs for a range of blocks are linear in the size of the range.
Merkle trees for the chainweb chains are linked together in a DAG according to the Chainweb graph. Therefor with increasing depth there is an increasingly (exponentially) large number of different path between any two blocks.
A benefit of a linear tree is that a block at height n
can be created and
verified by using only blocks at height n - 1
, which enables shallow nodes
both for mining and validation. Even with full nodes, a balanced Merkle tree
over block history would require random access history lookup during block
creation and validation. Making the tree creation and validation inductive (so
that only constant depth history lookups are needed), would add logarithmic (in
the block history size) storage overhead to the block headers.
The following diagram shows the overall structure of the chain Merkle tree:
graph TB
subgraph b0[Block_0]
direction TB
r0{{BlockHash_0}}
h0[/BlockHeader_0\]
subgraph p0[Payload_0]
x0[/ \]
end
r0 --> h0
h0 --> p0
%% h0 --> r0
end
subgraph b1[Block_1]
direction TB
r1{{BlockHash_1}}
subgraph p1[Payload_1]
x1[/ \]
end
h1[/BlockHeader_1\]
r1 --> h1
h1 --> p1
end
subgraph bn[Block_n]
direction TB
rn{{BlockHash_n}}
subgraph pn[Payload_n]
xn[/ \]
end
hn[/BlockHeader_n\]
rn --> hn
hn --> pn
end
subgraph bn1[Block_n-1]
direction TB
rn1{{BlockHash_n-1}}
subgraph pn1[Payload_n-1]
xn1[/ \]
end
hn1[/BlockHeader_n-1\]
rn1 --> hn1
hn1 --> pn1
end
subgraph bn2[Block_n-2]
direction TB
rn2{{BlockHash_n-2}}
subgraph pn2[Payload_n-2]
xn2[/ \]
end
hn2[/BlockHeader_n-2\]
rn2 --> hn2
hn2 --> pn2
end
an((OtherChains_n))
an1((OtherChains_n-1))
an2((OtherChains_n-2))
a1((OtherChains_1))
a0((OtherChains_0))
hn ---> rn1
hn1 ---> rn2
hn2 -.....->|....| r1
h1 ---> r0
an ----> rn1
hn ----> an1
an1 ----> rn2
hn1 ----> an2
an2 -......->|....| r1
hn2 -......->|....| a1
a1 ----> r0
h1 ----> a0
an ~~~~~ an1
a1 ~~~~~ a0
At genesis the Kadena mainnet was using the Petersen graph as Chainweb graph:
Since block height 852,054 (~2020-08-20 16:00:00) the Chainweb on the Kadena mainnet graph is the twenty chain graph:
The graphs have the following properties:
Petersen graph | twenty chain graph | |
---|---|---|
order | 10 | 20 |
size | 30 | 60 |
degree | 3 | 3 |
diameter | 2 | 3 |
Petersen graph:
\documentclass[tikz,crop]{standalone}
\usetikzlibrary{math}
\usetikzlibrary{graphs.standard}
\begin{document}
\begin{tikzpicture}
\graph [clockwise,n=5, nodes={draw, circle}] {
{ 0; 1; 2; 3; 4; } --
subgraph C_n [name=a, V={5, 6, 7, 8, 9}];
0 -- 2 -- 4 -- 1 -- 3 -- 0;
};
\end{tikzpicture}
\end{document}
Twenty chain graph:
\documentclass[tikz,crop]{standalone}
\usetikzlibrary{math}
\usetikzlibrary{graphs.standard}
\begin{document}
\begin{tikzpicture}
\graph [clockwise,n=5, nodes={draw, circle}] {
{ 5; 6; 7; 8; 9; } --
subgraph I_n [V={0,1,2,3,4}] --[matching]
{
subgraph I_n [phase=100, V={10,16,12,18,14}];
% how can we draw edges here?
subgraph I_n [phase=80, V={15,11,17,13,19}];
};
subgraph I_n [V={0,1,2,3,4}] --
subgraph I_n [phase=80, V={15,11,17,13,19}];
5 -- 7 -- 9 -- 6 -- 8 -- 5;
10 -- 11 -- 12 -- 13 -- 14 --
15 -- 16 -- 17 -- 18 -- 19 -- 10;
};
\end{tikzpicture}
\end{document}