This repository has been archived by the owner on Oct 3, 2024. It is now read-only.

fix: compiler docs revamp (#870)
hedgar2017 authored Jan 23, 2024
1 parent 6523456 commit 7110105
Showing 29 changed files with 1,921 additions and 2,220 deletions.
67 changes: 60 additions & 7 deletions docs/.vuepress/sidebar/en.ts
@@ -792,21 +792,74 @@ export const enSidebar = sidebar({
},
{
text: "EVM",
link: "/zk-stack/components/compiler/specification/instructions/evm.md"
collapsible: true,
children: [
{
text: "Overview",
link: "/zk-stack/components/compiler/specification/instructions/evm/overview.md",
},
{
text: "Arithmetic",
link: "/zk-stack/components/compiler/specification/instructions/evm/arithmetic.md",
},
{
text: "Logical",
link: "/zk-stack/components/compiler/specification/instructions/evm/logical.md",
},
{
text: "Bitwise",
link: "/zk-stack/components/compiler/specification/instructions/evm/bitwise.md",
},
{
text: "Hashes",
link: "/zk-stack/components/compiler/specification/instructions/evm/hashes.md",
},
{
text: "Environment",
link: "/zk-stack/components/compiler/specification/instructions/evm/environment.md",
},
{
text: "Block",
link: "/zk-stack/components/compiler/specification/instructions/evm/block.md",
},
{
text: "Stack",
link: "/zk-stack/components/compiler/specification/instructions/evm/stack.md",
},
{
text: "Memory",
link: "/zk-stack/components/compiler/specification/instructions/evm/memory.md",
},
{
text: "Storage",
link: "/zk-stack/components/compiler/specification/instructions/evm/storage.md",
},
{
text: "Events",
link: "/zk-stack/components/compiler/specification/instructions/evm/events.md",
},
{
text: "Calls",
link: "/zk-stack/components/compiler/specification/instructions/evm/calls.md",
},
{
text: "CREATE",
link: "/zk-stack/components/compiler/specification/instructions/evm/create.md",
},
{
text: "Return",
link: "/zk-stack/components/compiler/specification/instructions/evm/return.md",
},
]
},
{
text: "EVMLA",
text: "EVM Legacy Assembly",
link: "/zk-stack/components/compiler/specification/instructions/evmla.md"
},
{
text: "Yul",
link: "/zk-stack/components/compiler/specification/instructions/yul.md"
},
{
text: "Extensions",
link: "/zk-stack/components/compiler/specification/instructions/extensions.md"

},
]
}
]
@@ -7,8 +7,6 @@ head:

# Code Separation

# Deploy and Runtime Code Separation

On both EVM and EraVM the code is separated into two parts: deploy code and runtime code. The deploy code is executed
only once, when the contract is deployed. The runtime code is executed every time the contract is called. However, on
EraVM the deploy code and runtime code are deployed together, and they are not split into two separate chunks of
@@ -11,9 +11,9 @@ There are two Solidity IRs used in our pipeline: Yul and EVM legacy assembly. Th
Solidity, more precisely <=0.7.

EVM legacy assembly is very challenging to translate to LLVM IR, since it obfuscates the control flow of the program and
uses a lot of dynamic jumps. Most of the jumps can be translated to static ones by using a static analysis of the
program, but some of them are impossible to resolve statically. For example, internal function pointers can be written
to memory or storage and then loaded and called. Recursion is another case we have skipped for now, as there is another
uses a lot of dynamic jumps. Most of the jumps can be translated to static ones by using a static analysis of EVM assembly,
but some of the jumps are impossible to resolve statically. For example, internal function pointers can be written
to memory or storage, and then loaded and called. Recursion is another case we have skipped for now, as there is another
stack frame allocated on every iteration, preventing the static analyzer from resolving the jumps.
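
The difference between a statically resolvable jump and a dynamic one can be sketched with a toy symbolic stack walk. This is only an illustration under stated assumptions, not the analysis the compiler performs: the instruction subset, the `PUSH_TAG` name, and the `resolveJumps` helper are all hypothetical.

```typescript
// Toy symbolic stack walk over a single basic block.
// Assumption: the instruction names below are a hypothetical subset chosen
// for illustration; they are not the compiler's real intermediate form.
type Instr =
  | { op: "PUSH_TAG"; tag: number } // push a known jump destination
  | { op: "PUSH" } // push an ordinary constant
  | { op: "SWAP1" } // swap the two topmost stack items
  | { op: "MLOAD" } // replace the top item with a value read from memory
  | { op: "JUMP" }; // jump to the destination on top of the stack

// A stack item is either a known tag or something the analysis cannot see through.
type Sym = { kind: "tag"; tag: number } | { kind: "opaque" };

function resolveJumps(block: Instr[]): Array<number | "dynamic"> {
  const stack: Sym[] = [];
  const targets: Array<number | "dynamic"> = [];
  // Items that were on the stack before this block started are treated as opaque.
  const pop = (): Sym => stack.pop() ?? { kind: "opaque" };

  for (const ins of block) {
    switch (ins.op) {
      case "PUSH_TAG":
        stack.push({ kind: "tag", tag: ins.tag });
        break;
      case "PUSH":
        stack.push({ kind: "opaque" });
        break;
      case "SWAP1": {
        const top = pop();
        const below = pop();
        stack.push(top, below); // the former second item is now on top
        break;
      }
      case "MLOAD":
        pop(); // the address operand
        stack.push({ kind: "opaque" }); // the loaded value is unknown statically
        break;
      case "JUMP": {
        const target = pop();
        targets.push(target.kind === "tag" ? target.tag : "dynamic");
        break;
      }
    }
  }
  return targets;
}

// A return-like sequence: the tag is still on the stack, so the jump is static.
const returnLike = resolveJumps([
  { op: "PUSH_TAG", tag: 4 },
  { op: "PUSH" },
  { op: "SWAP1" },
  { op: "JUMP" },
]);

// A function pointer read back from memory: the target cannot be resolved.
const pointerCall = resolveJumps([{ op: "PUSH" }, { op: "MLOAD" }, { op: "JUMP" }]);
```

In this sketch the first sequence resolves to tag 4, while the second is reported as `"dynamic"`, which corresponds to the function-pointer case described above.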

Both issues are being worked on in our fork of the Solidity compiler, where we are changing the codegen to remove the
@@ -30,13 +30,13 @@ contract Example {
result = 42;
}
}
```

## EVM Legacy Assembly

Produced by the upstream Solidity compiler v0.7.6.

```evm
| Line | Instruction | Value/Tag |
| ---- | ------------ | --------- |
| 000 | PUSH | 80 |
@@ -99,8 +99,7 @@ Produced by the upstream Solidity compiler v0.7.6.
| 057 | PUSH | 2A |
| 058 | SWAP1 | |
| 059 | JUMP | [out] |

````
```

## EthIR

@@ -227,7 +226,7 @@ block_rt_5/0: (predecessors: rt_3/0) // Runtime Code Tag 5, Instance 0.
SWAP1 [ V_SHR | 2A | T_4 ]
// JUMP [out] is usually a return statement
JUMP [out] [ V_SHR | 2A ] - [ T_4 ]
````
```

### Unoptimized LLVM IR

@@ -9,23 +9,22 @@ head:

This document explains some peculiarities of the exception handling (EH) in zkEVM architecture.

In a nutshell, there are two exception handling mechanisms in zkEVM: contract-level and function-level. The former is
more common to general-purpose languages, and the latter was inherited from the EVM architecture.

| | Contract Level | Function Level |
| ------------------------------------------------------ | ----------------------- | ---------------------------------------------- |
| Yul examples | revert(0, 0) | verbatim("throw") |
| Native to | EVM | General-purpose languages |
| Handled by | zkEVM | Compiler |
| Catchable | By the calling contract | By the calling function |
| Efficient | Yes | Huge size impact due to numerous catch blocks. |
| Extra cycles are needed for propagating the exception. |
In a nutshell, there are two EH mechanisms in zkEVM: contract-level and function-level.
The former was inherited from the EVM architecture, and the latter is more common to general-purpose languages.

| | Contract Level | Function Level |
| ------------ | --------------- | ----------------------------------------------------------------------------------------------------- |
| Yul Example | revert(0, 0) | verbatim("throw") |
| Native to | EVM | General-purpose languages |
| Handled by | zkEVM | Compiler |
| Catchable by | Caller contract | Caller function |
| Efficient | Yes | Huge size impact due to numerous catch blocks. Extra cycles are needed for propagating the exception. |

## Contract Level

This type of exception is inherited from the EVM architecture. On EVM, instructions such as `REVERT` and `INVALID`
immediately terminate the contract execution and return the control to the caller. It is impossible to catch them within
the contract, and it can be only done on the caller side by checking the call status code.
immediately terminate the contract execution and return the control to the caller. It is impossible to catch them
within the contract, and it can be only done on the caller side by checking the call status code.

```solidity
// callee
@@ -43,35 +42,34 @@ if iszero(success) {
}
```

zkEVM behaves exactly the same. The VM automatically unwinds the call stack up to the uppermost function frame of the
contract, leaving no possibility to catch and handle it on the way.
zkEVM behaves exactly the same. The VM automatically unwinds the call stack up to the uppermost function frame
of the contract, leaving no possibility to catch and handle it on the way.

These types of exceptions are more efficient, as you can revert at any point of the execution without propagating
the control flow all the way up to the uppermost function frame.

### Implementation

These types of exceptions are more efficient, as you can revert at any point of the execution without propagating the
control flow all the way up to the uppermost function frame.
In EraVM, contracts call each other using the [`far_call` instruction](https://matter-labs.github.io/eravm-spec/spec.html#FarCalls).
It [accepts the address of the exception handler](https://matter-labs.github.io/eravm-spec/spec.html#OpFarCall) as one of its arguments.

## Function Level

This type of exceptions is more common to general-purpose languages like C++. That is why it was easy to support within
the LLVM framework, even though it is not supported by the smart contract languages we work with. That is also one of
the reasons why the two EH mechanisms are handled separately and barely interact in the high-level code.
This type of exceptions is more common to general-purpose languages like C++. That is why it was easy to support
within the LLVM framework, even though it is not supported by the smart contract languages we work with.
That is also one of the reasons why the two EH mechanisms are handled separately and barely interact in the high-level code.

In general-purpose languages a set of EH tools is usually available, e.g. `try`, `throw`, and `catch` keywords that
define which piece of code may throw and how the exception must be handled. However, these tools are not available in
Solidity and its EVM Yul dialect, so some extensions have been added in the zkEVM Yul dialect compiled by zksolc, but
there are limitations, some of which are dictated by the nature of smart contracts:
define which piece of code may throw and how the exception must be handled. However, these tools are not available
in Solidity and its EVM Yul dialect, so some extensions have been added in the zkEVM Yul dialect compiled by zksolc,
but there are limitations, some of which are dictated by the nature of smart contracts:

1. Every function beginning with `ZKSYNC_NEAR_CALL` is implicitly wrapped with `try`. If there is an exception handler
defined, the following will happen:
1. Every function beginning with `ZKSYNC_NEAR_CALL` is implicitly wrapped with `try`. If there is an exception handler defined, the following will happen:
- A panic will be caught by the caller of such function.
- The control will be transferred to EH function. There can be only one EH function and it must be named
`ZKSYNC_CATCH_NEAR_CALL`. It is not very efficient, because all functions must have an LLVM IR `catch` block that
will catch and propagate the exception and call the EH function.
- The control will be transferred to EH function. There can be only one EH function and it must be named `ZKSYNC_CATCH_NEAR_CALL`. It is not very efficient, because all functions must have an LLVM IR `catch` block that will catch and propagate the exception and call the EH function.
- When the EH function has finished executing, the caller of `ZKSYNC_NEAR_CALL` receives the control back.
2. Every operation is `throw`. Since any instruction can panic due to out-of-gas, all of them can throw. It is another
thing reducing the potential for optimizations.
3. The `catch` block is represented by the `ZKSYNC_CATCH_NEAR_CALL` function in Yul. A panic in `ZKSYNC_NEAR_CALL` will
make **their caller** catch the exception and call the EH function. After the EH function is executed, the control is
returned to the caller of `ZKSYNC_NEAR_CALL`.
2. Every operation is `throw`. Since any instruction can panic due to out-of-gas, all of them can throw. It is another thing reducing the potential for optimizations.
3. The `catch` block is represented by the `ZKSYNC_CATCH_NEAR_CALL` function in Yul. A panic in `ZKSYNC_NEAR_CALL` will make **their caller** catch the exception and call the EH function. After the EH function is executed, the control is returned to the caller of `ZKSYNC_NEAR_CALL`.

```solidity
// Follow the numbers for the order of execution. The call order is:
@@ -103,6 +101,29 @@ function ZKSYNC_CATCH_NEAR_CALL() { // 07
}
```

Given all the overhead above, the `catch` blocks are only generated if there is the EH function
`ZKSYNC_CATCH_NEAR_CALL` defined in the contract. Otherwise there is no need to catch panics and they will be propagated
to the caller contract automatically by the VM execution environment.
Given all the overhead above, the `catch` blocks are only generated if there is the EH function `ZKSYNC_CATCH_NEAR_CALL`
defined in the contract. Otherwise there is no need to catch panics and they will be propagated to the caller contract
automatically by the VM execution environment.

### Implementation

In EraVM, there are two ways of implementing contract-local function calls:

1. Saving the return address and using a [`jump`](https://matter-labs.github.io/eravm-spec/spec.html#JumpDefinition) instruction to call; using [`jump`](https://matter-labs.github.io/eravm-spec/spec.html#JumpDefinition) instruction with saved return address to return.
2. Using
[`call`](https://matter-labs.github.io/eravm-spec/spec.html#NearCallDefinition)
instruction to call; using one of `ret` instructions with modifiers
[`ok`](https://matter-labs.github.io/eravm-spec/spec.html#NearRetDefinition),
[`revert`](https://matter-labs.github.io/eravm-spec/spec.html#NearRevertDefinition), or
[`panic`](https://matter-labs.github.io/eravm-spec/spec.html#step_oppanic) to return.

Using `jump` is more lightweight and cheaper, but using `call`/`ret` is more feature-rich:

1. In case of panic or revert, the storage effects and queues of this function are rolled back.
2. It is possible to pass a portion of available gas; the unused gas will be returned to the caller, unless the function panicked.
3. It is possible to set up a custom exception handler.

Prefixing a Yul function name with `ZKSYNC_NEAR_CALL_` allows using this
additional, platform-specific functionality, implemented by the `call`
instruction. For other functions, the choice between `call`/`ret` and `jump` is
up to the compiler.
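
The three `call`/`ret` features listed above can be modelled with a small simulation. This is a rough sketch, not EraVM itself: the `nearCall` helper, the `Store` type, and the gas accounting are hypothetical simplifications, and running the handler on both `revert` and `panic` is an assumption of this model.

```typescript
// Toy model of the extra functionality offered by `call`/`ret`.
// Assumption: every name and the gas arithmetic here are illustrative only.
type Outcome = "ok" | "revert" | "panic";
type Store = Readonly<Record<string, number>>;

interface CalleeResult {
  outcome: Outcome;
  gasUsed: number;
  store: Store; // the storage state the callee would like to commit
}

type Callee = (store: Store, gas: number) => CalleeResult;

interface NearCallResult {
  outcome: Outcome;
  store: Store; // storage visible to the caller after the call
  gasLeft: number; // gas available to the caller after the call
  handlerRan: boolean;
}

function nearCall(
  store: Store,
  callerGas: number,
  gasPassed: number,
  callee: Callee,
  handler?: () => void,
): NearCallResult {
  // 2. Only a portion of the available gas is passed to the callee.
  const passed = Math.min(gasPassed, callerGas);
  const result = callee(store, passed);
  const used = Math.min(result.gasUsed, passed);
  const failed = result.outcome !== "ok";

  // 1. On revert or panic, the callee's storage effects are rolled back.
  const committed = failed ? store : result.store;

  // 2. Unused gas is returned to the caller, unless the callee panicked.
  const refund = result.outcome === "panic" ? 0 : passed - used;

  // 3. A custom exception handler runs on failure (assumed for both kinds).
  const handlerRan = failed && handler !== undefined;
  if (failed && handler !== undefined) {
    handler();
  }

  return {
    outcome: result.outcome,
    store: committed,
    gasLeft: callerGas - passed + refund,
    handlerRan,
  };
}

const before: Store = { counter: 1 };
const bump: Callee = (s) => ({ outcome: "ok", gasUsed: 10, store: { ...s, counter: 2 } });
const crash: Callee = (s) => ({ outcome: "panic", gasUsed: 10, store: { ...s, counter: 99 } });

const okRun = nearCall(before, 100, 40, bump);
const panicRun = nearCall(before, 100, 40, crash, () => {});
```

In this model the successful call commits its write and refunds the unused part of the passed gas, while the panicking call leaves storage untouched, forfeits all passed gas, and triggers the handler.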