Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DOC-367] persistent ghosts explained with examples #177

Merged
merged 3 commits into from
Dec 25, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
133 changes: 131 additions & 2 deletions docs/cvl/ghosts.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,10 +7,20 @@ These variables are often used to
- communicate information between {ref}`rules-main` and {ref}`hooks`.
- define deterministic [function summaries](https://github.com/Certora/Examples/blob/61ac29b1128c68aff7e8d1e77bc80bfcbd3528d6/CVLByExample/summary/with-env/WithEnvGhostSummary/WithEnv.spec#L10).

Ghosts can be seen as an 'extension' to the state of the contracts under verification.
This means that in case a call reverts, the ghost values will revert to their pre-state.
Additionally, if an unresolved call is handled by a havoc, the ghost values will havoc as well.
shellygr marked this conversation as resolved.
Show resolved Hide resolved
Ghosts are regarded as part of the state of the contracts,
and when calls are invoked with `at storageVar` statements (see {ref}`storage-type`),
they are restored to their state as saved in `storageVar`.
An exception to this rule are ghosts marked _persistent_.
shellygr marked this conversation as resolved.
Show resolved Hide resolved
Persistent ghosts are **never** havoced, and **never** reverted.
See {ref}`persistent-ghosts` below for more details and examples.

```{contents}
```

The syntax for ghost declarations is given by the following [EBNF grammar](ebnf-syntax):
The syntax for ghost and persistent ghost declarations is given by the following [EBNF grammar](ebnf-syntax):

Syntax
------
Expand All @@ -19,6 +29,9 @@ Syntax
ghost ::= "ghost" type id (";" | "{" axioms "}")
| "ghost" id "(" cvl_types ")" "returns" type (";" | "{" axioms "}")

persistent_ghost ::= "persistent" "ghost" type id (";" | "{" axioms "}")
| "persistent" "ghost" id "(" cvl_types ")" "returns" type (";" | "{" axioms "}")

type ::= basic_type
| "mapping" "(" cvl_type "=>" type ")"

Expand Down Expand Up @@ -172,4 +185,120 @@ Restrictions on ghost axioms
----------------------------
- A ghost axiom cannot refer to Solidity or CVL functions or to other ghosts. It can refer to the ghost itself.
- Since the signature of a ghost contains just parameter types without names, it cannot refer to its parameters.
`forall` can be used in order to refer the storage referred to by the parameters. [Example](https://github.com/Certora/Examples/blob/61ac29b1128c68aff7e8d1e77bc80bfcbd3528d6/CVLByExample/summary/ghost-summary/ghost-mapping/certora/specs/WithGhostSummary.spec#L12).
`forall` can be used in order to refer the storage referred to by the parameters. [Example](https://github.com/Certora/Examples/blob/61ac29b1128c68aff7e8d1e77bc80bfcbd3528d6/CVLByExample/summary/ghost-summary/ghost-mapping/certora/specs/WithGhostSummary.spec#L12).



(persistent-ghosts)=
Ghosts vs. persistent ghosts
----------------------------

In most cases, non-persistent ghosts are the natural choice for a specification
that requires extra tracking of information.

We present two examples where persistent ghosts are useful.

### Persistent ghosts that survive havocs
In the first example, we want to track the occurrence of a potential reentrant call[^reentrancy]:

[^reentrancy]: The example given here is very basic, the examples repository contains [more complete examples of reentrancy checks](https://github.com/Certora/Examples/tree/master/CVLByExample/Reentrancy)

```cvl
persistent ghost bool reentrancy_happened {
init_state axiom !reentrancy_happened;
}

hook CALL(uint g, address addr, uint value, uint argsOffset, uint argsLength,
uint retOffset, uint retLength) uint rc {
if (addr == currentContract) {
reentrancy_happened = reentrancy_happened
|| executingContract == currentContract;
}
}

invariant no_reentrant_calls !reentrancy_happened;
```

To see why a persistent ghost must be used here for the variable `reentrancy_happened`, consider the following contract:
```solidity
contract NotReentrant {
function transfer1Token(IERC20 a) external {
require (address(a) != address(this));
a.transfer(msg.sender, 1);
}
}
```

If we do not apply any linking or dispatching for the call done on the target `a`, the call to `transfer` would havoc.
During a havoc operation, the Prover conservatively assumes that almost any possible behavior can occur.
In particular, it must assume that during the execution of the `a.transfer` call,
non-persistent ghosts can be updated arbitrarily (e.g. by other contracts),
and thus (assuming `reentrancy_happenedt` were not marked as persistent),
the Prover considers the case where `reentrancy_happened` is set to `true` due to the havoc.
Thus, when the `CALL` hook executes immediately after,
it does so where the `reentrancy_happened` value is already `true`,
and thus the value after the hook will remain `true`.

In the lower-level view of the tool, the sequence of events is as follows:
1. A call to `a.transfer` which cannot be resolved and results in a {ref}`havoc <glossary>` operation.
Non-persistent ghosts are havoced, in particular `reentrancy_happened` if it were not marked as such.
2. A `CALL` hook executes, updating `reentrancy_happened` based on its havoced value, meaning it can turn to true.

Therefore even if the addresses of `a` and `NotReentrant` are distinct, we could still falsely detect a reentrant call as `reentrancy_happened` was set to true due to non-determinism.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is pretty opaque. Worth expanding on why this can happen:

"During a havoc operation, the prover conservatively assumes that almost any possible behavior can occur. In particular, it must assume that during the execution of the transfer call, some reentrancy occurs, and thus considers the case where reentrancy_happened is set to true. Thus, when the CALL hook executes, it does so where the reentrancy_happened value is already true, and thus the value after the hook will remain true."

or something

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reentrancy actually cannot occur. It does not matter because reentrancy_happened is in some state we track based on contracts, not tied to a particular contract. I added the wordier explanation, although for me it's harder to grasp

The call trace would show `reentrancy_happened` as being determined to be true due to a havoc in the "Ghosts State" view under "Global State".

### Persistent ghosts that survive reverts
In this example, we use persistent ghosts to determine if a revert happened with user-provided data or not.
This can help distinguishing between compiler-generated reverts and user-specified reverts (but only in [Solidity versions prior to 0.8.x](https://soliditylang.org/blog/2020/10/28/solidity-0.8.x-preview/)).
The idea is to set a ghost to true if a `REVERT` opcode is encountered with a positive buffer size. As in early Solidity versions the panic errors would compile to reverts with empty buffers, as well as user-provided `require`s with no message specified.

```cvl
persistent ghost bool saw_user_defined_revert_msg;

hook REVERT(uint offset, uint size) {
if (size > 0) {
saw_user_defined_revert_msg = true;
}
}

rule mark_methods_that_have_user_defined_reverts(method f, env e, calldataarg args) {
require !saw_user_defined_revert_msg;

f@withrevert(e, args);

satisfy saw_user_defined_revert_msg;
}
```

To see why a regular ghost cannot be used to implement this rule, let's consider the following trivial contract:
```solidity
contract Reverting {
function noUserDefinedRevertFlows(uint a, uint b) external {
uint c = a/b; // will see a potential division-by-zero revert
uint[] memory arr = new uint[](1);
uint d = arr[a+b]; // will see a potential out-of-bounds access revert;
}

function userDefinedRequireMsg(uint a) external {
require(a != 0, "a != 0");
}

function emptyRequire(uint a) external {
require(a != 0);
}
}
```

It is expected for the method `userDefinedRequireMsg` to satisfy the rule,
and it should be the only method to satisfy it.
Assuming `saw_user_defined_revert_msg` was defined as a regular, non-persistent ghost,
the rule would not be satisfied for `userDefinedRequireMsg`: in case the input argument `a` is equal to 0,
the contract reverts, and the value of `saw_user_defined_revert_msg`
is reset to its value before the call, which must be `false`
(because the rule required it before the call).
In this case, after the call `saw_user_defined_revert_msg` cannot be set to true and thus the `satisfy` fails.
Applying the same reasoning, it is clear that the same behavior happens
for reverting behaviors of `noUserDefinedRevertFlows` and `emptyRequire`,
which do not have user-defined revert messages.
This means that if `saw_user_defined_revert_msg` is not marked persistent,
the rule cannot distinguishing between methods that may revert with user-defined messages and methods that may not.
1 change: 1 addition & 0 deletions docs/user-guide/glossary.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
(glossary)=
Glossary
========

Expand Down