Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC-343: Unsat Cores Visualisation #153

Merged
merged 20 commits into from
Oct 10, 2024
Merged
Show file tree
Hide file tree
Changes from 12 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Binary file added docs/prover/checking/coverage-info-button.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
147 changes: 147 additions & 0 deletions docs/prover/checking/coverage-info.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,147 @@
# Coverage Info via Unsat Cores

The `--coverage_info [none|basic|advanced]` flag enables automatic computation of `.sol` and `.spec` files coverage w.r.t. the underlying verification task. In particular, using this flag can help you answer questions such as:
jar-ben marked this conversation as resolved.
Show resolved Hide resolved
jar-ben marked this conversation as resolved.
Show resolved Hide resolved

* *Are all solidity functions from the input involed in proving my rules?*
jar-ben marked this conversation as resolved.
Show resolved Hide resolved
* *Are all solidity commands from the input involed in proving my rules?*
* *Supposing an `assert` in my rule is not reachable, what is the reason for the unreachability?*
* *Do I really need all hooks that are defined in my .spec file(s)?*
* *Do I really need all `require` statements in my rule?*
* *Do I really need to initialise a CVL variable in my rule?*
jar-ben marked this conversation as resolved.
Show resolved Hide resolved
* *Do I really need all preserved blocks in my CVL `invariant`?*
jar-ben marked this conversation as resolved.
Show resolved Hide resolved

To answer the above questions, the Certora Prover generates a so-called *unsat core* which, intuitively, represents the minimal subset of the commands in the input `.sol` and `.spec` files that are needed to prove the CVL properties. If some of the input `.sol` commands are not needed to derive the proof, it might indicate that the specification does not cover all behavior implemented in the smart contract. If some of the input `.spec` commands are not needed to derive the proof (typically unnecessary `require` statements or variable initializations), it indicates that the CVL rules/invariants can be made stronger.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a glossary entry for unsat core in /docs/user-guide/glossary.md, which you can reference here with {term}`unsat core`.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I think about this more, I think maybe we shouldn't use the words "unsat core", since the user isn't required to know what "unsat" means. Maybe we can just call this "coverage visualization" or something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I have completely avoided the concept of unsat cores.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid using italic and bold inline markup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please break long lines to 80 characters.


We visualize this *coverage* information in a dedicated HTML file: `zipOutput/Reports/UnsatCorevisualization.html`. Furthermore, we also visualize the unsat core coverage information on our `TAC` representation of the verification condition.
jar-ben marked this conversation as resolved.
Show resolved Hide resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users don't and shouldn't know what TAC is, and the term "verification condition" is probably pretty opaque too. I'm not even sure what to replace this with. Perhaps "we also visualize information about our internal representation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have completely removed the mentions about TAC and visualisation on TAC. Alex is now working on a dedicated TAC part of the documentation; I will add it there.

In the rest of this section, we provide a more detailed explanation of the concept of unsat cores and demonstrate it on an example.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like to include a contents block after the introduction of a multi-section page:

```{contents}
```

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I removed a lot of content (including sections), showing the contents does not feel right anymore


## Unsat cores

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave out most of this section.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

completely removed

The Certora Prover works as follows:
1. It takes as an input `.sol` and `.spec` files.
2. It compiles the `.sol` input into EVM bytecode.
3. For each single *property*, i.e. elementary rule/invariant, in the `.spec` file(s), it converts the property and the bytecode into a TAC program.
4. Each TAC program is then converted to an SMT formula such that the formula is unsatisfiable if and only if the property holds (i.e. cannot be violated).

The SMT formula is built from a set `A` of *assertions*, say `{A1, A2, ..., An}`, that are built over a set of *variables*. Intuitively, you can see the assertions as mathematical equations. The formula is *satisfiable* if there exists an assignment to the variables that satisfies all the assertions simultaneously. Otherwise, the formula is *unsatisfiable*. In practice, it is often the case that already a small subset of `A` is unsatisfiable. Especially, one can extract a *minimal unsatisfiable subset* `U` of `A`. The *minimality* here means that if you remove any assert from `U` then it becomes satisfiable, i.e., it is not a *minimum cardinality*. We call a minimal unsatisfiable subset of `A` an *unsat core* of `A`.
For example, assume that `A = {a, !a, !b, a || b, a || !b}` where `a` and `b` are Boolean variables.
There are two minimal unsatisfiable subsets (i.e. unsat cores) of this formula:
`{a, !a}` and `{!a, !b, a || b}`. Furthermore, note that `a || !b` is not contained in any of the minimal unsatisfiable subsets (i.e. it is irrelevant for the unsatisfiability of `A`).


There are several different kinds of assertions in `A` and some of them correspond to commands from the TAC program. In particular, for every `assign`, `assume` and `assert` command `Ci` from the TAC program, there is a corresponding assert `Ai` in `A`. Suppose we obtain an unsat core `U` of `A`. Then, the meaning of every excluded assert `Ai in (A - U)` is the following:


- If `Ci` is an `assume` or `assert` command then `Ci` can be completely removed from the TAC program without causing a violation of the underlying CVL property.
- If `Ci` is an `assign` command, for instance `R1 = R2 + R3`, then the right hand side of the equation can be `havoc'd` without violating the underlying CVL property, i.e. in our example we can replace `R1 = R2 + R3` by `R1 = havoc`.
alexandernutz marked this conversation as resolved.
Show resolved Hide resolved


Furthermore, we maintain a mapping between the commands in the TAC program, and the commands in the input `.sol` and `.spec` files. In particular, for `.spec` files, we usually have a mapping for a vast majority of the commands (CVL `assert`s, `require`s, and variable assignments). Unfortunately, for `.sol` files, the mapping is very limited due to the compilation to the bytecode (the compilers are not build by Certora and hence we cannot ensure preservation of mapping between commands in `.sol` and commands in the bytecode).



## Unsat core examples

### Tautology example
Assume the following CVL rule called `tautology`.
jar-ben marked this conversation as resolved.
Show resolved Hide resolved

```cvl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to include this in the Examples repository and link to it so that people can run it themselves (and so we can rerun if the documentation needs to change).

rule tautology(uint256 fundId, method f) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a complex enough example that I'd consider turning on line numbers and referring to them in the text below (e.g. "we required other != manager on line 5").

To do so, I think the syntax is

```{code-block} cvl
:linenos:

rule tautology(...) {
 ...
}
```

env e;
address manager = getCurrentManager(fundId);
address other;
require other != manager && other != e.msg.sender;
calldataarg args;
f(e,args);
address newManager = getCurrentManager(fundId);
assert newManager!= other || newManager != manager;
}
```

We first call a solidity function `getCurrentManager(...)` to get the address of the current `manager` of the underlying smart contract. Subsequently, we create another address `other` and require that `other != manager && other != e.msg.sender`. Subsequently, we call a function `f(...)` of the contract, and, assuming this function call could have changed the manager, we get the value `newManager` of the current manager. Finally, we assert that either `newManager!= other` or `newManager != manager`. However, notice that we required that `other != manager` and hence the `assert` is necessarily `true`. The function calls of `getCurrentManager(...)` and `f(...)` are completely irrelevant.
jar-ben marked this conversation as resolved.
Show resolved Hide resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also link to the glossary (using {term}`tautology`) here. I'm pretty sure that there's already entries from the docs for the tautology checker. You could check the entries to see if it makes sense to link back here as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a link to the glossary



### Visualization on the `.sol` and `.spec` files
jar-ben marked this conversation as resolved.
Show resolved Hide resolved

You can access the unsat cores visualization via `Job Info -> Unsat Core Page` buttons, as shown on Figure 1.

![](coverage-info-button.png)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's nice to include short descriptions of the images in the braces, I guess for accessibility reasons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Figure 1: The `.spec` & `.sol` visualization button.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just move this to the alt text. If you really want it to be caption, here's how: https://myst-parser.readthedocs.io/en/latest/syntax/images_and_figures.html#figures-images-with-captions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed


The visualization itself is shown on Figure 2. It consists of two *panes*:
1. The left pane shows *per-line* visualization.
2. The right pane shows detailed info about individual visualized lines.

![](tautology-sol-and-spec-new.png)
Figure 2: The visualization of the Tautology example on `.sol` and `.spec`.


In particular, in the left pane, we highlight lines from individual `.sol` and `.spec` files.
Every line can have either none, green, red or yellow background color. No background means we have no information about this line. For instance, it might not be neither `assert` nor `require` nor `assign` command, or we might not have the mapping due to compilation to the EVM bytecode. Green, red or yellow background means that we have a mapping to one or more commands on the line (one `.spec`/`.sol` line can be often broken down to several TAC commands) with the following meaning:
jar-ben marked this conversation as resolved.
Show resolved Hide resolved
jar-ben marked this conversation as resolved.
Show resolved Hide resolved
jar-ben marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't mention TAC - instead you can say that an individual line may perform multiple operations, only some of which are relevant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rephrased


1. Green means that all of the TAC commands that are mapped to the line are in the unsat core, i.e. needed to prove the property.
2. Red means that none of the TAC commands that are mapped to the line are in the unsat core, i.e. not needed to prove the property.
3. Yellow means that some of the TAC commands that are mapped to the line are in the unsat core and some of them are not in the unsat core.

Furthermore, if we have multiple rules/invariants or a parametric rule (such as our `tautology`), we can also have multiple rules/invariants mapping to a single `.sol` or `.spec` line. That is, we generate just a single, joint, visualization for all the rules/invariants (run with `--rule` and `--method` flags to get a visualization for a single rule/method/invariant). In such a case, a yellow line means that some of the commands on the line are needed to prove some of the rules/methods/invariants and some of the commands are not needed.
jar-ben marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think saying this two ways ("... that is ...") is confusing in this case. Instead, I might just say "the generated coverage information applies to the whole specification; if you want to generate coverage for a single rule or method, use {ref}`--rule` or {ref}`--method`.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I might say this earlier, so that you can just include the impact of this in the bullets describing the colors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the references for --rule and --method are defined in #157

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated


The right pane provides detailed information about individual lines and commands mapped to the lines group by the rule/method/invariant name (denoted *rule*) and the *value*, where each *value* refers to a particular `assign`, `assert` or `assume` TAC command that is mapped to the line. For instance, in the Tautology example, we can see that sub-command `other != manager` of line `11` matters (is in the unsat core), whereas `other != e.msg.sender` is not in the unsat core. Also, note that the pane shows e.g. both `other != manager` and `other == manager`; this is because in our TAC representation we encode `other != manager` as `!(other == manager)`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No TAC

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated



### Visualization on TAC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would drop this section. End users should never ever have to look at a TAC dump

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alex gave webinar just a few weeks ago about using our TAC dumps to better understand and resolve timeouts. No one is saying that the end users have to look at a TAC dump; but if they can, it can help them a lot. Moreover, there is even a button (for long time already) to the TAC dump from the HTML report.

  1. Perhaps we should open a broader discussion (including other folks from the company) about allowing/disallowing end users to work with TAC
  2. If we do not want to include TAC in the official documentation "for basic users", is there a place with documentation for "advanced users"?

Copy link
Contributor

@alexandernutz alexandernutz Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I figure we haven't synced on this enough...
My understanding, e.g. with Nurit and Shelly, lately was that we want to cautiously expose Security Engineers, and advanced end users to the graphical TAC dumps (not 100% sure about those groups, perhaps we can define it better).
Shelly has been making them nicer for that reason, and I have for example based some of the timeout feedback on them.

I think we're all in agreement that it would be nicer to lift everything user-facing to spec / source level, but this is, among other things, a developer-time tradeoff (not necessarily an aspect we advertise, but still..).
Another aspect is that TAC is closer to the VC, so for instance when it comes to understanding an SMT timeout, it might be helpful to see "what the solver is seeing" (e.g. when someone thinks they've summarized all nonlinear away, but haven't, they can see it in TAC).

The plan, afaiu, was to communicate a very rough working understanding about the TAC dumps.
As of now, this mostly includes looking at

  • the control flow graph, including the external calls, which can be seen clearly in our graphs,
  • the "back pointers" which appear on hover over some tac statements and show source/spec locations.

We should probably have discussed / communicated this with you Mike in particular earlier -- I guess it didn't make the way across the ocean :-/ ...
(also not against a wider discussion, but I understand it's especially Mike's purview ..)
--> happy to discuss in any combination

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note also that I have an upcoming PR that adds a stub for that TAC dump documentation -- which will be followed up by a PR for that documentation itself ... (actually, I'll make a jira issue right now ..)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

In contrast to `.sol` and `.spec` visualization, we generate separate TAC unsat core visualization for individual rules/methods/invariants. To access it, click first on the particular rule/method/invariant in the `Rules` pane and then on the `Dump page` button as shown in Figure 3.

![](tac-visualization-button.png)
Figure 3: The TAC visualization button.

The visualization here consists only of 2 colors: `green` means that the command is needed for the proof and `red` means that the command is not needed (i.e. not in the unsat core). In particular, Figures 4a and 4b show the TAC visualization for the Tautology example.

![](tautology-tac-a-new.png)
Figure 4a: The visualization of the Tautology example on TAC, part A.

![](tautology-tac-b-new.png)
Figure 4b: The visualization of the Tautology example on TAC, part B.

Only the following TAC commands are in the unsat core:

```
→ require other != manager && other != e.msg.sender
I95 = other287 SPEC
I96 = manager269 SPEC
tmp292293 = I95==I96 SPEC
B97 = !tmp292293 SPEC
...
certoraAssume288289 = B97&&B102 SPEC
assume certoraAssume288289
...
→ assert newManager != other || newManager != manager
I158 = newManager419 SPEC
I159 = other287 SPEC
tmp441442 = I158==I159 SPEC
B160 = !tmp441442 SPEC
I161 = newManager419 SPEC
I162 = manager269 SPEC
tmp449450 = I161==I162 SPEC
B163 = !tmp449450 SPEC
certoraAssert437438 = B160||B163 SPEC
assert certoraAssert437438,
```

Also, notice that some commands in the TAC dump are followed by "SPEC" or "SOL" with a violet background color; these are commands for which we have a mapping to `.spec` and `.sol` files, respectively.

## Basic vs. advanced mode

The `--coverage_info` flag takes three possible values: `none`, `basic` and `advanced`:
1. `none` means no unsat core analysis,
2. `basic` means relatively fast bust possibly very imprecise analysis,
jar-ben marked this conversation as resolved.
Show resolved Hide resolved
3. and `advanced` means possibly slow but more precise analysis.

In particular, the Certora Prover internally does many various transformations of the TAC program before it converts the TAC program to SMT. These transformations usually simplify the TAC program and make the whole verification task easier and faster. However, some of these transformations might lead to loosing a mapping between the initial and the final TAC program, and consequently also loosing a mapping between the SMT encoding and the input `.sol` and `.spec` files. In such case, some lines/commands in the `.spec` and `.sol` visualization might be marked as not needed even if they are actually needed; the other direction, i.e. spurious red lines, is also possible. The `basic` mode keeps all the TAC transformations and hence is faster but possibly more imprecise, whereas the `advanced` mode turns off some of the transformations to make the analysis more precise.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would drop most of the detail in this paragraph and just tell the user what they need to know: the basic coverage mode is approximate and may miscolor some statements, while the advanced coverage mode takes a long time to run.

Copy link
Contributor

@alexandernutz alexandernutz Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me ... it also makes me wonder more generally (also wrt my upcoming DOC PRs):
Is there really no place in the docs for such elaborations :-)? (question to you, Mike)

-- maybe we could have a box (like the "note" and "todo" boxes) that indicates background information that isn't directly related to using the tool, but gives a deeper understanding of what's happening behind the scenes (?)

(sorry for hijacking this PR review a bit -- we can always break out into a zoom or something if this becomes longer)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the information from here and added a small section ## Completeness


## Multiple unsat cores
In general, the SMT formula can have multiple unsat cores; in fact, even exponentially many w.r.t. the number of assertions in the formula. By default, we generate just a single unsat core per rule/method/invariant. Use can use the flag `--prover_args "-numOfUnsatCores <k>"` to instruct the Certora Prover to generate up `k` unsat cores for each rule/method/invariant. At this moment, we support `k > 1` only for the TAC visualizations. In particular,
jar-ben marked this conversation as resolved.
Show resolved Hide resolved
jar-ben marked this conversation as resolved.
Show resolved Hide resolved
the name of the TAC unsat core visualization file has the form
`Report-<rule_name>-<method_name>-unsatCore<i>.html` where `i >= 1` is the index of the unsat core, e.g. `Report-tautology-createFundLPU256RP-unsatCore1.html`. The `Dump page` button always leads to the first unsat core (`i == 1`). To access the additional unsat cores, you can either manually change the HTML address, or download the `zipOutput` as all the unsat core visualizations are stored in the `zipOutput/Reports` folder.
1 change: 1 addition & 0 deletions docs/prover/checking/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ tools that you can use to help find bugs in your specifications.
:caption: Contents

sanity.md
coverage-info.md
injection.md
mutation.md
```
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
jar-ben marked this conversation as resolved.
Show resolved Hide resolved
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/prover/checking/tautology-tac-a-new.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/prover/checking/tautology-tac-b-new.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
5 changes: 4 additions & 1 deletion spelling_wordlist.txt
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,7 @@ hyperproperties
hyperproperty
immutable
immutables
initializations
injective
injectivity
inlined
Expand Down Expand Up @@ -154,5 +155,7 @@ verifiers
walkthrough
whitespace
prover_args
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be in code font (which is not spell checked) instead of added to the word list.

UNSAT
unsat
unsatisfiability
UX
UNSAT