-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC-343: Unsat Cores Visualisation #153
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had a read-through..
- looks good to me in general; the explanations and structure make sense to me
- a few nits and questions
The SMT formula is built from a set `A` of *assertions*, say `{A1, A2, ..., An}`, that are built over a set of *variables*. Intuitively, you can see the assertions as mathematical equations. The formula is *satisfiable* if there exists an assignment to the variables that satisfies all the assertions simultaneously. Otherwise, the formula is *unsatisfiable*. In practice, it is often the case that already a small subset of `A` is unsatisfiable. Especially, one can extract a *minimal unsatisfiable subset* `U` of `A`. The *minimality* here means that if you remove any assert from `U` then it becomes satisfiable, i.e., it is not a *minimum cardinality*. We call a minimal unsatisfiable subset of `A` an *unsat core* of `A`. | ||
For example, assume that `A = {a, !a, !b, a || b}` where `a` and `b` are Boolean variables. | ||
There are two minimal unsatisfiable subsets (i.e. unsat cores) of this formula: | ||
`{a, !a}` and `{!a, !b, a || b}`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth it to reference some science-y doc with a formal explanation? If it exists (that also frees you a bit (more) from the need to be super thorough perhaps...).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure whether it is a good idea. There are many academic papers I can reference, however, I am afraid it might be confusing (the papers are dealing with particular problems/applications/algorithms that are different than our.
I think (hope?) that the "definition" I provided is understandable. But I will update the example A = {a, !a, !b, a || b}
to contain also an assert that is not in any unsat core! E.g. a || !b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking more of a script of some lecture or so, or perhaps even a book, if there is one ... (science-y is maybe not the best word ... uhm, didactic, maybe -- university-teaching-y ...)
but yeah, not saying yours isn't rigorous or understandable :-) -- just thought maybe that these two goals at the same time might make things harder than they needed to be here ...
|
||
|
||
### Visualization on TAC | ||
On contrary to `.sol` and `.spec` visualization, we generate separate TAC unsat core visualization for individual rules/methods/invariants. To access it, click first on the particular rule/method/invariant in the `Rules` pane and then on the `Dump page` button as shown in Figure 3. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"On contrary to" ~> "In contrast to"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"rules/methods/invariants"
I think we should think about some terminology here, globally -- maybe @mdgeorge4153 already has one?
--> what's the thing that corresponds to a single TAC program / verification condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: in my alex/timeouts-update
branch I'm adding a file docs/prover/diagnosis/index.md
, which has a stub for the explanation of TAC reports
--> you could cherry pick and link here (link by [link text](tac-reports)
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you expect your branch will not get merged sooner than mine? :) I will probably wait and update it later. Moreover, I guess there are many places in our documentation that could enjoy a link to the TAC reports explanation, so perhaps we can do a PR just for that when you merge it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as you like, but yours is closer to completion -- mine is ready for review "soon", I hope
![](tac-visualization-button.png) | ||
Figure 3: The TAC visualization button. | ||
|
||
The visualization here consists only of 2 colors: `green` means that the command is needed and `red` means that the command is not needed (i.e. not in the unsat core). In particular, Figures 4a and 4b show the TAC visualization for the Tautology example. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps "needed" ~> "needed for the proof" in the first instance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good draft, but it contains a lot of information that isn't appropriate for our end users.
|
||
Furthermore, if we have multiple rules/invariants or a parametric rule (such as our `tautology`), we can also have multiple rules/invariants mapping to a single `.sol` or `.spec` line. That is, we generate just a single, joint, visualization for all the rules/invariants (run with `--rule` and `--method` flags to get a visualization for a single rule/method/invariant). In such a case, a yellow line means that some of the commands on the line are needed to prove some of the rules/methods/invariants and some of the commands are not needed. | ||
|
||
The right pane provides detailed information about individual lines and commands mapped to the lines group by the rule/method/invariant name (denoted *rule*) and the *value*, where each *value* refers to a particular `assign`, `assert` or `assume` TAC command that is mapped to the line. For instance, in the Tautology example, we can see that sub-command `other != manager` of line `11` matters (is in the unsat core), whereas `other != e.msg.sender` is not in the unsat core. Also, note that the pane shows e.g. both `other != manager` and `other == manager`; this is because in our TAC representation we encode `other != manager` as `!(other == manager)`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No TAC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
The right pane provides detailed information about individual lines and commands mapped to the lines group by the rule/method/invariant name (denoted *rule*) and the *value*, where each *value* refers to a particular `assign`, `assert` or `assume` TAC command that is mapped to the line. For instance, in the Tautology example, we can see that sub-command `other != manager` of line `11` matters (is in the unsat core), whereas `other != e.msg.sender` is not in the unsat core. Also, note that the pane shows e.g. both `other != manager` and `other == manager`; this is because in our TAC representation we encode `other != manager` as `!(other == manager)`. | ||
|
||
|
||
### Visualization on TAC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would drop this section. End users should never ever have to look at a TAC dump
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alex gave webinar just a few weeks ago about using our TAC dumps to better understand and resolve timeouts. No one is saying that the end users have to look at a TAC dump; but if they can, it can help them a lot. Moreover, there is even a button (for long time already) to the TAC dump from the HTML report.
- Perhaps we should open a broader discussion (including other folks from the company) about allowing/disallowing end users to work with TAC
- If we do not want to include TAC in the official documentation "for basic users", is there a place with documentation for "advanced users"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I figure we haven't synced on this enough...
My understanding, e.g. with Nurit and Shelly, lately was that we want to cautiously expose Security Engineers, and advanced end users to the graphical TAC dumps (not 100% sure about those groups, perhaps we can define it better).
Shelly has been making them nicer for that reason, and I have for example based some of the timeout feedback on them.
I think we're all in agreement that it would be nicer to lift everything user-facing to spec / source level, but this is, among other things, a developer-time tradeoff (not necessarily an aspect we advertise, but still..).
Another aspect is that TAC is closer to the VC, so for instance when it comes to understanding an SMT timeout, it might be helpful to see "what the solver is seeing" (e.g. when someone thinks they've summarized all nonlinear away, but haven't, they can see it in TAC).
The plan, afaiu, was to communicate a very rough working understanding about the TAC dumps.
As of now, this mostly includes looking at
- the control flow graph, including the external calls, which can be seen clearly in our graphs,
- the "back pointers" which appear on hover over some tac statements and show source/spec locations.
We should probably have discussed / communicated this with you Mike in particular earlier -- I guess it didn't make the way across the ocean :-/ ...
(also not against a wider discussion, but I understand it's especially Mike's purview ..)
--> happy to discuss in any combination
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note also that I have an upcoming PR that adds a stub for that TAC dump documentation -- which will be followed up by a PR for that documentation itself ... (actually, I'll make a jira issue right now ..)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
### Tautology example | ||
Assume the following CVL rule called `tautology`. | ||
|
||
```cvl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to include this in the Examples
repository and link to it so that people can run it themselves (and so we can rerun if the documentation needs to change).
2. `basic` means relatively fast bust possibly very imprecise analysis, | ||
3. and `advanced` means possibly slow but more precise analysis. | ||
|
||
In particular, the Certora Prover internally does many various transformations of the TAC program before it converts the TAC program to SMT. These transformations usually simplify the TAC program and make the whole verification task easier and faster. However, some of these transformations might lead to loosing a mapping between the initial and the final TAC program, and consequently also loosing a mapping between the SMT encoding and the input `.sol` and `.spec` files. In such case, some lines/commands in the `.spec` and `.sol` visualization might be marked as not needed even if they are actually needed; the other direction, i.e. spurious red lines, is also possible. The `basic` mode keeps all the TAC transformations and hence is faster but possibly more imprecise, whereas the `advanced` mode turns off some of the transformations to make the analysis more precise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would drop most of the detail in this paragraph and just tell the user what they need to know: the basic coverage mode is approximate and may miscolor some statements, while the advanced coverage mode takes a long time to run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me ... it also makes me wonder more generally (also wrt my upcoming DOC PRs):
Is there really no place in the docs for such elaborations :-)? (question to you, Mike)
-- maybe we could have a box (like the "note" and "todo" boxes) that indicates background information that isn't directly related to using the tool, but gives a deeper understanding of what's happening behind the scenes (?)
(sorry for hijacking this PR review a bit -- we can always break out into a zoom or something if this becomes longer)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the information from here and added a small section ## Completeness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! A few minor edits but it is very nicely done.
the CVL properties. If some of the input `.sol` commands are not relevant for | ||
deriving the proof, it might indicate that the specification does not cover all | ||
behavior implemented in the smart contract. If some of the input `.spec` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd generally refer to .sol
files or Solidity commands but not .sol
commands. Similarly ".spec
commands" should be "CVL commands"
Consider the following CVL rule called `tautology`: | ||
|
||
```cvl | ||
rule tautology(uint256 fundId, method f) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a complex enough example that I'd consider turning on line numbers and referring to them in the text below (e.g. "we required other != manager
on line 5").
To do so, I think the syntax is
```{code-block} cvl
:linenos:
rule tautology(...) {
...
}
```
``` | ||
|
||
Notice that we required that `other != manager` and hence the `assert` is | ||
necessarily `true` (see the concept of a {term}`tautology`). The executions of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
necessarily `true` (see the concept of a {term}`tautology`). The executions of | |
necessarily `true`, so the rule is a {term}`tautology`. The executions of |
|
||
The right pane provides detailed information about individual values mapped to | ||
the lines grouped by the rule/method/invariant name (denoted *rule*). For | ||
instance, in the Tautology example, we can see that the value of the command |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
either "tautology example" (lowercase) or "tautology
example" (naming the rule).
|
||
The right pane provides detailed information about individual values mapped to | ||
the lines grouped by the rule/method/invariant name (denoted *rule*). For | ||
instance, in the Tautology example, we can see that the value of the command |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instance, in the Tautology example, we can see that the value of the command | |
instance, in the Tautology example, we can see that the value of the expression |
the analysis to the `.sol` and `.spec` files. Unfortunately, due to the | ||
compilation of the solidity code to EVM bytecode, we cannot maintain a complete | ||
mapping between commands from solidity and commands in our SMT representation. | ||
Consequently, the visualization on `.sol` files can be very limited. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth an extra sentence saying how it is limited. Can it fail to highlight a relevant line, highlight an irrelevant line, both, something else?
1. The left pane shows *per-line* visualization. | ||
2. The right pane shows detailed info about individual visualized lines. | ||
|
||
![Example Coverage Info Visualization](tautology-sol-and-spec-cropped.png) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would consider shrinking down the left column more so that the entire image can be a little larger; the text is hard to read. This isn't necessary though because I don't think the text really requires the user to read the code.
spelling_wordlist.txt
Outdated
@@ -159,3 +161,8 @@ verifier | |||
verifiers | |||
walkthrough | |||
whitespace | |||
prover_args |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be in code font (which is not spell checked) instead of added to the word list.
spelling_wordlist.txt
Outdated
@@ -106,6 +107,7 @@ precompiled | |||
preprocess | |||
preprocessing | |||
prestate | |||
prover |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prover should always be capitalized (and will pass spell checking if it is)
spelling_wordlist.txt
Outdated
UNSAT | ||
unsat | ||
unsatisfiability | ||
UX |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you should need UX?
Going over the stale open doc PRs. @jar-ben is this still relevant? If not, can you please close it? |
It is still relevant. |
@alexandernutz @dominik-velan can you please review? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There is just one thing that might be included - the usage of the unsat cores when the job times out (I guess that feature did not exist when this PR was created).
Can I merge this PR? |
Let me read it one more time :) |
Jira ticket: https://certora.atlassian.net/browse/DOC-343
Link to generated documentation: https://certora-certora-prover-documentation--153.com.readthedocs.build/en/153/docs/prover/checking/coverage-info.html