Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC-343: Unsat Cores Visualisation #153

Merged
merged 20 commits into from
Oct 10, 2024
Merged

DOC-343: Unsat Cores Visualisation #153

merged 20 commits into from
Oct 10, 2024

Conversation

jar-ben
Copy link
Contributor

@jar-ben jar-ben commented Oct 11, 2023

@jar-ben jar-ben added the existing feature new documentation for an existing feature label Oct 11, 2023
@jar-ben jar-ben requested a review from a team as a code owner October 11, 2023 12:40
Copy link
Contributor

@alexandernutz alexandernutz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a read-through..

  • looks good to me in general; the explanations and structure make sense to me
  • a few nits and questions

docs/prover/checking/coverage-info.md Outdated Show resolved Hide resolved
The SMT formula is built from a set `A` of *assertions*, say `{A1, A2, ..., An}`, that are built over a set of *variables*. Intuitively, you can see the assertions as mathematical equations. The formula is *satisfiable* if there exists an assignment to the variables that satisfies all the assertions simultaneously. Otherwise, the formula is *unsatisfiable*. In practice, it is often the case that already a small subset of `A` is unsatisfiable. Especially, one can extract a *minimal unsatisfiable subset* `U` of `A`. The *minimality* here means that if you remove any assert from `U` then it becomes satisfiable, i.e., it is not a *minimum cardinality*. We call a minimal unsatisfiable subset of `A` an *unsat core* of `A`.
For example, assume that `A = {a, !a, !b, a || b}` where `a` and `b` are Boolean variables.
There are two minimal unsatisfiable subsets (i.e. unsat cores) of this formula:
`{a, !a}` and `{!a, !b, a || b}`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth it to reference some science-y doc with a formal explanation? If it exists (that also frees you a bit (more) from the need to be super thorough perhaps...).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure whether it is a good idea. There are many academic papers I can reference, however, I am afraid it might be confusing (the papers are dealing with particular problems/applications/algorithms that are different than our.

I think (hope?) that the "definition" I provided is understandable. But I will update the example A = {a, !a, !b, a || b} to contain also an assert that is not in any unsat core! E.g. a || !b

Copy link
Contributor

@alexandernutz alexandernutz Oct 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking more of a script of some lecture or so, or perhaps even a book, if there is one ... (science-y is maybe not the best word ... uhm, didactic, maybe -- university-teaching-y ...)
but yeah, not saying yours isn't rigorous or understandable :-) -- just thought maybe that these two goals at the same time might make things harder than they needed to be here ...

docs/prover/checking/coverage-info.md Outdated Show resolved Hide resolved
docs/prover/checking/coverage-info.md Outdated Show resolved Hide resolved
docs/prover/checking/coverage-info.md Outdated Show resolved Hide resolved


### Visualization on TAC
On contrary to `.sol` and `.spec` visualization, we generate separate TAC unsat core visualization for individual rules/methods/invariants. To access it, click first on the particular rule/method/invariant in the `Rules` pane and then on the `Dump page` button as shown in Figure 3.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"On contrary to" ~> "In contrast to"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"rules/methods/invariants"
I think we should think about some terminology here, globally -- maybe @mdgeorge4153 already has one?
--> what's the thing that corresponds to a single TAC program / verification condition?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: in my alex/timeouts-update branch I'm adding a file docs/prover/diagnosis/index.md, which has a stub for the explanation of TAC reports
--> you could cherry pick and link here (link by [link text](tac-reports) )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you expect your branch will not get merged sooner than mine? :) I will probably wait and update it later. Moreover, I guess there are many places in our documentation that could enjoy a link to the TAC reports explanation, so perhaps we can do a PR just for that when you merge it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as you like, but yours is closer to completion -- mine is ready for review "soon", I hope

![](tac-visualization-button.png)
Figure 3: The TAC visualization button.

The visualization here consists only of 2 colors: `green` means that the command is needed and `red` means that the command is not needed (i.e. not in the unsat core). In particular, Figures 4a and 4b show the TAC visualization for the Tautology example.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps "needed" ~> "needed for the proof" in the first instance

docs/prover/checking/coverage-info.md Outdated Show resolved Hide resolved
docs/prover/checking/coverage-info.md Outdated Show resolved Hide resolved
docs/prover/checking/coverage-info.md Outdated Show resolved Hide resolved
Copy link
Contributor

@mdgeorge4153 mdgeorge4153 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good draft, but it contains a lot of information that isn't appropriate for our end users.

docs/prover/checking/coverage-info.md Outdated Show resolved Hide resolved
docs/prover/checking/coverage-info.md Outdated Show resolved Hide resolved
docs/prover/checking/coverage-info.md Outdated Show resolved Hide resolved
docs/prover/checking/coverage-info.md Outdated Show resolved Hide resolved
docs/prover/checking/coverage-info.md Outdated Show resolved Hide resolved

Furthermore, if we have multiple rules/invariants or a parametric rule (such as our `tautology`), we can also have multiple rules/invariants mapping to a single `.sol` or `.spec` line. That is, we generate just a single, joint, visualization for all the rules/invariants (run with `--rule` and `--method` flags to get a visualization for a single rule/method/invariant). In such a case, a yellow line means that some of the commands on the line are needed to prove some of the rules/methods/invariants and some of the commands are not needed.

The right pane provides detailed information about individual lines and commands mapped to the lines group by the rule/method/invariant name (denoted *rule*) and the *value*, where each *value* refers to a particular `assign`, `assert` or `assume` TAC command that is mapped to the line. For instance, in the Tautology example, we can see that sub-command `other != manager` of line `11` matters (is in the unsat core), whereas `other != e.msg.sender` is not in the unsat core. Also, note that the pane shows e.g. both `other != manager` and `other == manager`; this is because in our TAC representation we encode `other != manager` as `!(other == manager)`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No TAC

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

The right pane provides detailed information about individual lines and commands mapped to the lines group by the rule/method/invariant name (denoted *rule*) and the *value*, where each *value* refers to a particular `assign`, `assert` or `assume` TAC command that is mapped to the line. For instance, in the Tautology example, we can see that sub-command `other != manager` of line `11` matters (is in the unsat core), whereas `other != e.msg.sender` is not in the unsat core. Also, note that the pane shows e.g. both `other != manager` and `other == manager`; this is because in our TAC representation we encode `other != manager` as `!(other == manager)`.


### Visualization on TAC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would drop this section. End users should never ever have to look at a TAC dump

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alex gave webinar just a few weeks ago about using our TAC dumps to better understand and resolve timeouts. No one is saying that the end users have to look at a TAC dump; but if they can, it can help them a lot. Moreover, there is even a button (for long time already) to the TAC dump from the HTML report.

  1. Perhaps we should open a broader discussion (including other folks from the company) about allowing/disallowing end users to work with TAC
  2. If we do not want to include TAC in the official documentation "for basic users", is there a place with documentation for "advanced users"?

Copy link
Contributor

@alexandernutz alexandernutz Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I figure we haven't synced on this enough...
My understanding, e.g. with Nurit and Shelly, lately was that we want to cautiously expose Security Engineers, and advanced end users to the graphical TAC dumps (not 100% sure about those groups, perhaps we can define it better).
Shelly has been making them nicer for that reason, and I have for example based some of the timeout feedback on them.

I think we're all in agreement that it would be nicer to lift everything user-facing to spec / source level, but this is, among other things, a developer-time tradeoff (not necessarily an aspect we advertise, but still..).
Another aspect is that TAC is closer to the VC, so for instance when it comes to understanding an SMT timeout, it might be helpful to see "what the solver is seeing" (e.g. when someone thinks they've summarized all nonlinear away, but haven't, they can see it in TAC).

The plan, afaiu, was to communicate a very rough working understanding about the TAC dumps.
As of now, this mostly includes looking at

  • the control flow graph, including the external calls, which can be seen clearly in our graphs,
  • the "back pointers" which appear on hover over some tac statements and show source/spec locations.

We should probably have discussed / communicated this with you Mike in particular earlier -- I guess it didn't make the way across the ocean :-/ ...
(also not against a wider discussion, but I understand it's especially Mike's purview ..)
--> happy to discuss in any combination

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note also that I have an upcoming PR that adds a stub for that TAC dump documentation -- which will be followed up by a PR for that documentation itself ... (actually, I'll make a jira issue right now ..)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

### Tautology example
Assume the following CVL rule called `tautology`.

```cvl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to include this in the Examples repository and link to it so that people can run it themselves (and so we can rerun if the documentation needs to change).

2. `basic` means relatively fast bust possibly very imprecise analysis,
3. and `advanced` means possibly slow but more precise analysis.

In particular, the Certora Prover internally does many various transformations of the TAC program before it converts the TAC program to SMT. These transformations usually simplify the TAC program and make the whole verification task easier and faster. However, some of these transformations might lead to loosing a mapping between the initial and the final TAC program, and consequently also loosing a mapping between the SMT encoding and the input `.sol` and `.spec` files. In such case, some lines/commands in the `.spec` and `.sol` visualization might be marked as not needed even if they are actually needed; the other direction, i.e. spurious red lines, is also possible. The `basic` mode keeps all the TAC transformations and hence is faster but possibly more imprecise, whereas the `advanced` mode turns off some of the transformations to make the analysis more precise.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would drop most of the detail in this paragraph and just tell the user what they need to know: the basic coverage mode is approximate and may miscolor some statements, while the advanced coverage mode takes a long time to run.

Copy link
Contributor

@alexandernutz alexandernutz Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me ... it also makes me wonder more generally (also wrt my upcoming DOC PRs):
Is there really no place in the docs for such elaborations :-)? (question to you, Mike)

-- maybe we could have a box (like the "note" and "todo" boxes) that indicates background information that isn't directly related to using the tool, but gives a deeper understanding of what's happening behind the scenes (?)

(sorry for hijacking this PR review a bit -- we can always break out into a zoom or something if this becomes longer)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the information from here and added a small section ## Completeness

docs/prover/checking/coverage-info.md Outdated Show resolved Hide resolved
Copy link
Contributor

@mdgeorge4153 mdgeorge4153 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! A few minor edits but it is very nicely done.

Comment on lines +18 to +20
the CVL properties. If some of the input `.sol` commands are not relevant for
deriving the proof, it might indicate that the specification does not cover all
behavior implemented in the smart contract. If some of the input `.spec`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd generally refer to .sol files or Solidity commands but not .sol commands. Similarly ".spec commands" should be "CVL commands"

Consider the following CVL rule called `tautology`:

```cvl
rule tautology(uint256 fundId, method f) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a complex enough example that I'd consider turning on line numbers and referring to them in the text below (e.g. "we required other != manager on line 5").

To do so, I think the syntax is

```{code-block} cvl
:linenos:

rule tautology(...) {
 ...
}
```

```

Notice that we required that `other != manager` and hence the `assert` is
necessarily `true` (see the concept of a {term}`tautology`). The executions of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
necessarily `true` (see the concept of a {term}`tautology`). The executions of
necessarily `true`, so the rule is a {term}`tautology`. The executions of


The right pane provides detailed information about individual values mapped to
the lines grouped by the rule/method/invariant name (denoted *rule*). For
instance, in the Tautology example, we can see that the value of the command
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

either "tautology example" (lowercase) or "tautology example" (naming the rule).


The right pane provides detailed information about individual values mapped to
the lines grouped by the rule/method/invariant name (denoted *rule*). For
instance, in the Tautology example, we can see that the value of the command
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
instance, in the Tautology example, we can see that the value of the command
instance, in the Tautology example, we can see that the value of the expression

the analysis to the `.sol` and `.spec` files. Unfortunately, due to the
compilation of the solidity code to EVM bytecode, we cannot maintain a complete
mapping between commands from solidity and commands in our SMT representation.
Consequently, the visualization on `.sol` files can be very limited.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth an extra sentence saying how it is limited. Can it fail to highlight a relevant line, highlight an irrelevant line, both, something else?

1. The left pane shows *per-line* visualization.
2. The right pane shows detailed info about individual visualized lines.

![Example Coverage Info Visualization](tautology-sol-and-spec-cropped.png)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would consider shrinking down the left column more so that the entire image can be a little larger; the text is hard to read. This isn't necessary though because I don't think the text really requires the user to read the code.

@@ -159,3 +161,8 @@ verifier
verifiers
walkthrough
whitespace
prover_args
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be in code font (which is not spell checked) instead of added to the word list.

@@ -106,6 +107,7 @@ precompiled
preprocess
preprocessing
prestate
prover
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prover should always be capitalized (and will pass spell checking if it is)

UNSAT
unsat
unsatisfiability
UX
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you should need UX?

@urikirsh
Copy link
Contributor

urikirsh commented Sep 4, 2024

Going over the stale open doc PRs. @jar-ben is this still relevant? If not, can you please close it?

@jar-ben
Copy link
Contributor Author

jar-ben commented Sep 4, 2024

It is still relevant.

@urikirsh
Copy link
Contributor

urikirsh commented Sep 4, 2024

@alexandernutz @dominik-velan can you please review?
Thanks!

Copy link

@dominik-velan dominik-velan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

There is just one thing that might be included - the usage of the unsat cores when the job times out (I guess that feature did not exist when this PR was created).

@urikirsh
Copy link
Contributor

urikirsh commented Oct 6, 2024

Can I merge this PR?

@jar-ben
Copy link
Contributor Author

jar-ben commented Oct 7, 2024

Let me read it one more time :)

@jar-ben jar-ben merged commit b13e083 into master Oct 10, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
existing feature new documentation for an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants