Fix: Error wording closer to semantics #676

MikaelMayer · 2023-01-06T22:02:57Z

This PR is the companion of dafny-lang/dafny#3324.
We established numerous times this year that the wording "might not hold" still incorrectly gives the impression that the postcondition is incorrect, when the real problem is often that some proof hints are missing, which is what we want our users to focus on.
So, rather than focusing on whether assertions hold or not, this PR fixes the message by focusing on the fact that the verifier was not able to prove the assertion. Proof now becomes a first-class goal.

There was the option of starting all errors with "Could not prove", but then the first three words of every error message would never be informative, which is not very friendly. The passive voice solves this problem and is less of a disruption in the error messages.

rakamaric · 2023-01-06T22:14:01Z

Source/Core/ProofObligationDescription.cs

@@ -32,75 +32,75 @@ public abstract class ProofObligationDescription {

 public class AssertionDescription : ProofObligationDescription
 {
-  public override string SuccessDescription => "This assertion holds.";
+  public override string SuccessDescription => "this assertion holds";


If we are going with "proven" to stress that this is about proofs, then why not also "proven" instead of "holds"?
Like "Boogie proved this assertion".

Same goes for other such patterns.

That's a good point. Note that the success message is for now only displayed on rare occasions, in some tooling and while hovering over Dafny programs in VSCode. Since there is no call to action, "holds" means the same as "was proven", but it's shorter.

rakamaric · 2023-01-06T22:15:26Z

Source/UnitTests/ExecutionEngineTests/ExecutionEngineTest.cs

 Execution trace:
    fakeFilename1(3,3): anon0
-fakeFilename1(8,3): Error: This assertion might not hold.
+fakeFilename1(8,3): Error: this assertion could not be proven


I am also wondering if active voice would be nicer in all of these. So something like: "Boogie could not prove this assertion". Personally, I would prefer such active voice to current passive voice.

Me too, but after looking at the error messages, the first 4 words of "Boogie could not prove " are useless for knowing what the error is about, whereas keeping "this assertion" as the subject makes it much clearer what is in focus.

atomb · 2023-01-06T22:16:25Z

This looks good overall, but it looks like the tests still need to be updated to account for the output changes.

rakamaric · 2023-01-06T22:23:12Z

As a side note, this will be a breaking change for (some) downstream tools. For example, SMACK relies on these messages to figure out the verification status. Not that we should not make this update, but just wanted to mention this.

MikaelMayer · 2023-01-06T22:36:17Z

As a side note, this will be a breaking change for (some) downstream tools. For example, SMACK relies on these messages to figure out the verification status. Not that we should not make this update, but just wanted to mention this.

Indeed, this is also a breaking change for Dafny. The only reason I'm pushing for this is that last year we went from "assertion violation" to "assertion might not hold", which was already an improvement (and we did not get pushback); yet new users are still puzzled and first try to figure out why their assertion is wrong when all they need is to figure out how to actually prove it.
So I think this change is going to be extremely useful to avoid this problem.
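
For downstream tools that match on these messages, one way to ride out the rewording is to accept both the old and the new phrasing. The following is a minimal sketch only: `classify_line` and both regular expressions are hypothetical helpers written for illustration, not code from Boogie, Dafny, or SMACK, and they cover only the assertion wordings quoted in this thread (including the "proved" spelling adopted in later commits).

```python
import re

# Location prefix as it appears in the test output quoted in this thread,
# e.g. "fakeFilename1(8,3): Error: <message>". (Assumed format, for illustration.)
ERROR_LINE = re.compile(
    r"^(?P<file>.+)\((?P<line>\d+),(?P<col>\d+)\): Error: (?P<msg>.*)$"
)

# Old and new wordings of the assertion failure; case and the trailing
# period are ignored because the PR changes both.
ASSERTION_FAILURE = re.compile(
    r"^this assertion (might not hold|could not be (proven|proved))\.?$",
    re.IGNORECASE,
)

def classify_line(text):
    """Return (file, line, col, kind) for an error line, or None otherwise."""
    m = ERROR_LINE.match(text.strip())
    if m is None:
        return None
    kind = "assertion" if ASSERTION_FAILURE.match(m.group("msg")) else "other"
    return (m.group("file"), int(m.group("line")), int(m.group("col")), kind)
```

Matching on the whole message rather than on a fixed prefix keeps such a script working across the wording change, at the cost of having to list each known wording explicitly.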

shazqadeer · 2023-01-07T19:18:28Z

@MikaelMayer : I don't quite see how changing the words from "does not hold" to "could not prove" or other such variations will actually help a Boogie or Dafny user decide on the next step. I do agree that "could not prove" is more accurate so I do not object to making this change.

bkragl · 2023-01-10T12:53:41Z

Bikeshedding here, but I think that "proved" is better than "proven" (although both are considered correct). We could consult a particular style guide, but let's use metrics instead: one less syllable to pronounce and more Google hits :)

MikaelMayer · 2023-01-10T22:04:09Z

In the last commit, I'm fixing the last two CI errors. One is puzzling to me.
9eb5120
An error message is displayed 10 lines earlier on another assertion.
I don't understand how this is possible because I only changed error messages. I'm investigating though.

MikaelMayer · 2023-01-10T22:07:27Z

In the last commit, I'm fixing the last two CI errors. One is puzzling to me. 9eb5120 An error message is displayed 10 lines earlier on another assertion. I don't understand how this is possible because I only changed error messages. I'm investigating though.

OK, it looks like the test is a parallel test, so this isn't a real error; it's more of a stability issue. It looks like it's deterministic, though, so I think changing the test output makes sense.

atomb · 2023-01-11T23:16:03Z

Source/Core/ProofObligationDescription.cs


  public override string FailureDescription =>
-    "This loop invariant might not be maintained by the loop.";
+    "this loop invariant might not be maintained by the loop";


This message still contains the "might not" phrasing, as opposed to "could not be proved". Unfortunately, using that wording makes it pretty awkward: "this loop invariant could not be proved to be maintained by the loop".

What about: "this invariant could not be proved within the loop"?
Or, a shorter version of yours, "this invariant could not be proved to be maintained by the loop"

In the current PR, I used the second one.

Yeah, I like that one, too.

atomb · 2023-01-11T23:17:15Z

It looks like there are two test files remaining to update.

atomb

Looks good!

keyboardDrummer · 2023-01-13T14:01:31Z

Looks good to me too but let's get a few more approvals on this

MikaelMayer · 2023-01-13T22:45:41Z

@MikaelMayer : I don't quite see how changing the words from "does not hold" to "could not prove" or other such variations will actually help a Boogie or Dafny user decide on the next step. I do agree that "could not prove" is more accurate so I do not object to making this change.

I don't know much about Boogie users, but I can tell you more about the Dafny user experience. As you outlined, "could not prove" is more accurate, and that's already the first step.
What happened with new Dafny users is that, as soon as they encountered "might not hold", they immediately understood "is probably wrong". This has several possible causes:

Users without a verification background usually don't know what a "false negative" is.
In most compiler tools used by software engineers, if an error is reported, it is accepted that the user needs to fix something and that there is something intrinsically wrong with their code. They read the doc and the error, and they fix it. Their code was broken; now it can be compiled.
Even type annotations are perceived as a necessity in most languages that do not have good type inference, and the error messages are clear ("X is used as Y but its inferred type was Z."), so users know that they need to add a type annotation or even fix their code. No big deal.

Proofs are very much like type annotations, except that the type-checking system is incomplete.
So imagine if users were presented with an error on a type annotation (expression: X) like this:

"The provided type X of expression might be wrong"

The first idea users would have is that, since the compiler says so, : X is the wrong annotation and the type of the expression is different. It wouldn't come to mind that "might" leaves room for the annotation being correct and merely needing hints, because adding hints is not something users do for type checking, except in obvious places.

So when we provide an error message like "might not hold", we face the same problem. As users aren't used to providing hints, they might erroneously believe that the compiler is clever and is just being polite, and that something is indeed wrong.

With "could not prove", not only is it more accurate, but the subject of "prove" is Dafny itself. It says nothing about the proposition being false, only that Dafny could not find a proof. This is the implicit invitation we want Dafny users to receive.

Hope it helps! Thanks for the review.

shazqadeer · 2023-01-14T17:33:00Z

@MikaelMayer : I appreciate the detailed explanation. Thanks.

MikaelMayer requested a review from atomb January 6, 2023 22:02

MikaelMayer mentioned this pull request Jan 6, 2023

Fix: Wording of assertion failure closer to semantics dafny-lang/dafny#3324

Open

rakamaric reviewed Jan 6, 2023

Fixed tests

a402443

Merge branch 'master' into fix-3216-could-not-prove

c681d76

MikaelMayer added 6 commits January 10, 2023 09:18

proven => proved. Fixed tests

d2e15a3

Fixed one more en->ed

7641d24

Fixed a test

1dbd53a

Fixed all the boogie files with messages inline

7f0193a

Fixed 3 more tests

32600c9

Fixed last 2 CI errors

9eb5120

MikaelMayer requested a review from rakamaric January 10, 2023 22:29

Merge branch 'master' into fix-3216-could-not-prove

0a287f9

atomb reviewed Jan 11, 2023

MikaelMayer added 2 commits January 12, 2023 09:19

Support for the last two CI tests

11c0598

Wording for invariants as well

6fbfafd

atomb approved these changes Jan 12, 2023

Merge branch 'master' into fix-3216-could-not-prove

9ae9e3c

MikaelMayer requested review from shazqadeer and bkragl January 13, 2023 15:18

shazqadeer approved these changes Jan 13, 2023

Merge branch 'master' into fix-3216-could-not-prove

ee8cb55

shazqadeer merged commit 4166fbb into master Jan 14, 2023

MikaelMayer deleted the fix-3216-could-not-prove branch January 17, 2023 15:06
