-
Notifications
You must be signed in to change notification settings - Fork 269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CONTRACTS: allow pointer predicates to fail in assume
contexts
#8562
base: develop
Are you sure you want to change the base?
CONTRACTS: allow pointer predicates to fail in assume
contexts
#8562
Conversation
regression/contracts-dfcc/test_is_fresh_enforce_requires_disjunction_pass/test.desc
Outdated
Show resolved
Hide resolved
@tautschnig @qinheping I added more regression tests to cover contract replacement mode and the pointer_in_range_dfcc predicate. This one still worked as expected. I think one of the limitations of our regression testing process that allowed this bug to go unnoticed for so long is that we do not collect coverage metrics for the CPROVER library code. The code we add in the CPROVER library is only symbolically executed by CBMC when analyzing a GOTO model and we do not collect rechability/coverage metrics for that type of instrumentation code. |
The non-determinism causes a big performance regression on |
I've narrowed the performance regression to this. We start from this function contract and loop contract for a memcpy function : void foo(char *dst, const char *src, size_t n)
// clang-format off
__CPROVER_requires(__CPROVER_is_fresh(src, n))
__CPROVER_requires(__CPROVER_is_fresh(dst, n))
__CPROVER_assigns(__CPROVER_object_from(dst))
__CPROVER_ensures(__CPROVER_forall {
size_t j;
j < n ==> dst[j] == src[j]
})
// clang-format on
{
for(size_t i = 0; i < n; i++)
// clang-format off
__CPROVER_assigns(i, __CPROVER_object_from(dst))
__CPROVER_loop_invariant(i <= n)
__CPROVER_loop_invariant(
__CPROVER_forall { size_t j; j < i ==> dst[j] == src[j] })
// clang-format on
{
dst[i] = src[i];
}
} With the changes introduced in this PR char *src;
bool allocate_src = nondet_bool();
if(allocate_src)
{
src = malloc(n);
__CPROVER_assume(src);
}
__CPROVER_assume(allocate_src);
// value_set(src) = {INVALID, &some_object} with invalid unreachable Whereas before it would always succeed and be encoded as char *src;
bool allocate_src = true;
if(allocate_src)
{
src = malloc(n);
__CPROVER_assume(src);
}
__CPROVER_assume(allocate_src);
// value_set(src) = {&some_object} With nondeterminism + assumption the runtime blows up, with the deterministic allocation, the model is problem is solved instantly. I think the problem might be that in the non-deterministic case, the value set for Interestingly, if I default-initialize char *src = NULL;
bool allocate_src = true;
if(allocate_src)
{
src = malloc(n);
__CPROVER_assume(src);
}
__CPROVER_assume(allocate_src);
// value_set(src) = {NULL, &some_object} with NULL unreachable @tautschnig is there any way we could prune value sets for pointers based on assumptions ? (I doubt it because branch conditions are simply forgotten when merging value sets but you may have ideas ?) Here is the model obtained after applying function and loop contracts is the following: #include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
bool nondet_bool();
int main()
{
size_t n;
__CPROVER_assume(n > 1);
__CPROVER_assume(n < __CPROVER_max_malloc_size);
char *src;
char *dst;
#if NULL_INIT
src = NULL;
dst = NULL;
#endif
// encoding of assume(__CPROVER_is_fresh(src, n))
#if NONDET_ALLOC
// nondet allocation followed by assumptions
bool allocate_src = nondet_bool();
#else
bool allocate_src = true;
#endif
if(allocate_src)
{
src = malloc(n);
__CPROVER_assume(src);
}
__CPROVER_assume(allocate_src);
// encoding of assume(__CPROVER_is_fresh(dst, n))
#if NONDET_ALLOC
bool allocate_dst = nondet_bool();
#else
bool allocate_dst = true;
#endif
if(allocate_dst)
{
dst = malloc(n);
__CPROVER_assume(dst);
}
__CPROVER_assume(allocate_dst);
// encoding of inductive check for loop contract
// loop counter
size_t i = 0;
// assert loop invariant base case
assert(i <= n);
assert(__CPROVER_forall {
size_t j;
j<i ==> dst[j] == src[j]
});
// havoc lop step
__CPROVER_havoc_object(&i);
__CPROVER_havoc_object(dst);
// assume loop inv
__CPROVER_assume(i <= n);
__CPROVER_assume(__CPROVER_forall {
size_t j;
j<i ==> dst[j] == src[j]
});
// loop step
if(i < n)
{
dst[i] = src[i];
i++;
// assert loop invariant step
assert(i <= n);
assert(__CPROVER_forall {
size_t j;
j<i ==> dst[j] == src[j]
});
__CPROVER_assume(false);
}
// assert post condition : dst is equal to src
assert(__CPROVER_forall {
size_t j;
j<n ==> dst[j] == src[j]
});
return 0;
} |
f1c90c7
to
59f706c
Compare
UPDATE: When we enforce a requires clause or replace an ensures clause that assumes some predicate holds on a pointer, we use the following code pattern:
The havoc operation adds all pointers known to symex to the value set of the pointer (heap allocated objects, address-taken local objects, etc.). This can make the value set extremely large. I don't think there's an easy way to prune the write set using the assumption without invoking a SAT solver, since symex does not statically track the relation bewtween branch conditions and value set contents. The only thing I've tried which seems to mitigate the issue is to make pointers INVALID (i.e.: empty value set, nondet bit pattern) when pointer predicates evaluate to false in an assumption context. The resulting pointers cannot be dereferenced and equality with other pointers can always be refuted. This means that no useful pointer can be gained from assuming that a pointer predicate is false. @qinheping @tautschnig would that restore both performance and preserve soudness ? If a contract has a bad precondition where in some case no pointer predicate holds for some pointer, the function would not be able to use said pointer without triggering INVALID pointer use error. |
UPDATE: // 78 file main-manual.c line 71 function main with NULL init
(165) a2!0@1#2 == ((forall {
__CPROVER_size_t j!0@12#0;
j!0@12#0 >= i!0@1#4 ||
LET derefd_pointer$5!0#0=dst!0@1#4 + (signed long int)j!0@12#0 IN
__CPROVER_POINTER_OBJECT(derefd_pointer$5!0#0) == __CPROVER_POINTER_OBJECT(&dynamic_object$0) ?
(char)dynamic_object$0#5[(signed long int)__CPROVER_POINTER_OFFSET(derefd_pointer$5!0#0)] :
//// HERE /////
invalid_object$6!0#0
///////////////
==
LET derefd_pointer$6!0#0=src!0@1#4 + (signed long int)j!0@12#0 IN
__CPROVER_POINTER_OBJECT(derefd_pointer$6!0#0) == __CPROVER_POINTER_OBJECT(&dynamic_object) ?
(char)dynamic_object#3[(signed long int)__CPROVER_POINTER_OFFSET(derefd_pointer$6!0#0)] :
/// HERE ///
invalid_object$7!0#0
////
}) ? TRUE : FALSE)
// 78 file main-manual.c line 71 function main with invalid init
(167) a2!0@1#2 == ((forall {
__CPROVER_size_t j!0@12#0;
j!0@12#0 >= i!0@1#4 ||
LET derefd_pointer$5!0#0=dst!0@1#4 + (signed long int)j!0@12#0 IN
__CPROVER_POINTER_OBJECT(derefd_pointer$5!0#0) == __CPROVER_POINTER_OBJECT(&dynamic_object$0) ?
(char)dynamic_object$0#5[(signed long int)__CPROVER_POINTER_OFFSET(derefd_pointer$5!0#0)] :
//// HERE /////
(__CPROVER_POINTER_OBJECT(derefd_pointer$5!0#0) == __CPROVER_POINTER_OBJECT(&invalid$object) ?
byte_extract_little_endian(invalid$object#2, __CPROVER_POINTER_OFFSET(derefd_pointer$5!0#0), char) :
invalid_object$6!0#0)
///////////////
==
LET derefd_pointer$6!0#0=src!0@1#4 + (signed long int)j!0@12#0 IN
__CPROVER_POINTER_OBJECT(derefd_pointer$6!0#0) == __CPROVER_POINTER_OBJECT(&dynamic_object) ?
(char)dynamic_object#3[(signed long int)__CPROVER_POINTER_OFFSET(derefd_pointer$6!0#0)] :
/// HERE ///
(__CPROVER_POINTER_OBJECT(derefd_pointer$6!0#0) == __CPROVER_POINTER_OBJECT(&invalid$object) ?
byte_extract_little_endian(invalid$object#2, __CPROVER_POINTER_OFFSET(derefd_pointer$6!0#0), char) :
invalid_object$7!0#0)
//////
}) ? TRUE : FALSE) for instance instead of getting (__CPROVER_POINTER_OBJECT(derefd_pointer$6!0#0) == __CPROVER_POINTER_OBJECT(&invalid$object) ?
byte_extract_little_endian(invalid$object#2, __CPROVER_POINTER_OFFSET(derefd_pointer$6!0#0), char) :
invalid_object$7!0#0) It seems like z3 does not really realize that |
b853e05
to
540f3b8
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #8562 +/- ##
===========================================
- Coverage 78.95% 78.45% -0.51%
===========================================
Files 1730 1732 +2
Lines 198759 200076 +1317
Branches 18314 18407 +93
===========================================
+ Hits 156938 156974 +36
- Misses 41821 43102 +1281 ☔ View full report in Codecov by Sentry. |
c22a416
to
958e5ab
Compare
converting to draft, more changes coming |
The name |
Operates in two modes, configurable via new CLI switch `--dfcc-simple-invalid-pointer-model`
…g invalid pointers.
e98ffb3
to
8fc0f1d
Compare
assume
contextsassume
contexts
Ready for review, I decomposed the PR into basic commits that can be reviewed 1 by 1. |
Many thanks for the detailed analysis and work on this issue. |
Meant to replace `p == q in contract clauses, works as a hook to override equality behaviour in assume contexts to make it constructive.
Must be done in : - requires clause expression - ensures clause expression - user-defined pointer predicates
ea9ede3
to
d0642ee
Compare
Thanks for finding and reporting the problem |
The test times out with old versions ofz3 so we avoid running it while waiting for bitwuzla in CI
One action vs-2022 action is still failing due to error 429 https://github.com/diffblue/cbmc/actions/runs/12834710600/job/35803987071?pr=8562
|
The problem
The soundness problem was reported in issue #8560.
When used in an assumption context (i.e. requires clause in contract checking mode, or ensures clause in contract replacement mode), the predicate always succeeds and allocates a fresh object, which can mask other cases of a disjunction.
This function takes a pointer that can be either valid, or invalid:
The precondition uses a disjunction to specify both cases:
is_fresh(p,size)
always succeeds in the left disjunct and short-circuits the right disjunct.assert(false)
in the body of foo is unreachable, whereas it would be reachable in a real execution where the left disjunct does not hold.Summary of the fixes
The fix restores soundness by ensuring that both true and false outcomes are always possible for all basic memory predicates and their combinations under short cutting operators ==>, && , ||.
We also introduce a new predicate
__CPROVER_pointer_equals(p, q)
as a replacement forp == q
for pointer equality.It serves as a hook to let us override the behaviour of pointer equality in assume contexts to make it constructive (ie. assign right hand side to left hand side).
With each basic predicate, the
false
outcome yields an invalid pointer with an empty value set and a nondeterministic bit pattern. This ensures that:Last, when replacing a function call by its contract abstraction, instead of making pointers fully nondeterministic like we used to do, we now make them invalid. We rely on the ensures clause evaluated in "assume" mode, to construct a null or valid pointer satisfying the post condition.
This helps solving a major performance degradation due to an explosion of the value set with the built-in pointer havoc of CBMC (see next section).
Performance impact of the fix
Reintroducing nondeterminism impacts performance negatively in some cases (specially with quantifiers).
The invalid pointers introduced by
false
exit path for pointer predicates enter the value set of the pointers, and remain there even if thetrue
exit path is forced via an assume (the assumption enters the path condition but does not prune the pointers value sets, which is what CBMC uses to resolve dereference operations).This results in pointer dereference operations being distributed over both valid and invalid objects, even if the invalid objects are not reachable under the local path condition. This yields very large case split expressions. In addition, since pointer offsets for invalid pointers are non-deterministic, the expressions may also contain a large number (from dozens to hundreds) of sdiv/smod/srem/byte_extract/byte_concat operators to handle misalinged accesses that overlap different array cells, struct members, padding bytes, etc.
As a result
z3
is not able to solve these problems, specially when quantifiers are involved.To mitigate the performance regression with z3, I added a switch
--dfcc-simple-invalid-pointer-model
that eliminates offset non-determinism for invalid pointers generated in failure paths of pointer predicates.When the switch is activated, an invalid pointer is nondeterministically NULL or pointing to a unique dummy object of size 0 that is declared dead/deallocated. This mode is tagged as "unsound" in the CLI documentation and MAN pages due to the reduced non-determinism. Still, I think there is enough non-determinism to ensure that any attempt to use such a pointer by the user-program would result in an analysis failure.
The good news is, with this switch activated, z3 > v4.12 seems to be able to solve problems involving quantifiers more easily, and
bitwuzla
>v0.6.0 appears able to solve these problems without using--dfcc-simple-invalid-pointer-model
.Since our CI runs with old versions of
z3
(as old as v4.8.7 from 2019), and does not runbitwuzla
yet, I turned 1CORE
testcontracts-dfcc/quantifiers-loops-fresh-bound-vars-smt/test.desc
intoFUTURE
, while waiting for better performance fixes (see last commit of the PR).Why did this go unnoticed ?
Process improvements
Besides design and code reviews, I think the problem could have be found if we had looked at the path coverage achieved by our test suite on the instrumentation code found in
cprover_contracts.c
. This code is only symbolically executed when analyzing a model, and at the moment we don't collect reachability/coverage metrics for that type of code under regression testing. If we had instrumented branch condition coverage we could have realized early on that no tests covered the false branch in our __CPROVER_is_fresh predicate implementation.