Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CONTRACTS: allow pointer predicates to fail in assume contexts #8562

Open
wants to merge 14 commits into
base: develop
Choose a base branch
from

Conversation

remi-delmas-3000
Copy link
Collaborator

@remi-delmas-3000 remi-delmas-3000 commented Jan 10, 2025

The problem

The soundness problem was reported in issue #8560.

When used in an assumption context (i.e. requires clause in contract checking mode, or ensures clause in contract replacement mode), the predicate always succeeds and allocates a fresh object, which can mask other cases of a disjunction.

This function takes a pointer that can be either valid, or invalid:

void foo(char *p)
requires(__is_fresh(p, size) || true)
assigns(p!=null: __object_from(p))
{
   if (!p) {
     assert(false); // unreachable
   }
}

The precondition uses a disjunction to specify both cases:

  • The predicate is_fresh(p,size) always succeeds in the left disjunct and short-circuits the right disjunct.
  • The assert(false) in the body of foo is unreachable, whereas it would be reachable in a real execution where the left disjunct does not hold.
  • This is a soundness issue because the set of states described by the precondition was (unintentionally) under-approximated.

Summary of the fixes

The fix restores soundness by ensuring that both true and false outcomes are always possible for all basic memory predicates and their combinations under short cutting operators ==>, && , ||.

We also introduce a new predicate __CPROVER_pointer_equals(p, q) as a replacement for p == q for pointer equality.
It serves as a hook to let us override the behaviour of pointer equality in assume contexts to make it constructive (ie. assign right hand side to left hand side).

With each basic predicate, the false outcome yields an invalid pointer with an empty value set and a nondeterministic bit pattern. This ensures that:

  • In outcomes where some pointer predicates are positively assumed, the target pointers are by construction either null or valid pointers.
  • when no pointer predicate is positively enforced via an requires clause under --enforce-contract (or via an ensures clause under --replace-call-with-contract, downstream GOTO instructions cannot use the pointer (checking an assertion, dereferencing, performing offset arithmetic) without triggering a verification error.

Last, when replacing a function call by its contract abstraction, instead of making pointers fully nondeterministic like we used to do, we now make them invalid. We rely on the ensures clause evaluated in "assume" mode, to construct a null or valid pointer satisfying the post condition.

This helps solving a major performance degradation due to an explosion of the value set with the built-in pointer havoc of CBMC (see next section).

Performance impact of the fix

Reintroducing nondeterminism impacts performance negatively in some cases (specially with quantifiers).

The invalid pointers introduced by false exit path for pointer predicates enter the value set of the pointers, and remain there even if the true exit path is forced via an assume (the assumption enters the path condition but does not prune the pointers value sets, which is what CBMC uses to resolve dereference operations).

This results in pointer dereference operations being distributed over both valid and invalid objects, even if the invalid objects are not reachable under the local path condition. This yields very large case split expressions. In addition, since pointer offsets for invalid pointers are non-deterministic, the expressions may also contain a large number (from dozens to hundreds) of sdiv/smod/srem/byte_extract/byte_concat operators to handle misalinged accesses that overlap different array cells, struct members, padding bytes, etc.

As a result z3 is not able to solve these problems, specially when quantifiers are involved.

To mitigate the performance regression with z3, I added a switch --dfcc-simple-invalid-pointer-model that eliminates offset non-determinism for invalid pointers generated in failure paths of pointer predicates.
When the switch is activated, an invalid pointer is nondeterministically NULL or pointing to a unique dummy object of size 0 that is declared dead/deallocated. This mode is tagged as "unsound" in the CLI documentation and MAN pages due to the reduced non-determinism. Still, I think there is enough non-determinism to ensure that any attempt to use such a pointer by the user-program would result in an analysis failure.

The good news is, with this switch activated, z3 > v4.12 seems to be able to solve problems involving quantifiers more easily, and bitwuzla >v0.6.0 appears able to solve these problems without using --dfcc-simple-invalid-pointer-model.

Since our CI runs with old versions of z3 (as old as v4.8.7 from 2019), and does not run bitwuzla yet, I turned 1 CORE test contracts-dfcc/quantifiers-loops-fresh-bound-vars-smt/test.desc into FUTURE, while waiting for better performance fixes (see last commit of the PR).

Why did this go unnoticed ?

  • In the initial implementation of is_fresh both true and false cases were reachable in assume contexts (when malloc failure modes were active);
  • Our contracts-dfcc regression test suite did not contain tests for disjunctions or the “return false” case.
  • We did not notice our test suite was missing this behaviour, because:
    • The predicate is implemented in the CPROVER library, as C code
    • The CPROVER library is only symbolically executed during proofs
    • We don’t collect symbolic reachability/coverage metrics for the CPROVER library
    • hence, we could not detect that parts of the predicate implementation (the path that returns false) were not exercised at all by the contracts-dfcc regression suite.
  • Up to this point, the predicate behaviour was still sound, but we had an incomplete contracts-dfcc test suite
  • Later, the predicate implementation was modified to encode “allocation size too large" failure modes
    • 3a1a5ac"
    • This commit inadvertently removed the code path that returns false in assume contexts
    • The bug was not caught by regression suite (for reasons mentioned above).

Process improvements

Besides design and code reviews, I think the problem could have be found if we had looked at the path coverage achieved by our test suite on the instrumentation code found in cprover_contracts.c. This code is only symbolically executed when analyzing a model, and at the moment we don't collect reachability/coverage metrics for that type of code under regression testing. If we had instrumented branch condition coverage we could have realized early on that no tests covered the false branch in our __CPROVER_is_fresh predicate implementation.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@remi-delmas-3000
Copy link
Collaborator Author

@tautschnig @qinheping I added more regression tests to cover contract replacement mode and the pointer_in_range_dfcc predicate. This one still worked as expected.

I think one of the limitations of our regression testing process that allowed this bug to go unnoticed for so long is that we do not collect coverage metrics for the CPROVER library code. The code we add in the CPROVER library is only symbolically executed by CBMC when analyzing a GOTO model and we do not collect rechability/coverage metrics for that type of instrumentation code.

@remi-delmas-3000
Copy link
Collaborator Author

The non-determinism causes a big performance regression on memory-predicates-user-defined-ensures-replace/main.c, memory blowup (100Gb+).

@remi-delmas-3000
Copy link
Collaborator Author

remi-delmas-3000 commented Jan 12, 2025

quantifiers-loops-fresh-bound-vars-smt/test.desc is also surprisingly long:

I've narrowed the performance regression to this.

We start from this function contract and loop contract for a memcpy function :

void foo(char *dst, const char *src, size_t n)
  // clang-format off
  __CPROVER_requires(__CPROVER_is_fresh(src, n))
  __CPROVER_requires(__CPROVER_is_fresh(dst, n))
  __CPROVER_assigns(__CPROVER_object_from(dst))
  __CPROVER_ensures(__CPROVER_forall {
    size_t j;
    j < n ==> dst[j] == src[j]
  })
// clang-format on
{
  for(size_t i = 0; i < n; i++)
    // clang-format off
    __CPROVER_assigns(i, __CPROVER_object_from(dst))
    __CPROVER_loop_invariant(i <= n)
    __CPROVER_loop_invariant(
      __CPROVER_forall { size_t j; j < i ==> dst[j] == src[j] })
    // clang-format on
    {
      dst[i] = src[i];
    }
}

With the changes introduced in this PR assume(__CPROVER_is_fresh(src, n)) is encoded as:

  char *src;
  bool allocate_src = nondet_bool();
  if(allocate_src)
  {
    src = malloc(n);
    __CPROVER_assume(src);
  }
  __CPROVER_assume(allocate_src);
  // value_set(src) = {INVALID, &some_object} with invalid unreachable

Whereas before it would always succeed and be encoded as

  char *src;
  bool allocate_src = true;
  if(allocate_src)
  {
    src = malloc(n);
    __CPROVER_assume(src);
  }
  __CPROVER_assume(allocate_src);
  // value_set(src) = {&some_object}

With nondeterminism + assumption the runtime blows up, with the deterministic allocation, the model is problem is solved instantly.

I think the problem might be that in the non-deterministic case, the value set for src and dst pointers contains both an invalid pointer and a valid pointer, and the invalid pointer does not get pruned away when enforcing the assumptions
__CPROVER_assume(allocate_src) and __CPROVER_assume(allocate_dst), so that whenever the program attempts to load/store to src/dst the case where the pointers are invalid also get symbolically executed, and the SMT solver has to learn a lot of conflict clauses to prune these cases away while solving (instead of statically during symex).

Interestingly, if I default-initialize src and dst to NULL and then perform nondet-allocation + assume. In that case the value set for the pointers src and dst contain NULL or a valid pointer and the problem gets solved in a few seconds too.

  char *src = NULL;
  bool allocate_src = true;
  if(allocate_src)
  {
    src = malloc(n);
    __CPROVER_assume(src);
  }
  __CPROVER_assume(allocate_src);
// value_set(src) = {NULL, &some_object} with NULL unreachable

@tautschnig is there any way we could prune value sets for pointers based on assumptions ? (I doubt it because branch conditions are simply forgotten when merging value sets but you may have ideas ?)

Here is the model obtained after applying function and loop contracts is the following:

#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
bool nondet_bool();
int main()
{
  size_t n;
  __CPROVER_assume(n > 1);
  __CPROVER_assume(n < __CPROVER_max_malloc_size);
  char *src;
  char *dst;

#if NULL_INIT
  src = NULL;
  dst = NULL;
#endif

  // encoding of assume(__CPROVER_is_fresh(src, n))
#if NONDET_ALLOC
  // nondet allocation followed by assumptions
  bool allocate_src = nondet_bool();
#else
  bool allocate_src = true;
#endif

  if(allocate_src)
  {
    src = malloc(n);
    __CPROVER_assume(src);
  }
  __CPROVER_assume(allocate_src);

  // encoding of assume(__CPROVER_is_fresh(dst, n))
#if NONDET_ALLOC
  bool allocate_dst = nondet_bool();
#else
  bool allocate_dst = true;
#endif
  if(allocate_dst)
  {
    dst = malloc(n);
    __CPROVER_assume(dst);
  }
  __CPROVER_assume(allocate_dst);

  // encoding of inductive check for loop contract
  // loop counter
  size_t i = 0;

  // assert loop invariant base case
  assert(i <= n);
  assert(__CPROVER_forall {
    size_t j;
    j<i ==> dst[j] == src[j]
  });

  // havoc lop step
  __CPROVER_havoc_object(&i);
  __CPROVER_havoc_object(dst);

  // assume loop inv
  __CPROVER_assume(i <= n);
  __CPROVER_assume(__CPROVER_forall {
    size_t j;
    j<i ==> dst[j] == src[j]
  });

  // loop step
  if(i < n)
  {
    dst[i] = src[i];
    i++;
    // assert loop invariant step
    assert(i <= n);
    assert(__CPROVER_forall {
      size_t j;
      j<i ==> dst[j] == src[j]
    });
    __CPROVER_assume(false);
  }

  // assert post condition : dst is equal to src
  assert(__CPROVER_forall {
    size_t j;
    j<n ==> dst[j] == src[j]
  });
  return 0;
}

@remi-delmas-3000 remi-delmas-3000 force-pushed the contracts-allow-is-fresh-to-fail branch from f1c90c7 to 59f706c Compare January 13, 2025 22:40
@remi-delmas-3000
Copy link
Collaborator Author

remi-delmas-3000 commented Jan 13, 2025

UPDATE:
I confirmed today that the peformance regressions were due to the interplay between pointer havocing and pointer predicates from contracts.

When we enforce a requires clause or replace an ensures clause that assumes some predicate holds on a pointer, we use the following code pattern:

 // havoc the pointer
ptr = nondet_ptr();
// constructs a nondeterministic state 
bool cond = __CPROVER_contracts_is_fresh(&ptr, size, write_set); 
// forces the success path
assume(cond);

The havoc operation adds all pointers known to symex to the value set of the pointer (heap allocated objects, address-taken local objects, etc.). This can make the value set extremely large.
The is_fresh predicate nondeterministically assigns ptr to a fresh object in the success case, but leaves it untouched in the failure case so that the value set remains very large.
The assumption semantically forces the success case for the predicate, but the value set after the assumption remains the same. When the pointer is later dereferenced, the dereference is distributed over the whole value set and it blows up.

I don't think there's an easy way to prune the write set using the assumption without invoking a SAT solver, since symex does not statically track the relation bewtween branch conditions and value set contents.

The only thing I've tried which seems to mitigate the issue is to make pointers INVALID (i.e.: empty value set, nondet bit pattern) when pointer predicates evaluate to false in an assumption context.

The resulting pointers cannot be dereferenced and equality with other pointers can always be refuted. This means that no useful pointer can be gained from assuming that a pointer predicate is false.

@qinheping @tautschnig would that restore both performance and preserve soudness ?

If a contract has a bad precondition where in some case no pointer predicate holds for some pointer, the function would not be able to use said pointer without triggering INVALID pointer use error.

@remi-delmas-3000
Copy link
Collaborator Author

remi-delmas-3000 commented Jan 14, 2025

UPDATE:
The cause for performance regression on quantifiers-loops-fresh-bound-vars-smt/test.desc when using invalid pointers to initialise in case is_fresh is false is due to these differences

// 78 file main-manual.c line 71 function main with NULL init
(165) a2!0@1#2 == ((forall {
  __CPROVER_size_t j!0@12#0;
    j!0@12#0 >= i!0@1#4 ||
      LET derefd_pointer$5!0#0=dst!0@1#4 + (signed long int)j!0@12#0 IN
        __CPROVER_POINTER_OBJECT(derefd_pointer$5!0#0) == __CPROVER_POINTER_OBJECT(&dynamic_object$0) ?
          (char)dynamic_object$0#5[(signed long int)__CPROVER_POINTER_OFFSET(derefd_pointer$5!0#0)] :
          //// HERE /////
          invalid_object$6!0#0
          ///////////////
          ==
        LET derefd_pointer$6!0#0=src!0@1#4 + (signed long int)j!0@12#0 IN
          __CPROVER_POINTER_OBJECT(derefd_pointer$6!0#0) == __CPROVER_POINTER_OBJECT(&dynamic_object) ?
          (char)dynamic_object#3[(signed long int)__CPROVER_POINTER_OFFSET(derefd_pointer$6!0#0)] :
          /// HERE ///
          invalid_object$7!0#0
          ////
    }) ? TRUE : FALSE)


// 78 file main-manual.c line 71 function main with invalid init
(167) a2!0@1#2 == ((forall {
  __CPROVER_size_t j!0@12#0;
    j!0@12#0 >= i!0@1#4 ||
      LET derefd_pointer$5!0#0=dst!0@1#4 + (signed long int)j!0@12#0 IN
        __CPROVER_POINTER_OBJECT(derefd_pointer$5!0#0) == __CPROVER_POINTER_OBJECT(&dynamic_object$0) ?
          (char)dynamic_object$0#5[(signed long int)__CPROVER_POINTER_OFFSET(derefd_pointer$5!0#0)] :
          //// HERE /////
          (__CPROVER_POINTER_OBJECT(derefd_pointer$5!0#0) == __CPROVER_POINTER_OBJECT(&invalid$object) ?
            byte_extract_little_endian(invalid$object#2, __CPROVER_POINTER_OFFSET(derefd_pointer$5!0#0), char) :
            invalid_object$6!0#0)
          ///////////////
        ==
      LET derefd_pointer$6!0#0=src!0@1#4 + (signed long int)j!0@12#0 IN
        __CPROVER_POINTER_OBJECT(derefd_pointer$6!0#0) == __CPROVER_POINTER_OBJECT(&dynamic_object) ?
          (char)dynamic_object#3[(signed long int)__CPROVER_POINTER_OFFSET(derefd_pointer$6!0#0)] :
          /// HERE ///
          (__CPROVER_POINTER_OBJECT(derefd_pointer$6!0#0) == __CPROVER_POINTER_OBJECT(&invalid$object) ?
            byte_extract_little_endian(invalid$object#2, __CPROVER_POINTER_OFFSET(derefd_pointer$6!0#0), char) :
            invalid_object$7!0#0) 
            //////
      }) ? TRUE : FALSE)

for instance instead of getting invalid_object$7!0#0 when dereferencing a fresh pointer we get the more complex expression:

 (__CPROVER_POINTER_OBJECT(derefd_pointer$6!0#0) == __CPROVER_POINTER_OBJECT(&invalid$object) ?
byte_extract_little_endian(invalid$object#2, __CPROVER_POINTER_OFFSET(derefd_pointer$6!0#0), char) :
invalid_object$7!0#0)

It seems like z3 does not really realize that
(__CPROVER_POINTER_OBJECT(derefd_pointer$6!0#0) == __CPROVER_POINTER_OBJECT(&invalid$object) simplifies to false under the assumption that __CPROVER_contracts_is_fresh succeeds, and blows up trying to solve that quantified formula.

@remi-delmas-3000 remi-delmas-3000 force-pushed the contracts-allow-is-fresh-to-fail branch from b853e05 to 540f3b8 Compare January 14, 2025 17:10
Copy link

codecov bot commented Jan 14, 2025

Codecov Report

Attention: Patch coverage is 50.45045% with 55 lines in your changes missing coverage. Please review.

Project coverage is 78.45%. Comparing base (97c8624) to head (0b1231c).

Files with missing lines Patch % Lines
...t/contracts/dynamic-frames/dfcc_pointer_equals.cpp 43.47% 26 Missing ⚠️
...t/contracts/dynamic-frames/dfcc_spec_functions.cpp 60.71% 11 Missing ⚠️
src/util/config.cpp 44.44% 5 Missing ⚠️
src/ansi-c/c_typecheck_expr.cpp 55.55% 4 Missing ⚠️
src/ansi-c/cprover_library.cpp 40.00% 3 Missing ⚠️
...cts/dynamic-frames/dfcc_lift_memory_predicates.cpp 57.14% 3 Missing ⚠️
...ument/contracts/dynamic-frames/dfcc_instrument.cpp 66.66% 1 Missing ⚠️
...ent/contracts/dynamic-frames/dfcc_spec_functions.h 0.00% 1 Missing ⚠️
.../contracts/dynamic-frames/dfcc_wrapper_program.cpp 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8562      +/-   ##
===========================================
- Coverage    78.95%   78.45%   -0.51%     
===========================================
  Files         1730     1732       +2     
  Lines       198759   200076    +1317     
  Branches     18314    18407      +93     
===========================================
+ Hits        156938   156974      +36     
- Misses       41821    43102    +1281     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@remi-delmas-3000 remi-delmas-3000 force-pushed the contracts-allow-is-fresh-to-fail branch 2 times, most recently from c22a416 to 958e5ab Compare January 14, 2025 23:37
@remi-delmas-3000 remi-delmas-3000 marked this pull request as draft January 16, 2025 22:28
@remi-delmas-3000
Copy link
Collaborator Author

converting to draft, more changes coming

@kroening
Copy link
Member

The name pointer_equals and the proposed semantics seem too far apart for me.

@remi-delmas-3000 remi-delmas-3000 force-pushed the contracts-allow-is-fresh-to-fail branch from e98ffb3 to 8fc0f1d Compare January 17, 2025 08:01
@remi-delmas-3000 remi-delmas-3000 changed the title CONTRACTS: allow is_fresh to fail in assume contexts CONTRACTS: allow pointer predicates to fail in assume contexts Jan 17, 2025
@remi-delmas-3000 remi-delmas-3000 marked this pull request as ready for review January 17, 2025 08:22
@remi-delmas-3000
Copy link
Collaborator Author

Ready for review, I decomposed the PR into basic commits that can be reviewed 1 by 1.

@rod-chapman
Copy link
Collaborator

Many thanks for the detailed analysis and work on this issue.

Remi Delmas added 5 commits January 17, 2025 12:08
Meant to replace `p == q in contract clauses, works as a hook
to override equality behaviour in assume contexts to make it
constructive.
Must be done in :
- requires clause expression
- ensures clause expression
- user-defined pointer predicates
@remi-delmas-3000 remi-delmas-3000 force-pushed the contracts-allow-is-fresh-to-fail branch from ea9ede3 to d0642ee Compare January 17, 2025 17:13
@remi-delmas-3000
Copy link
Collaborator Author

Many thanks for the detailed analysis and work on this issue.

Thanks for finding and reporting the problem

Remi Delmas added 3 commits January 17, 2025 13:17
The test times out with old versions ofz3
so we avoid running it while waiting for
bitwuzla in CI
@remi-delmas-3000
Copy link
Collaborator Author

One action vs-2022 action is still failing due to error 429

https://github.com/diffblue/cbmc/actions/runs/12834710600/job/35803987071?pr=8562

--2025-01-17 23:25:18--  https://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
Resolving git.savannah.gnu.org (git.savannah.gnu.org)... 209.51.188.78
Connecting to git.savannah.gnu.org (git.savannah.gnu.org)|209.51.188.78|:443... connected.
HTTP request sent, awaiting response... 429 Too Many Requests
2025-01-17 23:25:18 ERROR 429: Too Many Requests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants