Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minimize unnecessary casts and check for overflows in witness invariants #1665

Open
wants to merge 11 commits into
base: apron-invariant-simplify
Choose a base branch
from

Conversation

karoliineh
Copy link
Member

@karoliineh karoliineh commented Jan 27, 2025

Removing unnecessary casts

  • Does not add the LL suffix for constant values that fit the int range: 1LL1
  • If a variable does not overflow the int range, it is not cast: (long long )hh
  • If an arithmetic operation does not overflow the int range, its subexpressions are not cast: (long long )j + 1LL j + 1

Dealing with overflows

  • If an arithmetic operation would overflow the int range but does not overflow in long long, only one of the variables or constants is cast: ((long long )k + (long long )d) + 2147483647LL((long long )k + d) + 2147483647
  • If an expression would overflow the long long range, it is discarded with a warning

TODO

  • n != (unsigned char)0 → n != 0
  • (unsigned long )arg == 0UL → arg == 0
  • Make sure that unsigned types are also checked for overflows within invariants
  • Fix issue revealed by Add abstract interface to Invariant #1668 before merging this.

I also observed that using the relational domain for overflow checks would be beneficial for eliminating some of the unnecessary casts. I.e. in 36-apron/52-queuesize.c, we can successfully verify that assertion __goblint_check(used + free == capacity) succeeds. However, as the overflow query only queries IntDomain values, the addition used+free will overflow the int range and is thus cast in the invariant. Furthermore, making the overflow queries use the information from relationAnalysis could also increase our precision. However, this should be addressed in a separate PR.

@karoliineh karoliineh self-assigned this Jan 27, 2025
@karoliineh karoliineh added sv-comp SV-COMP (analyses, results), witnesses relational Relational analyses (Apron, affeq, lin2var) explainability labels Jan 27, 2025
@karoliineh karoliineh added this to the SV-COMP 2026 milestone Jan 27, 2025
@michael-schwarz
Copy link
Member

michael-schwarz commented Jan 28, 2025

Nice, I think it's good to have better readable assertions! However, I'm a bit worried about one of these aspects:

If an arithmetic operation would overflow the int range but does not overflow in long long, only one of the variables or constants is cast.

This breaks invariants about the CIL AST, where all arguments to BinOp (except some Pointer + Integer ones) are supposed to have the same type as the resulting type. I don't think creating such illegal expressions is a good idea, especially given that @DrMichaelPetter's inter-domain refinement and similar features will likely also rely on the invariant feature, which also makes sense architecturally.

Could such logic not be moved to the pretty-printer where we know that the CIL expressions will not be fed to anything in Goblint before going through CIL parsing again? It seems printing one cast in general is enough to force the effective type to what we want it to be, regardless of overflow?

@sim642
Copy link
Member

sim642 commented Jan 28, 2025

This concern is already handled by constructing two exps in parallel: one without all the casts for the witness and another one with all the necessary implicit casts that's used for Goblint-internal things (here for the overflow queries).

It seems printing one cast in general is enough to force the effective type to what we want it to be, regardless of overflow?

That's not so simple in general. All the variables in an invariant may have different integer types. Sure, when cast to long long, everything else is probably bound to be integer promoted and arithmetic converted also to long long. But when trying to minimize casts, you can get many different intermediate types, again due to integer promotion and arithmetic conversion.

I don't think this could be done while pretty-printing because this relies on overflow queries. The secondary goal is to never return invariant expressions that we cannot guarantee overflow-free when going from mathematical integers from Apron to C.
This is probably also desirable for the inter-domain refinement expressions.

@michael-schwarz
Copy link
Member

Where is this handling? Currently it looks to be like we have modified the function cil_exp_of_lincons1 to yield invalid expressions that CIL would immediately flag as invalid if one were to call GoblintCil.Check.checkStandaloneExp. We should not abuse CIl types to represent illegal things --- this will just cause problems.

If you want such things I suggest stringifying them immediately so the invariant about Cil expressions is violated only locally in the function they originate from.

I don't think this could be done while pretty-printing because this relies on overflow queries.

Why? Either no cast is necessary and then the expression with all casts should also not contain any casts (because everything already is of the right type), or at least one of the arguments needs to be cast because the usual arithmetic promotions are not enough. In this case, CIL has both casts and your pretty printer could forgo the cast in one of the branches? What am I missing?

@karoliineh
Copy link
Member Author

karoliineh commented Jan 28, 2025

Either no cast is necessary and then the expression with all casts should also not contain any casts (because everything already is of the right type)

Casts are also unnecessary for invariants, where char-type variables are used in an arithmetic expression; however, for CIL, they are cast to int, and thus, they would not be the right shape for both. The problem is not only with casting to long, but also casting char types to int.

@sim642
Copy link
Member

sim642 commented Jan 28, 2025

Where is this handling? Currently it looks to be like we have modified the function cil_exp_of_lincons1 to yield invalid expressions that CIL would immediately flag as invalid if one were to call GoblintCil.Check.checkStandaloneExp. We should not abuse CIl types to represent illegal things --- this will just cause problems.

CIL expressions without implicit casts turned explicit are not invalid! CIL has the option insertImplicitCasts which we choose to enable, but CIL doesn't care and works fine without them as well. So from CIL's perspective, they're perfectly valid and would pass the check.

If you want such things I suggest stringifying them immediately so the invariant about Cil expressions is violated only locally in the function they originate from.

Of course they could cause problems if used in unintended contexts, but that's not the case. These CIL expressions (with or without implicit casts) are passed to Invariant.of_exp, where Invariant is a module for

(** Invariants for witnesses. *)

Witness invariants are intentionally wrapped into the Invariant.t type and cannot be accidentally used as normal CIL expressions.
So the cast-less CIL expressions do only exist in that local scope of the conversion and the rest of Goblint just sees an Invariant.t.

Right now there isn't invariant.mli that fully abstracts it (so it's possible to pattern match the expression out), but that could/should be added to prevent such misuse outright.

For non-witness use of such domains converted to expressions, there should be a separate query anyway (Queries.Invariant returns Invariant.t). Since the implementation currently constructs both copies of the expression, it's trivial to use one for the existing query for witnesses and the other for a separate query for refining expressions. It's likely that the two use cases might benefit from different expressions anyway (based on which domain is refining which). Meanwhile witness invariants are desirable to optimize differently, e.g. restrict to widened variables (#1219).

@karoliineh karoliineh marked this pull request as ready for review January 30, 2025 13:40
@sim642 sim642 mentioned this pull request Jan 31, 2025
2 tasks
@sim642
Copy link
Member

sim642 commented Jan 31, 2025

Right now there isn't invariant.mli that fully abstracts it (so it's possible to pattern match the expression out), but that could/should be added to prevent such misuse outright.

I now tried that in #1668 and that revealed a problem with precondition loop invariants instead.
The refinement stuff in #1635 already uses separate expression generation from invariant anyway and would not even be affected by this change.

@sim642 sim642 added the pr-dependency Depends or builds on another PR, which should be merged before label Jan 31, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
explainability pr-dependency Depends or builds on another PR, which should be merged before relational Relational analyses (Apron, affeq, lin2var) sv-comp SV-COMP (analyses, results), witnesses
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants