Bounded-scope generators #3014
konnov
started this conversation in
Ideas + Research
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
Apalache currently supports value generators to restrict the scope of very large data structures.
Why we need value generators
When we want to find bugs in a protocol with a very large state space and plenty of behavior, e.g., Byzantine consensus, we would like to start model checking in a "reasonable" intermediate state (more on that later). Otherwise, both TLC and Apalache get stuck after a exploring the state space/execution space up to 10-20 steps.
The standard answer to this problem is to introduce an inductive invariant, prove its inductiveness and use it to start exploration in an arbitrary state that is described by the inductive invariant. Actually, with an inductive invariant, we need just need to explore one step. For instance, this is how I did it for Ben-Or's consensus recently, see IndInit (for the fixed constants):
Basically, there are two components in
IndInit
:TypeOK
restricts the scope of the state variables to the reasonable values.IndInv
captures the essential relations between the variables, specific to the protocol logic. This predicate is usually quite hard to come up with.This approach technically works both in TLC and Apalache. However, it's easy to see that
TypeOK
contains potentially very large sets on the right-hand side of the membership relation, e.g.:When we increase the cardinality of the sets
ALL
,ROUND
, andVALUES
, the setSUBSET [ src: ALL, r: ROUNDS, v: VALUES ]
grows exponentially faster than its base sets. Sure, it is the worst-case scenario. However, in many algorithms, processes need a very small subset of the messages to make the next step. Intuitively, we do not need an arbitrary subset of all possible messages, but we need an arbitrary subset of all possible messages up to a reasonable bound. Moreover, it's much easier to debug an inductive invariant on small subsets first and then check it on larger sets.The community modules have the operator kSubset to work around this problem. TLC has a custom implementation of
kSubset
. In the above example, we could replaceSUBSET
withkSubset
as follows:What is$k$ ? Well, it can be a small constant that helps us to obtain counterexamples fast. If the model checker does not give us a counterexample, we should increase $k$ , or convince ourselves that the current value of $k$ is sufficient to check our properties. Sometimes, after debugging the inductive invariant, we can even replace
kSubset
withSUBSET
again, when the invariant is tight enough for the model checker to work.While$O(n^2)$ for a set $S$ that potentially has $n$ elements. TLC does not have any issues with that, as it enumerates states, and it can precisely give us the set cardinality in every state.
kSubset
works in TLC, I do not know how to efficiently implement it in Apalache. Technically, we could enumerate all potential k-subsets of the base set and use them as the elements ofkSubset
. However, the definition ofkSubset
uses the set cardinality, and Apalache cannot statically compute the cardinality of an arbitrary set. The current encoding ofCardinality(S)
in Apalache producesTo have something similar to$k$ . For example, if the target type of $k$ elements, and all of its elements have the width up to $k$ . For example, if the set elements are sequences, their length must be bounded with $k$ .
kSubset
in Apalache, we have introduced the operatorApalache!Gen(k)
. This operator works in a very straightforward way. It produces completely unrestricted data structures of width up toGen
isSet(T)
, thenGen(k)
produces a symbolic set that has up toHaving
Apalache!Gen
, we can rewrite the above k-subset predicate as follows:Note that this version is not semantically equivalent to the one with$k$ . If we want a completely equivalent version, which we usually don't, we can add the constraint
kSubset
, asGen
can produce set of cardinalities from 0 toCardinality(A1) = k
.Why
Gen
is not the best solutionApalache!Gen
was introduced as a quick hack to experiment with the idea. While it technically works, there are several issues with it:Gen
is syntactically confusing. Semantically,Gen
acts as the existential quantifier\E A1 \in S: P
, but syntactically, it looks very much like a deterministic computation.Gen
relies on type inference to come up with the right shape of the data structure.Gen
uses the same boundAn alternative solution
Alternatively, we could use a special pattern of
\E x: P
to restrict the scope. We could rewrite the above example as:This version looks much more understandable. We just produce a completely unrestricted constant
A1
and further restrict it with a predicate.Implementation challenges
There are a few things that are not yet clear to me:
A1
in the above example? We would probably need a type annotation.\E A1
? We would need some form of static analysis to go over the predicate and compute the bounds. This problem is somewhat similar to computing the bounds ini..j
[FEATURE] Interval analysis to improve a..b #446.Beta Was this translation helpful? Give feedback.
All reactions