-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Question about checking linearizability of a history with a custom model #26
Comments
Having two definitions for createReentrantMutex looks a little unusual to me--I think normally this would be (defn createReentrantMutex ([] ...) ([arg1 arg2 arg3] ...)), but if it compiles and you're not getting arity exceptions, maybe it works! This miiiight be due to an infinite state space interacting poorly with model memoization code that runs before the analysis begins; we normally test with finite, compact spaces, and this one is countably infinite, so that miiiight be what's going on. Can you get a thread dump of the process using jstack or similar? |
Thanks for the response. I know it is not idiomatic clojure :) I am a clojure newbie. The code above works fine. The very first model objects are created with the Btw, I also run this test for 10 secs, only with a few acquire and releases, but still no luck. I will get back to you with a thread dump. |
Here is a thread dump for another run of the same test. |
Yep, you're being bit by the memorization phase, which is designed for finite state spaces--but you've got an infinite state space with this model. You're in luck, I actually have some time today for the first time in months, and I know how to fix this, so I'll take a shot at it later on :)On Aug 17, 2018 05:31, Ensar Basri Kahveci <[email protected]> wrote:Here is a thread dump for another run of the same test.
dump2.txt
—You are receiving this because you commented.Reply to this email directly, view it on GitHub, or mute the thread.
|
that would be very helpful. thanks a lot. looking forward to get the fix. |
I have been cleaning up my code to give you a reproducer. You can run the test via https://github.com/metanet/jepsen/tree/custom-model if you like. |
Right, I've released 0.3.3-SNAPSHOT which aborts memoization for large state spaces. #25, you may also find this of use! Would y'all like to give that a shot and see if it helps? |
thanks a lot. I will re-run it and ping you back. |
I updated the knossos dependency of jepsen to 0.3.4-SNAPSHOT, re-installed jepsen, and run the test, but still encountering the same issue :( |
ok, I have some good news. your explanation about the search space helped me to simplify my model a little bit. My goal is to test reentrancy, such that a client acquires the lock 2 times, and releases it. However, my model turned out to be too loose for this test. So I made the following simple change to reduce the search space and now the analysis step terminates successfully. Thanks for your help.
|
Hi there,
I am not sure if I should open this issue to this repo or the jepsen repo. I am giving a shot here and can port the issue to the jepsen repo if requested.
I have a very simple jepsen test taking 20 seconds and injecting no faults in the environment. It is a mutex test for now, but I will use it for testing re-entrant mutex in future once I make this version work.
In my test, i have 5 clients. Each client is acquiring and releasing the same distributed lock object repeatedly.
Here is my generator:
Here is my custom model:
My model is the
ReentrantMutex
defrecord. Initially, it isReentrantMutex(nil, 0)
, means nobody holding the lock. For instance, ifn1
acquires the lock, it will beReentrantMutex("n1", 1)
.When I run this test for 20 seconds without introducing any failure in the system, I get a history of operations.
I am sharing my history files and the timeline file here.
history.txt
history.edn.txt
timeline.html.txt
Sorry for the file extensions. Please rename "history.edn.txt" to "history.edn" and "timeline.html.txt" to "timeline.html.
In short, the history of operations is this one:
N3_ACQ => N3_REL => N2_ACQ => N2_REL => N4_ACQ => N4_REL => N1_ACQ => N1_REL => N5_ACQ => N5_REL => N3_ACQ => N3_REL => N2_ACQ => N2_REL => N4_ACQ => N4_REL => N1_ACQ => N1_REL => N5_ACQ => N5_REL => N3_ACQ => N3_REL => N2_ACQ => N2_REL => N4_ACQ
So the lock is held by
N3, N2, N4, N1, N5, N3, N2, N4, N1, N5, N3, N2, N4
in order. You can verify this by checking the history files and the timeline file.The thing is, when the analyzer starts running with my model, it never terminates (when info logs above are commented out). I cannot say that I fully understand how it works internally but I see that it is walking through possible histories based on concurrency of operations, but each one failing with one of the
knossos.model/inconsistent
returns in my model. Moreover, as I run it with logs printed, I see states likeReentrantMutex("n2", 555)
. Probably, I am misunderstanding something, but I don't expect to see such states because acquire and release operations of the same client are sequential. Since I have a short history, I expect the analyzer to complete the verification step in short time. Therefore, I suspect I am doing something wrong in my model, or using the tool in some wrong way.I have tried many things, but couldn't manage to make this model work. Could you help me with this issue? Any hint or a pointer would be very helpful.
Thanks in advance.
The text was updated successfully, but these errors were encountered: