Recheck slow when using choice combined with list #432
Can confirm that when I change the above generator to var chioce = Gen.Choice(new Collection<Gen<int>> { low, mid, big, large }).List(Range.LinearInt32(1,300)).Resize(150); I not only get a far better shrunken example, but the recheck also runs a lot more quickly. So this is really only an issue when the shrunken value is expected to be large. |
In the new line of code, you changed two things. Did you mean to make both changes? I was expecting to see one change. |
Sorry, the 200 was meant to stay the same. It's the lower bound change (100 => 0 is intentional) combined with the Linear range that is important, as it means we don't have such a large list when shrunk. |
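A minimal C# sketch contrasting the two generator lines under discussion, for reference. It assumes the Hedgehog C#/LINQ facade; `low`, `mid`, `big`, and `large` are placeholders for the int generators in the original repro, whose definitions are not shown in this thread, and the namespaces are assumptions rather than something quoted from the issue.

```csharp
// Sketch only -- not the original repro code.
using System.Collections.ObjectModel;
using Hedgehog;        // assumed namespace for Gen<T>
using Hedgehog.Linq;   // assumed namespace for the C# facade (Gen, Range)

static class GeneratorComparison
{
    public static void Sketch(Gen<int> low, Gen<int> mid, Gen<int> big, Gen<int> large)
    {
        // Original: the list length can never shrink below 100, so a failing
        // case keeps 100+ elements and the shrink path stays long.
        var constantList = Gen.Choice(new Collection<Gen<int>> { low, mid, big, large })
            .List(Range.Constant(100, 200));

        // Revised: starts with large lists (Resize(150)) but the linear range
        // lets the length shrink towards 1, so the counterexample and the
        // recheck path end up much smaller.
        var linearList = Gen.Choice(new Collection<Gen<int>> { low, mid, big, large })
            .List(Range.LinearInt32(1, 300))
            .Resize(150);
    }
}
```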
A generator with a large sample space will typically take a long time to shrink. Is there a reason why you expect the shrinking for this one to be faster? |
Using the Linear range is faster (even with the mistakenly pasted higher upper value), and I think it is faster because visually you can tell less work is being done. Here is the recheck from the original line:

// Recheck from var chioce = Gen.Choice(new Collection<Gen<int>> { low, mid, big, large }).List(Range.Constant(100,200));
prop.Recheck("0_8474167946941752373_7006309460388453401_000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000");
// Originally from 1 test and 150-odd shrinks

As opposed to:

// Recheck from var chioce = Gen.Choice(new Collection<Gen<int>> { low, mid, big, large }).List(Range.LinearInt32(1,300)).Resize(150);
prop.Recheck("0_13894515488252835539_14133286922830653955_00000000");
// Originally 1 test and 9-ish shrinks

The link in the top comment takes you to a file with a better, more realistic example; in the above example the speed difference only moves from fast to starting to be irritating. In the more realistic example the correct generator still takes about 8-10 seconds to recheck, whereas using the constant list generator takes around 60+ seconds. For context, the linked example is creating a list of commands to exercise a web interaction (a simple checkout). I wanted the list to be large to start off with because I'm throwing lots of the commands away, since at any point in the checkout process the generated command may be invalid. I realized once I posted the issue that the long shrunken list size was because I was using a constant range, and I changed to the linear range with a large resize value so as to start off with a large list. Doing this improved both the shrunken example and the recheck speed. |
That is exactly as intended. I see no evidence of a bug and weak evidence of a performance issue. What behavior did you expect? |
You are correct; in this case using a Linear shrink has solved the issue.

That is correct as long as we always expect users to be able to shrink the result to a small array/object. If we do not, and they are using a generator which is more complicated than the ones presented above, then we don't have a bug but there is a performance issue.

I think one of these would be desirable:

"Rechecking re-runs the property for the failing example Hedgehog has found. In most cases property execution is really fast, but in some circumstances it may take a short while before your property is executed, while Hedgehog calculates the test data from the recheck string. This usually occurs when the shrunken value is large, often indicated by a large recheck string. This is a known issue and we are looking into it." |
That is not possible. No matter how fast our code is, a user can always make a generator so complicated that it takes a long time to recheck. I'm not sure about your other two bullet points. Instead, I want to focus on what @dharmaturtle found.
Definitely. This is a performance issue. I think the only thing that
@dharmaturtle, can you share a branch and instructions to reproduce the slow recheck? |
master...dharmaturtle:fsharp-hedgehog:recheckPerf

A basic |
@dharmaturtle and I think we found the reason for the worse performance. Consider this line: fsharp-hedgehog/src/Hedgehog/Property.fs, Line 178 in 253634b
The call to |
Also changes the serialized form of the shrink path. Previously, it was a string of 0's and 1's. Now, it is a sequence of dash-separated integers. In terms of the new serialization, the integers in the new sequence count the number of 1's between each consecutive pair of 0's.
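To make the described encoding concrete, here is a small hypothetical helper; it is not Hedgehog's actual code, just an illustration of the run-length idea applied to the shrink-path segment of a recheck string (the part after the last underscore).

```csharp
// Hypothetical illustration of the serialization change described above;
// not Hedgehog's implementation.
using System;
using System.Linq;

static class ShrinkPathFormat
{
    // Old form -> new form: "0110100" splits on '0' into "", "11", "1", "", "",
    // and the lengths of those runs give "0-2-1-0-0".
    public static string ToDashSeparated(string zerosAndOnes) =>
        string.Join("-", zerosAndOnes.Split('0').Select(run => run.Length));

    // New form -> old form: expand each count back into a run of 1's and
    // re-insert the separating 0's.
    public static string ToZerosAndOnes(string dashSeparated) =>
        string.Join("0", dashSeparated.Split('-').Select(n => new string('1', int.Parse(n))));
}
```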
@dharmaturtle, can you test this branch? The first, second, and fourth commits change |
Hm, I disagree that this commit is obviously more efficient. Yes, you delay the construction of

I added a

So recheck isn't always faster than check, but it's significantly better than before. NCrunch thinks I eta-expanded

My first thought was that this was due to reversing the shrinkPath (you gotta iterate the whole list to get to the head). However, this is on Node's

I'm not sure if we should try to eliminate these hotspots; just mentioning them in case it sparks something in your mind. Roughly equal check/recheck perf seems fine. |
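For context on the shrinkPath-reversal remark above: the path is presumably held in an immutable F# list, so reversing it (or reaching its far end) means walking every cell. A toy sketch of that cost, not Hedgehog's code:

```csharp
#nullable enable
// Toy cons list standing in for the F# list the shrink path presumably lives in.
// Reversing it touches every cell, so a long shrink path (like the ~150-step one
// quoted earlier) pays a linear cost on every reversal.
sealed record Cons<T>(T Head, Cons<T>? Tail)
{
    public static Cons<T>? Reverse(Cons<T>? list)
    {
        Cons<T>? acc = null;
        for (var cell = list; cell != null; cell = cell.Tail)
            acc = new Cons<T>(cell.Head, acc); // one hop + one allocation per element
        return acc;
    }
}
```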
I like the tests. We could either keep the first and assert something like

When you take screenshots of code, always include the line numbers.

That

That

They are slow for the same reason, so this branch fixes the performance issue of |
Stupid me. I should have created a PR in which to have this review. |
While trying to create a toy example of using Hedgehog to generate test cases to exercise an FSM, in this case a very simple web journey, I noticed that running a recheck with the provided seed was slow (around 6-10 seconds). The code below simulates the problem, but the wait is more like a couple of seconds:

Initially posted in #419 as the problem seemed similar.