Aggregation fuzz testing #12114
Comments
I agree SQLancer is not the best choice for aggregation-specific fuzzing (though doable), due to:
So for now I plan to cover more SQL features and look for bugs that are easy to identify and fix; configuration fuzzing is a lower priority for SQLancer, so I think Rust-level fuzzing is the better fit. Besides, I think we can also pick some comprehensive aggregation queries for SQL-level fuzzing (fixed SQL + random config, checking that the query always gives the same result under different configs).
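To make the "fixed SQL + random config" idea concrete, here is a minimal sketch that uses only the Rust standard library rather than DataFusion itself. It assumes the "config" is just a batch size, and the function names (`sum_by_group`, `sum_by_group_batched`, `mismatching_batch_sizes`) are hypothetical stand-ins, not DataFusion APIs.

```rust
use std::collections::BTreeMap;

/// Reference path: aggregate every row in a single pass.
fn sum_by_group(rows: &[(u32, i64)]) -> BTreeMap<u32, i64> {
    let mut acc = BTreeMap::new();
    for &(key, value) in rows {
        *acc.entry(key).or_insert(0) += value;
    }
    acc
}

/// Configurable path: aggregate each batch on its own, then merge the
/// partial sums. `batch_size` must be non-zero.
fn sum_by_group_batched(rows: &[(u32, i64)], batch_size: usize) -> BTreeMap<u32, i64> {
    let mut merged = BTreeMap::new();
    for batch in rows.chunks(batch_size) {
        for (key, partial) in sum_by_group(batch) {
            *merged.entry(key).or_insert(0) += partial;
        }
    }
    merged
}

/// Returns the batch sizes whose result differs from the reference.
/// An empty vector means every configuration agrees.
fn mismatching_batch_sizes(rows: &[(u32, i64)], batch_sizes: &[usize]) -> Vec<usize> {
    let expected = sum_by_group(rows);
    batch_sizes
        .iter()
        .copied()
        .filter(|&size| sum_by_group_batched(rows, size) != expected)
        .collect()
}

fn main() {
    // Fixed "query" input: 100 rows spread over 7 groups.
    let rows: Vec<(u32, i64)> = (0..100u32).map(|i| (i % 7, i as i64)).collect();
    let bad = mismatching_batch_sizes(&rows, &[1, 2, 3, 8, 64, 1024]);
    assert!(bad.is_empty(), "results differ for batch sizes {:?}", bad);
    println!("all batch sizes produced the same result");
}
```

In a real test the two functions would presumably be replaced by running the same query under different session settings; the comparison step stays the same.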
I am also curious what the compatibility matrix is for all the aggregation optimizations (for example, can skip-partial-aggregation and external aggregation be triggered in the same execution, and likewise for all other combinations).
In my knowledge, it may be:
I think we could first run the basic aggregation without any optimizations enabled and use its output as the expected result,
Yes, I think that is likely a good plan. In my mind, as long as all the code paths get the same answer, that will increase our confidence that the system is computing correct results in the different places.
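As a rough illustration of "baseline first, then every code path must agree", the sketch below enumerates all on/off combinations of two switches and compares each against an unoptimized single-pass count. It is a toy model in plain Rust, not DataFusion code: `AggOptions`, its two fields, the batch size, and the spill threshold are all assumptions made for the example.

```rust
use std::collections::BTreeMap;

type Counts = BTreeMap<u32, u64>;

/// Hypothetical switches standing in for aggregation optimizations.
#[derive(Debug, Clone, Copy, PartialEq)]
struct AggOptions {
    skip_partial: bool, // feed rows straight into the final state
    spill: bool,        // set the state aside when it grows too large
}

/// Every on/off combination of the two switches (2^2 = 4 entries).
fn all_options() -> Vec<AggOptions> {
    (0..4u8)
        .map(|bits| AggOptions {
            skip_partial: (bits & 1) != 0,
            spill: (bits & 2) != 0,
        })
        .collect()
}

/// Baseline: one pass over all rows, no batching, no spilling.
fn baseline_count(rows: &[u32]) -> Counts {
    let mut counts = Counts::new();
    for &key in rows {
        *counts.entry(key).or_insert(0) += 1;
    }
    counts
}

fn merge_into(target: &mut Counts, source: Counts) {
    for (key, n) in source {
        *target.entry(key).or_insert(0) += n;
    }
}

/// Batched count whose code path depends on `options`.
fn count_with_options(rows: &[u32], options: AggOptions) -> Counts {
    const BATCH_SIZE: usize = 4; // assumed, for illustration only
    const SPILL_THRESHOLD: usize = 2; // assumed, for illustration only
    let mut state = Counts::new();
    let mut spilled: Vec<Counts> = Vec::new();
    for batch in rows.chunks(BATCH_SIZE) {
        if options.skip_partial {
            for &key in batch {
                *state.entry(key).or_insert(0) += 1;
            }
        } else {
            // Pre-aggregate the batch, then merge the partial counts.
            merge_into(&mut state, baseline_count(batch));
        }
        if options.spill && state.len() > SPILL_THRESHOLD {
            // Move the current state to the spill list and start fresh.
            spilled.push(std::mem::take(&mut state));
        }
    }
    for run in spilled {
        merge_into(&mut state, run);
    }
    state
}

/// Option combinations whose output differs from the baseline.
fn disagreeing_options(rows: &[u32]) -> Vec<AggOptions> {
    let expected = baseline_count(rows);
    all_options()
        .into_iter()
        .filter(|&options| count_with_options(rows, options) != expected)
        .collect()
}

fn main() {
    let rows: Vec<u32> = vec![3, 1, 3, 2, 9, 9, 1, 4, 4, 4, 7];
    let bad = disagreeing_options(&rows);
    assert!(bad.is_empty(), "options disagreeing with baseline: {:?}", bad);
    println!("all {} combinations match the baseline", all_options().len());
}
```

Enumerating the full matrix also surfaces combinations that cannot be triggered together, which a real harness might report separately instead of silently skipping.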
OK, maybe just start by making a simple sketch, and try to implement the current aggregation fuzz tests on top of it? I can give it a try, and help push forward enabling #11943 by default,
Thank you -- that would be awesome. I can't keep up anymore with everything that is going on. In terms of helping along DataFusion performance, my plan was to focus first on getting StringView enabled and then switch to focusing more on the blocked intermediate state. I will, however, prioritize time for reviewing aggregation testing, as I think testing in general is really important for DataFusion.
take
@Rachelint has made a great start here: #12667. What would you suggest the next steps here be, @Rachelint? Do you want to fill out the coverage? Would it be helpful if I did?
I personally plan to first implement some necessary features of the framework, like:
It will surely be helpful! Help with this work would be very welcome.
I will try and do so over the next few days. Thanks @Rachelint
This fuzzer framework looks great!
And I want to work on this feature if nobody else has taken it. @Rachelint
Thanks a lot, please feel free to take it.
Update here:
Here is a list of additional coverage I think is needed
@LeslieKid perhaps you could make a PR based on #12847 for one of those items (StringView, Decimal, or Date types would be super great)
OK! I will work on adding some new types to this framework in the next few days. And I think maybe we can introduce a new trait named
That would be great 🙏
@LeslieKid added time / interval / decimal / utf8view in #13226. Additional types that would be good to cover are:
Any chance you are interested in doing that too, @LeslieKid? If not, no worries: I can file a ticket, and I bet others can follow your good example.
🤔The
Sorry, I'm unable to take that on at the moment. I think filing a ticket is a good way forward. Thanks @alamb
Now that we have most of the data types filled in, perhaps we can start adding coverage for the other aggregation functions. Like
This sounds good. I can probably start working on some of them. Should we aim to cover the entire list of aggregate functions in DataFusion?
Is your feature request related to a problem or challenge?
While reviewing #11943 from @Rachelint it is becoming clear to me that the hash aggregate code is now pretty sophisticated and I am not sure our testing has kept up. In fact I couldn't come up with a great way to systematically test the new code added in #11943
Also, the code in #11627 from @korowa for skipping partial aggregates has a similar problem, as it is not invoked. There is also code for streaming and partial streaming group by.
All this code has unit tests, but I am not confident that all the combinations are checked. For example the code paths are affected by:
Describe the solution you'd like
I would like a more systematic way to test this code, to ensure our current code is correct but also to ensure that future changes do not introduce subtle, hard-to-debug regressions / wrong results.
Describe alternatives you've considered
What I think would be good is a test framework that:
Parameters to randomly vary for each input:
Test cases:
1. Types of the group keys
2. Single/multiple column groups
3. Number of groups (low/high cardinality)
4. Different aggregates
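One way such a framework could generate its inputs is sketched below in plain Rust. It is only an illustration under stated assumptions: the `FuzzCase` fields (batch size, sortedness, group cardinality) are picked from the discussion here, the small `Lcg` generator stands in for a real random number crate, and none of the names come from DataFusion.

```rust
use std::collections::BTreeMap;

/// Tiny deterministic generator (64-bit LCG) so that a failing case
/// can be replayed from its seed. Not suitable for anything but tests.
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0 >> 33
    }

    /// Value in `0..bound`; `bound` must be non-zero.
    fn below(&mut self, bound: u64) -> u64 {
        self.next_u64() % bound
    }
}

/// One randomly chosen input plus the parameters it was built with.
#[derive(Debug)]
struct FuzzCase {
    seed: u64,
    batch_size: usize,
    sorted: bool,
    keys: Vec<u64>,
}

fn generate_case(seed: u64, num_rows: usize) -> FuzzCase {
    let mut rng = Lcg(seed);
    // Low cardinality: 4 groups. High cardinality: up to one group per row.
    let high_cardinality = rng.below(2) == 1;
    let num_groups = if high_cardinality { num_rows.max(1) as u64 } else { 4 };
    let batch_size = 1 + rng.below(64) as usize; // 1..=64
    let sorted = rng.below(2) == 1;
    let mut keys: Vec<u64> = (0..num_rows).map(|_| rng.below(num_groups)).collect();
    if sorted {
        keys.sort_unstable();
    }
    FuzzCase { seed, batch_size, sorted, keys }
}

/// Expected result: count per group in one pass.
fn count_all(keys: &[u64]) -> BTreeMap<u64, u64> {
    let mut counts = BTreeMap::new();
    for &key in keys {
        *counts.entry(key).or_insert(0) += 1;
    }
    counts
}

/// Result under test: count per batch, then merge the partial counts.
fn count_batched(keys: &[u64], batch_size: usize) -> BTreeMap<u64, u64> {
    let mut merged = BTreeMap::new();
    for batch in keys.chunks(batch_size) {
        for (key, n) in count_all(batch) {
            *merged.entry(key).or_insert(0) += n;
        }
    }
    merged
}

fn main() {
    for seed in 0..100 {
        let case = generate_case(seed, 500);
        assert_eq!(
            count_batched(&case.keys, case.batch_size),
            count_all(&case.keys),
            "mismatch for seed {} (batch_size={}, sorted={})",
            case.seed, case.batch_size, case.sorted
        );
    }
    println!("100 generated cases agree");
}
```

Printing the seed and parameters on failure is the main design point: a wrong result found by a random run can then be reproduced exactly.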
Additional context
We also have some great sql fuzz coverage in https://github.com/datafusion-contrib/datafusion-sqlancer from @2010YOUY01, but I think that focuses on the queries themselves, rather than the setup (block size, input order, etc)
Existing aggregate coverage in datafusion core
fuzz test (cargo test --test fuzz)
datafusion/datafusion/core/tests/fuzz_cases/distinct_count_string_fuzz.rs
Lines 33 to 34 in 3c2b542
datafusion/datafusion/core/tests/fuzz_cases/aggregate_fuzz.rs
Lines 48 to 49 in e088945
Subtasks
Timestamp, Binary and Float #13279