[BUG] Segmentation fault in cudf::conditional_inner_join with Zero-Row Input #16066

Closed
aocsa opened this issue Jun 24, 2024 · 3 comments · Fixed by #16094
Labels
bug Something isn't working libcudf Affects libcudf (C++/CUDA) code.

Comments

@aocsa
Contributor

aocsa commented Jun 24, 2024

Describe the bug

There is a bug in cuDF where cudf::conditional_inner_join triggers a segmentation fault when one of the input tables has zero rows (num_rows = 0). The crash occurs while the join itself is executing.

Steps/Code to reproduce bug

  • Create two cuDF DataFrames, one with zero rows.
  • Perform cudf::conditional_inner_join.

--- a/cpp/tests/join/conditional_join_tests.cu
+++ b/cpp/tests/join/conditional_join_tests.cu
@@ -222,7 +222,10 @@ struct ConditionalJoinPairReturnTest : public ConditionalJoinTest<T> {
              std::vector<std::pair<cudf::size_type, cudf::size_type>> expected_outputs)
   {
     auto result_size = this->join_size(left, right, predicate);
-    EXPECT_TRUE(result_size == expected_outputs.size());
+    std::cerr << "result_size: " << result_size << std::endl;
+    std::cerr << "expected_outputs: " << expected_outputs.size() << std::endl;
+    // EXPECT_TRUE(result_size == expected_outputs.size());
     auto result = this->join(left, right, predicate);
     std::vector<std::pair<cudf::size_type, cudf::size_type>> result_pairs;
 };
 
+
+TYPED_TEST(ConditionalInnerJoinTest, TestLeftColumnIsEmpty)
+{
+  this->test({{}}, {{0}}, left_zero_eq_right_zero, {{}});
+};
+
+TYPED_TEST(ConditionalInnerJoinTest, TestRightColumnIsEmpty)
+{
+  this->test({{0}}, {{}}, left_zero_eq_right_zero, {{}});
+};
+
result_size: 0
expected_outputs: 1
Segmentation fault (core dumped)

Note: cudf::conditional_inner_join_size works fine, but cudf::conditional_inner_join ends with a segmentation fault.

Expected behavior

Return an empty DataFrame or a specific error.

Environment details

Method of cuDF install: source code
v24.06.00 branch release

@aocsa aocsa added the bug Something isn't working label Jun 24, 2024
@lithomas1 lithomas1 added the libcudf Affects libcudf (C++/CUDA) code. label Jun 24, 2024
@bdice
Contributor

bdice commented Jun 24, 2024

@aocsa Thank you so much for the tests to reproduce this failure. I know that is a lot of work but it helps us tremendously.

I will start investigating this.

@bdice
Contributor

bdice commented Jun 25, 2024

@aocsa I did some investigation. First, the tests seem to cover the empty case already:

TYPED_TEST(ConditionalInnerJoinTest, TestOneColumnLeftEmpty)
{
  this->test({{}}, {{3, 4, 5}}, left_zero_eq_right_zero, {});
};

Also, the test cases you provided seem to have a bug. Note that the expected value in your test cases ({{}}) initializes to {{0, 0}}: a vector holding a single pair of indices, rather than an empty vector (which is what an empty join result should be). Here's a minimal reproducer: https://godbolt.org/z/ErndbYK4v
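For readers who cannot open the link, the difference between {} and {{}} can be sketched in plain C++ without cuDF. This is an illustration only, not the linked reproducer: size_type is a stand-in alias for cudf::size_type, and count_expected and check_initializers are hypothetical helpers that mimic how a test function receives its expected-output argument.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Assumption: stand-in for cudf::size_type so the sketch builds without cuDF.
using size_type   = int;
using index_pairs = std::vector<std::pair<size_type, size_type>>;

// Hypothetical helper: the parameter is copy-list-initialized from whatever
// braced list the caller passes, like the expected_outputs test argument.
std::size_t count_expected(index_pairs expected) { return expected.size(); }

// Hypothetical self-check covering both spellings.
void check_initializers()
{
  // {} value-initializes the vector, so it has no elements.
  assert(count_expected({}) == 0);

  // {{}} selects the std::initializer_list constructor with one element;
  // the inner {} value-initializes a pair, which is (0, 0).
  index_pairs one{{}};
  assert(one.size() == 1);
  assert(one[0].first == 0 && one[0].second == 0);
}
```

With {{}} the test therefore expects one output pair, which matches the "expected_outputs: 1" line printed in the report.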

If I fix your test cases to use {}, I do not get any test failures or segfaults.

For now, I have identified some improvements (additional missing test cases, etc.) and I am trying to track down how the test case you proposed (with {{}}) got to the point of causing a segfault. I will follow up with anything else I discover and will file a PR for those improvements.

@bdice
Contributor

bdice commented Jun 25, 2024

I found a bug that would cause a segfault and have fixed it in #16094. When the right table has zero rows, conditional left anti-joins were returning a vector of indices containing garbage data. This is now corrected.
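As a point of reference for the expected result, here is a host-side sketch of left anti-join semantics. It is not libcudf's implementation or API: reference_left_anti_join, the int-valued columns, and the std::function predicate are all assumptions made for illustration. A left anti-join keeps the left rows that match no right row, so with a zero-row right table every left row index should be returned.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Assumption: stand-in for cudf::size_type so the sketch builds without cuDF.
using size_type = int;

// Hypothetical CPU reference: returns the indices of left rows for which the
// predicate is false against every right row.
std::vector<size_type> reference_left_anti_join(
  std::vector<int> const& left,
  std::vector<int> const& right,
  std::function<bool(int, int)> const& predicate)
{
  std::vector<size_type> result;
  for (std::size_t i = 0; i < left.size(); ++i) {
    bool matched = false;
    for (int r : right) {
      if (predicate(left[i], r)) {
        matched = true;
        break;
      }
    }
    // With an empty right table the inner loop never runs, so every row is kept.
    if (!matched) { result.push_back(static_cast<size_type>(i)); }
  }
  return result;
}

// Hypothetical self-check for the zero-row right table case.
void check_empty_right()
{
  auto const eq = [](int l, int r) { return l == r; };
  assert(reference_left_anti_join({0, 1, 2}, {}, eq) ==
         (std::vector<size_type>{0, 1, 2}));
}
```

Comparing a GPU result against a reference like this in a test makes garbage indices show up as an ordinary assertion failure rather than a crash later on.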

That PR will be evaluated for inclusion in a 24.06.01 hotfix release with some other fixes -- to be determined.

@rapids-bot rapids-bot bot closed this as completed in 65b64f6 Jun 26, 2024
bdice added a commit to bdice/cudf that referenced this issue Jun 26, 2024
Closes rapidsai#16066.

I found a bug that would cause the reported segfault and have fixed it in this PR. When the right table has zero rows, conditional left anti-joins were returning a vector of indices containing garbage data.

Along the way, I refactored several parts of the conditional join tests and added coverage for more cases.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Yunsong Wang (https://github.com/PointKernel)

URL: rapidsai#16094