Enable the predicate pushdown pytest affected by Arrow v19 stats incompatibility bug. #17806

Open
mhaseeb123 opened this issue Jan 24, 2025 · 0 comments
Assignees
Labels
0 - Blocked Cannot progress due to external reasons

Comments

@mhaseeb123
Member

Enable the disabled portion of the test_parquet_bloom_filters pytest, which currently aborts because of an Arrow v19 incompatibility with statistics (apache/arrow#45283).

The Arrow bug has been fixed by PR apache/arrow#45285, but no Arrow release contains that fix yet. Once such a release is available and cuDF is bumped to it, we can revert the pytest to the following:

def test_parquet_bloom_filters(
    datadir, stats_fname, bloom_filter_fname, predicate, expected_len
):
    fname_bf = datadir / bloom_filter_fname
    df_bf = cudf.read_parquet(fname_bf, filters=predicate).reset_index(
        drop=True
    )

    fname_stats = datadir / stats_fname
    df_stats = cudf.read_parquet(fname_stats, filters=predicate).reset_index(
        drop=True
    )

    # Check if tables equal
    assert_eq(
        df_stats,
        df_bf,
    )
    # Check for table length
    assert_eq(
        len(df_bf),
        expected_len,
    )

See comment by @mhaseeb123 in #17422 (comment) for more details.
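
Until cuDF is bumped, one option is to gate the disabled portion on the installed Arrow version instead of removing it. The block below is only a minimal sketch of such a gate. `parse_version` and `arrow_has_stats_fix` are hypothetical helpers, not existing cuDF or Arrow APIs. The first fixed version is left as a parameter because this issue does not name a release containing the fix.

```python
import re


def parse_version(version: str) -> tuple[int, ...]:
    """Return the leading numeric components of a version string.

    Suffixes such as ".dev0" or "+local" are ignored, so a pre-release of
    the fixed version is treated like the fixed version itself.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def arrow_has_stats_fix(installed: str, first_fixed: str) -> bool:
    """Hypothetical helper: True if `installed` is at or after `first_fixed`.

    `first_fixed` must be supplied by the caller; it is an assumption until
    an Arrow release containing apache/arrow#45285 actually exists.
    """
    return parse_version(installed) >= parse_version(first_fixed)
```

In the test module, the result of `arrow_has_stats_fix(pyarrow.__version__, first_fixed)` could feed a `pytest.mark.skipif` condition on the stats comparison, so that the full test above re-enables itself once the environment picks up a fixed Arrow.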

@mhaseeb123 mhaseeb123 added the 0 - Blocked Cannot progress due to external reasons label Jan 24, 2025
@mhaseeb123 mhaseeb123 self-assigned this Jan 24, 2025