Remove cudf._lib.json in favor of inlining pylibcudf #17443

Merged: 4 commits, Nov 28, 2024

Conversation

mroeschke

Description

Contributes to #17317

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

mroeschke added the labels Python (Affects Python cuDF API), improvement (Improvement / enhancement to an existing function), and non-breaking (Non-breaking change) on Nov 26, 2024
mroeschke self-assigned this on Nov 26, 2024
mroeschke requested a review from a team as a code owner on November 26, 2024, 00:20
github-actions bot added the CMake (CMake build issue) label on Nov 26, 2024

@Matt711 Matt711 left a comment

Just a couple of questions

python/cudf/cudf/io/json.py (outdated, resolved)
python/cudf/cudf/io/json.py (resolved)
Comment on lines 35 to 49
def _update_col_struct_field_names(
    col: ColumnBase, child_names: dict
) -> ColumnBase:
    if col.children:
        children = list(col.children)
        for i, (child, names) in enumerate(
            zip(children, child_names.values())
        ):
            children[i] = _update_col_struct_field_names(child, names)
        col.set_base_children(tuple(children))

    if isinstance(col.dtype, cudf.StructDtype):
        col = col._rename_fields(child_names.keys())  # type: ignore[attr-defined]

    return col

I guess it is OK that we update names in-place, but it is a minor potential footgun for two columns that share data (but in theory could have different names).
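
The aliasing concern raised here can be illustrated with a small, self-contained sketch. It does not use cuDF: `ToyColumn` and `rename_in_place` are hypothetical stand-ins that only mimic the recursive, mutating shape of the helper under review, assuming a parent column holds references to its child objects.

```python
from dataclasses import dataclass, field


@dataclass
class ToyColumn:
    """Hypothetical stand-in for a struct column (not cudf's ColumnBase)."""

    field_names: list
    children: list = field(default_factory=list)


def rename_in_place(col: ToyColumn, names: dict) -> ToyColumn:
    # Recurse into the children first, then rename this level, mutating `col`.
    for child, child_names in zip(col.children, names.values()):
        rename_in_place(child, child_names)
    col.field_names = list(names.keys())
    return col


# Two parents with their own names that share one child object.
leaf = ToyColumn(field_names=[])
shared_child = ToyColumn(field_names=["a"], children=[leaf])
left = ToyColumn(field_names=["s"], children=[shared_child])
right = ToyColumn(field_names=["s"], children=[shared_child])

rename_in_place(left, {"outer": {"inner": {}}})

# `right` was never passed in, yet its child now carries the new name.
print(left.field_names, right.field_names, right.children[0].field_names)
```

In this toy model `right` keeps its own top-level name, but the nested name it observes changes because the child object is shared. Whether the same situation can arise in cuDF depends on how columns share children in practice, which this sketch does not model.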

@Matt711 Matt711 left a comment

One comment and one suggestion. Both are non-blocking

compression="infer",
byte_range=None,
keep_quotes=False,
byte_range: None | list[int] = None,

Very nit-picky. Do you have a preference?

Suggested change
- byte_range: None | list[int] = None,
+ byte_range: list[int] | None = None,

I don't have a particular preference. I agree that your suggestion looks better.

) -> None:
    for name, child_names in child_names_dict.items():
        col = df._data[name]
        df._data[name] = _update_col_struct_field_names(col, child_names)

Just thinking out loud based on @wence-'s review. We probably don't want to copy eagerly, so maybe we can copy only when the column and its children share data?
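
One way to picture the "copy only if shared" idea is the sketch below. It is not cuDF code: `ToyColumn`, `rename_copy_if_shared`, and the caller-supplied `is_shared` predicate are all hypothetical, and how sharing would actually be detected for real columns is left open. The copy is shallow, so only the node's metadata is duplicated, never the underlying data.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToyColumn:
    """Hypothetical stand-in for a struct column (not cudf's ColumnBase)."""

    field_names: list
    children: list = field(default_factory=list)


def rename_copy_if_shared(
    col: ToyColumn, names: dict, is_shared: Callable[[ToyColumn], bool]
) -> ToyColumn:
    # Shallow-copy this node only when the predicate reports it as shared;
    # otherwise keep mutating in place to avoid an eager copy.
    target = copy.copy(col) if is_shared(col) else col
    children = list(col.children)
    for i, (child, child_names) in enumerate(zip(children, names.values())):
        children[i] = rename_copy_if_shared(child, child_names, is_shared)
    target.children = children
    target.field_names = list(names.keys())
    return target


leaf = ToyColumn(field_names=[])
shared_child = ToyColumn(field_names=["a"], children=[leaf])
left = ToyColumn(field_names=["s"], children=[shared_child])
right = ToyColumn(field_names=["s"], children=[shared_child])

# Assumption: the caller somehow knows which nodes are shared.
shared_ids = {id(shared_child)}
renamed = rename_copy_if_shared(
    left, {"outer": {"inner": {}}}, lambda c: id(c) in shared_ids
)

print(renamed.children[0].field_names, right.children[0].field_names)
```

Here the unshared parent is still updated in place, while the shared child is copied before renaming, so `right` continues to see the original nested name.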

@mroeschke
/merge

rapids-bot bot merged commit 9b88794 into rapidsai:branch-25.02 on Nov 28, 2024
105 checks passed
mroeschke deleted the cudf/_lib/json branch on November 28, 2024, 01:50
Labels
CMake CMake build issue improvement Improvement / enhancement to an existing function non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
Status: Done
3 participants