feat: add LazyFrame.unpivot for spark and duckdb #1890

Merged
merged 4 commits into from
Feb 2, 2025

Conversation

FBruzzesi
Member

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

@FBruzzesi FBruzzesi added enhancement New feature or request pyspark Issue is related to pyspark backend duckdb Issue is related to duckdb backend labels Jan 28, 2025
Comment on lines +417 to +423
if variable_name == "":
    msg = "`variable_name` cannot be empty string for duckdb backend."
    raise NotImplementedError(msg)

if value_name == "":
    msg = "`value_name` cannot be empty string for duckdb backend."
    raise NotImplementedError(msg)
Member Author

I tried, but duckdb does not seem to support empty strings for these names
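
For readers skimming the thread, here is a minimal sketch of the guard quoted above, pulled out into a standalone function so it can be tried in isolation. The helper name `validate_unpivot_names` and the `backend` parameter are hypothetical and not part of the PR; only the error messages and the `NotImplementedError` come from the diff.

```python
def validate_unpivot_names(
    variable_name: str, value_name: str, backend: str = "duckdb"
) -> None:
    """Reject empty column names, mirroring the guard in the diff.

    Hypothetical helper: the PR inlines these checks in the method body.
    """
    for label, name in (
        ("variable_name", variable_name),
        ("value_name", value_name),
    ):
        if name == "":
            msg = f"`{label}` cannot be empty string for {backend} backend."
            raise NotImplementedError(msg)


# Non-empty names pass silently; an empty one raises.
validate_unpivot_names("variable", "value")
try:
    validate_unpivot_names("", "value")
except NotImplementedError as exc:
    print(exc)
```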

@MarcoGorelli
Member

thanks, can you rebase please?

Member

@MarcoGorelli MarcoGorelli left a comment

awesome work, thanks!

@@ -312,3 +312,21 @@ def join(
        return self._from_native_frame(
            self_native.join(other, on=left_on, how=how).select(col_order)
        )

    def unpivot(
Member

wow, so simple!

)

if on_ is None:
    on_ = [c for c in self.collect_schema().names() if c not in index_]
Member Author

@FBruzzesi FBruzzesi Feb 2, 2025

@MarcoGorelli by pushing one level up, we (might) need column names 🙃

Member

ah sorry, I don't think we need to push this one up, just the str | list[str] ones
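
To make the `on_ is None` fallback in this thread concrete, here is a small pure-Python sketch of unpivot over a list of dict rows. It is not the narwhals, PySpark, or DuckDB implementation; `unpivot_rows` is a hypothetical helper, and the column-by-column output order is an assumption of this sketch. The point is only that, when `on` is omitted, it defaults to every column not listed in `index`, which is why the column names have to be known.

```python
from __future__ import annotations

from typing import Any


def unpivot_rows(
    rows: list[dict[str, Any]],
    index: list[str],
    on: list[str] | None = None,
    variable_name: str = "variable",
    value_name: str = "value",
) -> list[dict[str, Any]]:
    """Reshape wide rows into long rows (illustrative only)."""
    columns = list(rows[0]) if rows else []
    if on is None:
        # Same fallback as in the diff: all non-index columns.
        on = [c for c in columns if c not in index]
    out: list[dict[str, Any]] = []
    for col in on:
        for row in rows:
            record = {key: row[key] for key in index}
            record[variable_name] = col
            record[value_name] = row[col]
            out.append(record)
    return out


rows = [{"id": 1, "a": 10, "b": 20}, {"id": 2, "a": 30, "b": 40}]
for record in unpivot_rows(rows, index=["id"]):
    print(record)
```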

Member

@MarcoGorelli MarcoGorelli left a comment

thanks @FBruzzesi ! feel free to merge on green

got a question which occurred to me, we can always simplify later

Comment on lines +342 to +343
variable_name = variable_name if variable_name is not None else "variable"
value_name = value_name if value_name is not None else "value"
Member

🤔 I don't understand why the signature isn't variable_name: str = "variable" in Polars, am I missing something?

Member Author

Good point! Not sure why they do it like that
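
As a side note on the two lines under discussion, here is a minimal sketch of the `None`-sentinel pattern they use. The function name `resolve_names` is hypothetical. This does not claim to explain the Polars design; it only shows how the pattern behaves: an explicit `None` falls back to the default, while an empty string is kept as-is (and would then hit the duckdb guard elsewhere in this PR).

```python
from __future__ import annotations


def resolve_names(
    variable_name: str | None = None,
    value_name: str | None = None,
) -> tuple[str, str]:
    """Apply the same fallbacks as the two lines quoted above."""
    variable_name = variable_name if variable_name is not None else "variable"
    value_name = value_name if value_name is not None else "value"
    return variable_name, value_name


print(resolve_names())           # both defaults applied
print(resolve_names("k", None))  # explicit None still falls back
print(resolve_names("", "v"))    # empty string is not None, so it is kept
```

With a plain `variable_name: str = "variable"` signature, a caller forwarding an optional argument would have to avoid passing `None` explicitly; the sentinel form accepts it.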

@FBruzzesi FBruzzesi merged commit d48b8a3 into main Feb 2, 2025
22 of 23 checks passed
@FBruzzesi FBruzzesi deleted the feat/unpivot-spark-duckdb branch February 2, 2025 15:02
Labels
duckdb Issue is related to duckdb backend enhancement New feature or request pyspark Issue is related to pyspark backend
2 participants