We would like to learn about your use case. For example, if this feature is needed to adopt Narwhals in an open source project, could you please enter the link to it below?
I am abstracting a library for computing teaching metrics so that researchers can use their data processing library of choice. Narwhals seems like a good bet (also shout-out to @mikeckennedy for having you on the podcast!). I can't share the specific repository because it contains internal scripts, but this would be supporting CourseKata, a low-cost textbook platform dedicated to continuous improvement based on learning science principles.
Please describe the purpose of the new feature or describe the problem to solve.
I would like support for the polars.Expr.rank method. One example of how it could be used is to count how often an instructor teaches, given some grouping variable (window). In Polars it might look like this:
This would window over instructor_id and get the rank by academic_year. Essentially, we will get a count of how many academic years an instructor has taught in, and because we are using the "dense" ranking, teaching multiple classes in a year counts as a single year taught.
Suggest a solution if possible.
No response
If you have tried alternatives, please describe them below.
I could probably achieve this by making an intermediate data frame where I filter down academic_year using unique(), and then make some kind of counter variable based on instructor_id, and then join() that back to the initial table.
Instead I would rather just go back to using Polars until this feature is supported (if it is on your roadmap!).
Additional information that may help us understand your needs.
No response
Hey @adamblake, thanks for the feature request. This is definitely in scope 👌 We are currently finalizing an integration, but we will get back to expanding the API soon 😁
Hey @adamblake, I started to take a look. For context, we will be able to fully support rank for pandas and Polars, while for pyarrow there could be some shortcomings. Namely:
method="average", the default method in Polars, is the only one not supported in pyarrow
pyarrow TableGroupBy.aggregate does not support ranking in any form. I see in your example that you would like to use rank in an over context, which for pandas and pyarrow is equivalent to performing a group by and a join; therefore this won't be supported for pyarrow.
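For reference, a small sketch of what the grouped dense rank looks like in plain pandas. The sample data is made up, and this is not necessarily how Narwhals implements it internally:

```python
import pandas as pd

# Hypothetical sample data matching the use case in the request.
df = pd.DataFrame(
    {
        "instructor_id": ["a", "a", "a", "b", "b"],
        "academic_year": [2021, 2021, 2022, 2020, 2023],
    }
)

# Group-by rank in pandas: the result is aligned with the original index,
# so it can be assigned straight back as a new column.
df["years_taught"] = df.groupby("instructor_id")["academic_year"].rank(method="dense")
```

Note that pandas returns the ranks as floats here, whereas Polars returns integers for the "dense" method.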