GDS Projection from polars / pandas dataframe or arrow table #654
Comments
Duplicate of #653 |
While projection and export are two distinct features in GDS and in their representation in the GDS Python Client, the question of which DataFrame libraries they accept is seen as a global integration. If we added support for Polars, it should apply to both export and projections. For now, the same workaround of converting to/from pandas data frames will assist workflows based on Polars. My discussion in the other issue applies similarly for projection, where we make use of |
Since Polars, pandas (at least recent versions), GDS, and much else on the market all share one thing in common, the Apache Arrow format, would it be a solution to simply import and export (optionally or as the default behavior) in Arrow table format? This would make GDS agnostic to the engine processing the data before it is shipped into or out of GDS. Thanks |
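It is a possibility. But the pyarrow.Table type is not as ubiquitous as the pandas.DataFrame type. It is nice to have DataFrame in the API. But we can have a polymorphic parameter set, and allow passing in pyarrow.Table objects directly. It would be some work to accomplish, but it would be possible. |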
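As a rough illustration of the polymorphic idea, here is a minimal sketch of a caller-side shim. The helper name `as_pandas` is hypothetical and not part of the GDS Python Client; it only relies on the fact that both polars.DataFrame and pyarrow.Table expose a `to_pandas()` method, and it assumes the client keeps accepting pandas.DataFrame.

```python
from typing import Any


def as_pandas(frame: Any) -> Any:
    """Hypothetical helper, not part of the GDS Python Client.

    polars.DataFrame and pyarrow.Table both expose a to_pandas() method,
    so duck typing on that method covers both without importing either
    library here. Anything else (for example a pandas.DataFrame) is
    passed through unchanged.
    """
    to_pandas = getattr(frame, "to_pandas", None)
    if callable(to_pandas):
        # Polars frame or Arrow table: convert to a pandas.DataFrame.
        return to_pandas()
    # Already pandas (or something this sketch does not know about).
    return frame


# Assumed usage, with `nodes` and `rels` being polars frames or Arrow tables
# and `gds` an existing GraphDataScience client object:
#   G = gds.graph.construct("my-graph", as_pandas(nodes), as_pandas(rels))
```

Note that `polars.DataFrame.to_pandas()` itself needs pandas and pyarrow installed, so this is the same pandas workaround described above, just wrapped so the calling code does not care which engine produced the data.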
At the moment it's probably not critical, but over time, if there are
multiple engines on the market using Apache Arrow, it could be a quicker way
to make GDS agnostic to these engines.
Polars and pandas are for now the main ones I know, with Polars seriously
kicking the ass of pandas.
…On Tue, Jun 11, 2024 at 6:27 AM Mats Rydberg ***@***.***> wrote:
It is a possibility. But the pyarrow.Table type is not as ubiquitous as
the pandas.DataFrame type. It is nice to have DataFrame in the API.
But we can have a polymorphic parameter set, and allow passing in
pyarrow.Table objects directly. It would be some work to accomplish, but
it would be possible.
|
The new multi-engine Polars dataframe library is absolutely a must in the data industry.
After using it for months, the performance benefits are insane. Adios pandas, your time has come.
Allowing GDS to export to and create projections from Polars dataframes would be natural today (at least once Polars is out of alpha).
Even better, since GDS is based on Apache Arrow, I think it would make sense for GDS to create projections directly from an Arrow table. This would make it agnostic to the engine processing the data.