Auto-wrapper for vaex? #1026
Comments
I'm no expert on Vaex (had it in my sights for a while now but never had the time to explore it much);
Let me know if I'm off-course on this! A nice first step I think would be to have a working POC for a very simple pyjanitor function (say, |
All-round great pointers, @thatlittleboy! Yes, I agree a small prototype might be a good starting point. I might take my time on this one, as it is fairly low-priority in the grand scheme of things; our effort on #972 is currently more important. On the specific questions you raised:
Not sure here either. I guess a prototype built on the most idiosyncratic pandas functions would be the way to find out!
From what I see, the Ah. maybe the dataframe accessors might work.. worth a shot regardless. I can't quite tell just by looking 😄 Agreed on the point on the need to wrap if we somehow get the Vaex extensions to work; I'm strongly of the opinion we need to keep the internal (pyjanitor) API consistent, regardless of the DataFrame type. |
Brief Description
I recently read the vaex docs, and the library looks quite promising for highly scalable dataframe computation. I'd like to kickstart a discussion on what it might take to support vaex with pyjanitor.
Wanted to also make explicit that we don't have to decide on yes/no for this idea!
The vaex docs are available here. Because a lot of our underlying codebase operates on the pandas API, and vaex is supposed to be dataframe API compatible, it appears to me that we should be able to automagically wrap functions in our top-level API and have them "just work" under the df.func namespace.
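To make the "auto-wrap" idea concrete, here is a minimal sketch of the registration mechanism, assuming the target class exposes a pandas-like `columns` attribute and `rename` method. Everything in it is hypothetical: `register_method` is an illustrative helper, `ToyFrame` is a stand-in class (not vaex or pandas), and `clean_names` is a simplified toy version rather than pyjanitor's actual implementation. Whether real vaex DataFrames can be extended this way is exactly what a prototype would need to verify.

```python
def register_method(*frame_types):
    """Attach the decorated function as a method on each given class.

    Hypothetical helper for illustration only; not an existing
    pyjanitor or vaex API.
    """
    def decorator(func):
        for frame_type in frame_types:
            # Functions set on a class become bound methods, so
            # `df.func(...)` passes `df` as the first argument.
            setattr(frame_type, func.__name__, func)
        return func
    return decorator


class ToyFrame:
    """Stand-in for a DataFrame-like class (assumption: pandas-like API)."""

    def __init__(self, data):
        self.data = dict(data)

    @property
    def columns(self):
        return list(self.data)

    def rename(self, mapping):
        # Return a new frame with column names replaced per `mapping`.
        return ToyFrame({mapping.get(k, k): v for k, v in self.data.items()})


@register_method(ToyFrame)
def clean_names(df):
    """Toy version: strip, lowercase, and snake_case the column names."""
    mapping = {c: c.strip().lower().replace(" ", "_") for c in df.columns}
    return df.rename(mapping)


df = ToyFrame({"First Name": [1, 2], " Age ": [3, 4]})
cleaned = df.clean_names()
print(cleaned.columns)
```

The function body only touches `columns` and `rename`, so in principle the same function could be registered on any class that implements those two members compatibly. Functions that rely on more idiosyncratic pandas behaviour would be the real test of how far this approach goes.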