Is there any interest in adding UUIDs? #45
Comments
I don’t really have plans for cyberpandas itself.
There have been some discussions about a numpy-backed extension array mixin, which would provide an implementation of things like take, but I don’t know the exact issue.
If you can’t find an issue you might open one in pandas itself.
… On Oct 16, 2020, at 17:12, Steve Simmons ***@***.***> wrote:
I was looking for a good, compact and efficient way to store UUIDs in Pandas DataFrames. The easy way is as columns of uuid.UUID objects (56 bytes each). Since UUIDs can be represented as 128 bits (16 bytes), it would be nice for a column to be a contiguous array.
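For illustration, the 16-byte layout could be sketched with plain NumPy as below. This is a minimal sketch, not cyberpandas code: `pack_uuids` and `unpack_uuids` are hypothetical helper names, and the `V16` (raw void) dtype is an assumed choice, picked because NumPy's `S16` bytes dtype drops trailing NUL bytes when values are read back.

```python
import uuid

import numpy as np


def pack_uuids(values):
    """Pack uuid.UUID objects into one contiguous array of 16-byte records."""
    raw = b"".join(u.bytes for u in values)
    # frombuffer returns a read-only view of `raw`; copy to get a writable array.
    return np.frombuffer(raw, dtype="V16").copy()


def unpack_uuids(packed):
    """Turn the 16-byte records back into uuid.UUID objects."""
    return [uuid.UUID(bytes=rec.tobytes()) for rec in packed]


ids = [uuid.uuid4() for _ in range(3)]
packed = pack_uuids(ids)
print(packed.dtype.itemsize, packed.nbytes)  # 16 bytes per record
print(unpack_uuids(packed) == ids)
```

The per-object overhead of `uuid.UUID` is not measured here; the sketch only shows that the packed column costs exactly 16 bytes per element and round-trips.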
As the cyberpandas IPv6 extension array also stores 128-bit IP addresses, I was thinking of leveraging the IPv6 work done here for UUIDs.
Then a future potential step would be to make an extension type that supports any numpy "Sn" fixed width field, with efficient implementations of the low level Pandas array operations, plus a mechanism to easily register various high-level representation and accessor methods (e.g. IPv6, UUID, and so forth).
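A rough sketch of what such a registration mechanism might look like, using only NumPy and the standard library. Everything named here (`REPRESENTATIONS`, `register`, `FixedWidthArray`) is hypothetical, and the class deliberately does not implement the pandas ExtensionArray interface; it only shows the low-level operations working on raw fixed-width records while the high-level representation is looked up by name. It uses the `V<n>` void dtype rather than `S<n>` as an assumption, since `S<n>` strips trailing NUL bytes.

```python
import ipaddress
import uuid

import numpy as np

# Hypothetical registry: kind -> (width in bytes, encode, decode).
REPRESENTATIONS = {}


def register(kind, width, encode, decode):
    """Register how a high-level object maps to and from `width` raw bytes."""
    REPRESENTATIONS[kind] = (width, encode, decode)


register("uuid", 16, lambda u: u.bytes, lambda b: uuid.UUID(bytes=b))
register("ipv6", 16, lambda a: a.packed, lambda b: ipaddress.IPv6Address(b))


class FixedWidthArray:
    """Toy container for fixed-width records (not a pandas ExtensionArray)."""

    def __init__(self, data, kind):
        self.data = data  # contiguous NumPy array of dtype V<width>
        self.kind = kind

    @classmethod
    def from_objects(cls, values, kind):
        width, encode, _ = REPRESENTATIONS[kind]
        raw = b"".join(encode(v) for v in values)
        return cls(np.frombuffer(raw, dtype=f"V{width}").copy(), kind)

    def take(self, indices):
        # Operates on the raw records only, so it is the same for every kind.
        idx = np.asarray(indices, dtype=np.intp)
        return FixedWidthArray(self.data[idx], self.kind)

    def to_objects(self):
        _, _, decode = REPRESENTATIONS[self.kind]
        return [decode(rec.tobytes()) for rec in self.data]


addrs = [ipaddress.IPv6Address("::1"), ipaddress.IPv6Address("2001:db8::")]
arr = FixedWidthArray.from_objects(addrs, "ipv6")
reordered = arr.take([1, 0]).to_objects()
print(reordered)
```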
Tom, could you say how you see this project evolving? Is it essentially "done" as it is today, with IPv4 and IPv6, or is it a place where similar extension arrays can be added, as semi-standard additions to the Pandas ecosystem?
Thanks
Stephen
We’d also be interested in that. One question: Why did you go with cc @ivirshup