Handle geo types #11
Comments
Revising this to just
|
I can close this for now since I found a good workaround. Instead of using |
Sorry, I'll reopen the issue :) I like the proposal to add options for ignoring unknown types, or converting them to text. It has a backwards compatibility problem: when I add support for a new type, your existing workflow can break because the type of some columns has changed. I guess it's not a big deal if we don't make it the default and add a warning to the documentation. I think it is a good idea to support the GeoParquet extension, although it looks quite complex, so no promises it will happen anytime soon. I already wanted to add support for points, paths and polygons and serialize them as Parquet structs, but Postgres only tells me the column type is |
I just came here to ask for the same :) Thank you for your efforts. It would be nice to have a flag to cast unknown types as string. |
@mahmut-spark Great! Are you looking specifically for postgis geography types or other currently unsupported data types? I have recently found a way to extract the exact column type (i.e. What kind of output schema would you prefer? The possibilities I'm aware of are:
Unfortunately, GeoParquet (currently) only supports geography types at the root of the schema. PostGIS definitely allows you to have them in custom types and arrays, meaning that we might have to produce an invalid GeoParquet if the database uses this. |
I’m curious what you found to get the specific geometry type… |
My use case is PostGIS -> Parquet -> DuckDB. I think it will be fine even if the geometry values are cast to text. That being said, I gathered these. Hopefully they will be helpful:
I couldn't find any information on custom geometry types. But arrays yes, and I confirmed from pgadmin as well: And geoparquet seems to expect the format as:
Given these circumstances, I suppose it makes sense to (if possible) use GeoParquet:
What do you think? |
The information is available in the system catalog pg_attribute. This query will list all attributes with the full types: However, it's a bit more complicated to get this working with arbitrary queries, but the basic info should be extractable from Since it isn't really needed for WKB/WKT output, I think we could have geo support first. I guess the other main blocker is being able to reliably test against Postgres with extensions 😅 , I'll have a look at setting up proper CI
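A minimal sketch of the kind of catalog lookup described above, not the exact query from this comment. It assumes `format_type(atttypid, atttypmod)` is used to render the full type, which for PostGIS columns gives strings such as `geometry(Point,4326)`. The names `FULL_TYPES_SQL`, `FullType` and `parse_full_type` are hypothetical and only for illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumed query shape: one row per column of ordinary tables,
# with the type rendered including its modifiers.
FULL_TYPES_SQL = """
SELECT c.relname, a.attname,
       format_type(a.atttypid, a.atttypmod) AS full_type
FROM pg_attribute a
JOIN pg_class c ON c.oid = a.attrelid
WHERE a.attnum > 0 AND NOT a.attisdropped AND c.relkind = 'r'
ORDER BY c.relname, a.attnum
"""


class FullType(NamedTuple):
    base: str               # e.g. "geometry", "text"
    subtype: Optional[str]  # e.g. "Point" (geo types only)
    srid: Optional[int]     # e.g. 4326 (geo types only)
    is_array: bool


_TYPE_RE = re.compile(
    r"^(?P<base>[\w. ]+?)(?:\((?P<args>[^)]*)\))?(?P<arr>(\[\])*)$"
)


def parse_full_type(full_type: str) -> FullType:
    """Split a format_type() string into base type, geo subtype, SRID."""
    m = _TYPE_RE.match(full_type.strip())
    if m is None:
        raise ValueError(f"unrecognized type string: {full_type!r}")
    # Drop an optional schema prefix such as "public.".
    base = m.group("base").strip().rsplit(".", 1)[-1]
    args = [a.strip() for a in (m.group("args") or "").split(",") if a.strip()]
    subtype: Optional[str] = None
    srid: Optional[int] = None
    if base in ("geometry", "geography") and args:
        subtype = args[0]
        if len(args) > 1:
            srid = int(args[1])
    return FullType(base, subtype, srid, bool(m.group("arr")))
```

The query string would be run through whatever Postgres client is at hand; the parser covers the array case mentioned earlier in the thread (`geometry(Point,4326)[]`).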
Thank you for the detailed info! I'm still not sure about the correct defaults, but I find it important to support WKB, WKT, and maybe the structs in the future, so I'll probably just add an option and make it required for now. |
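To make the WKB/WKT choice concrete, here is a small sketch of what the two encodings look like for a single 2D point. It is an illustration only, not how this tool serializes anything. The function names are hypothetical, and the binary form is plain OGC WKB without an SRID (PostGIS's EWKB variant carries extra flags that are not shown here).

```python
import struct


def point_to_wkt(x: float, y: float) -> str:
    """Text form: human-readable, larger, needs parsing on read."""
    return f"POINT ({x!r} {y!r})"


def point_to_wkb(x: float, y: float) -> bytes:
    """Binary form: 1 byte order flag, uint32 geometry type, two doubles.

    Byte order 1 = little-endian, geometry type 1 = Point.
    """
    return struct.pack("<BIdd", 1, 1, x, y)


wkt = point_to_wkt(1.0, 2.0)
wkb = point_to_wkb(1.0, 2.0)
print(wkt)        # POINT (1.0 2.0)
print(wkb.hex())  # 0101000000000000000000f03f0000000000000040
```

WKB is the more compact of the two (21 bytes for a point) and is the encoding GeoParquet uses for geometry columns, while WKT is easier to inspect by eye. That trade-off is one reason a required option could make more sense than a default.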
Hi, this is my personal git. If it would be helpful, I can put together a Docker Compose setup for you that extends PostGIS. Please let me know. |
I'd like to use this to query a PostGIS database that has geometry types. I see there is a GeoParquet format although I haven't looked too deeply into how well these types match one another. Alternatively, serializing geometry types as strings would also be fine for now.