h3_polygon_to_cells: Handling of invalid geometry: 5 point geometry = 2,364,092 hex_ids #926

Open
jmealo opened this issue Oct 10, 2024 · 0 comments

jmealo commented Oct 10, 2024

Hello,

Some invalid geometries have made it into our dataset and are wreaking a bit of havoc in our H3 indexes.

  • h3_polygon_to_cells in the latest version of h3 returns 2,364,092 hex_ids for one of them.
  • These invalid geometries have 5 points; two of the points appear multiple times.
  • The geometry starts and ends at the same point.
  • We're using the latest h3-pg, which uses h3 4.x.
  • The ST_MakeValid() function turns these into a bow-tie shape.
  • ST_IsValid() returns false.
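
For illustration, here is a minimal pure-Python sketch of why the ring in the test case below fails validation: two of its non-adjacent edges cross, which is the bow-tie pattern. It is not how PostGIS or H3 checks validity; the helper names (`orientation`, `segments_cross`, `self_intersections`) are hypothetical, and the check treats coordinates as planar and only looks for proper crossings.

```python
from itertools import combinations

# Exterior ring from the test case, as (lon, lat) pairs; first point == last point.
RING = [
    (-148.5, 29.1),
    (-148.5, 63.9),
    (-72.5, 29.1),
    (-72.5, 63.9),
    (-148.5, 29.1),
]


def orientation(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1 left turn, -1 right turn, 0 collinear."""
    val = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (val > 0) - (val < 0)


def segments_cross(a, b, c, d):
    """True if segment ab and segment cd cross at a point interior to both."""
    return (
        orientation(a, b, c) * orientation(a, b, d) < 0
        and orientation(c, d, a) * orientation(c, d, b) < 0
    )


def self_intersections(ring):
    """Return index pairs (i, j) of non-adjacent ring edges that cross each other."""
    edges = list(zip(ring, ring[1:]))
    n = len(edges)
    crossings = []
    for i, j in combinations(range(n), 2):
        # Skip neighbouring edges, including the first/last pair that share the closing point.
        if j - i == 1 or (i == 0 and j == n - 1):
            continue
        if segments_cross(*edges[i], *edges[j]):
            crossings.append((i, j))
    return crossings


print(self_intersections(RING))
```

A non-empty result means the ring self-intersects, so a pre-check along these lines (or simply filtering on ST_IsValid() in SQL) could be used to reject such inputs before they reach h3_polygon_to_cells.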

I'm not sure if h3 should handle any of this or if we should always validate geometry before calling into it. Any thoughts?

Additional context:
We're using a flood-fill strategy to convert polygons to H3 cells (the results are similar to Snowflake's h3_coverage). On invalid inputs, this strategy produces 50% fewer hex_ids than h3_polygon_to_cells. That was still far too many, so before we found the root cause, we decided to test our implementation against h3_polygon_to_cells. We were very surprised to see that it returned double the hex_ids.

Test case:

{
  "type": "Polygon",
  "coordinates": [
    [
      [-148.5, 29.1],
      [-148.5, 63.9],
      [-72.5, 29.1],
      [-72.5, 63.9],
      [-148.5, 29.1]
    ]
  ]
}

WITH h3_cells AS (
  SELECT h3_polygon_to_cells(
    ST_GeomFromGeoJSON('{
      "type": "Polygon",
      "coordinates": [
        [
          [-148.5, 29.1],
          [-148.5, 63.9],
          [-72.5, 29.1],
          [-72.5, 63.9],
          [-148.5, 29.1]
        ]
      ]
    }'),
    7
  ) AS cells
)
SELECT COUNT(1)
FROM h3_cells;
-- count = 2,364,092
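
As a rough sanity check on the magnitude of that count, the sketch below estimates how many resolution-7 cells would fit in the bounding box of the test ring. It is a back-of-the-envelope calculation, not a description of what h3_polygon_to_cells does internally. The assumptions: a spherical Earth with mean radius 6371.0088 km, and an average resolution-7 hexagon area of about 5.161 km² taken from the H3 resolution table. The function name `bbox_area_km2` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088          # assumed mean Earth radius
AVG_RES7_CELL_AREA_KM2 = 5.1612932   # assumed average res-7 hexagon area (H3 resolution table)
REPORTED_CELLS = 2_364_092           # count reported by the query above


def bbox_area_km2(west, south, east, north):
    """Area of a lon/lat bounding box on a sphere: R^2 * dlon * (sin(north) - sin(south))."""
    dlon = math.radians(east - west)
    band = math.sin(math.radians(north)) - math.sin(math.radians(south))
    return EARTH_RADIUS_KM ** 2 * dlon * band


# Bounding box of the test ring: 76 degrees of longitude by 34.8 degrees of latitude.
bbox_cells = bbox_area_km2(-148.5, 29.1, -72.5, 63.9) / AVG_RES7_CELL_AREA_KM2

print(f"estimated res-7 cells in bounding box: {bbox_cells:,.0f}")
print(f"reported count / bounding-box estimate: {REPORTED_CELLS / bbox_cells:.2f}")
```

Under these assumptions the bounding box alone holds on the order of four million resolution-7 cells, and the reported count is roughly half of that. This suggests the size of the result may be driven mostly by the sheer extent of the coordinates rather than by the number of points in the ring.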
@jmealo jmealo changed the title h3_polygon_to_cells: Handling of invalid geometry: 5 point polygon = 2.3 million hex_ids h3_polygon_to_cells: Handling of invalid geometry: 5 point polygon = 2,364,092 hex_ids Oct 10, 2024
@jmealo jmealo changed the title h3_polygon_to_cells: Handling of invalid geometry: 5 point polygon = 2,364,092 hex_ids h3_polygon_to_cells: Handling of invalid geometry: 5 point geometry = 2,364,092 hex_ids Oct 10, 2024