Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Job splitters should retain original geometries #153

Merged
merged 3 commits into from
Sep 2, 2024

Conversation

VincentVerelst
Copy link
Collaborator

@kvantricht, updated both split_job_hex and split_job_s2grid so that they now:

  • Retain the CRS of the original geodataframe
  • Retain the geometries of the original geodataframe
  • Use Web Mercator instead of lat lon to calculate the centroids, so that the warnings are gone

Also added unit tests for both.

Copy link
Collaborator

@kvantricht kvantricht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one remark on the speed of reprojecting s2 grid.


# Dataset containing all the S2 tiles, find the nearest S2 tile for each point
s2_grid = load_s2_grid()
s2_grid = s2_grid.to_crs(epsg=3857)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this not a slow step for the entire s2 grid?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VincentVerelst Maybe if this is a speed issue you could upload a second s2 grid in the artifactory with the CRS reprojected, and then change the load_s2_grid() function in gfmap to accept a parameter web_mercator: bool = False

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be worth the effort indeed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion, I'll already update that.

Do have to note that the largest bottleneck is typically not the CRS conversion of the s2 grid, but rather the CRS conversion of the dataframe to split itself.

For example, for 2018_BEL_LPIS-Flanders_POLY_110 the old job_splitter takes 10s to run, while the new one takes 20s.

@GriffinBabe GriffinBabe merged commit 1110e4a into main Sep 2, 2024
2 checks passed
@GriffinBabe GriffinBabe deleted the job-splitter-centroids branch September 2, 2024 11:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants