-
Notifications
You must be signed in to change notification settings - Fork 200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
speed up Relate/Contains with an rstar backed edge set intersector #829
Conversation
2b45341
to
207226b
Compare
I might be missing something, but what's up with |
Could we have a user-friendly API for that? I.e. loading a geometry into an R-tree, in order to be able to reuse it. |
|
Copied the "SimpleEdgeSetIntersector" into a new struct. I'll add the actual RTree implementation in a follow up, and hopefully this way it'll be easier to compare the two implementations.
If we want to rely on using an RTree for some of our operations (like the Relate trait) we have to ensure our numeric types are RTree compatible. == Alternative considered Since RTreeNum isn't necessarily a float, we could instead add these new bounds to GeoNum instead of GeoFloat. However, doing so would mean dropping support for unsigned ints from GeoNum. Note that using unsigned ints now, while supported, can easily lead to underflow if you're using one of the many operations that involve subtraction. It would also put one more barrier between ever getting BigDecimal support in geo - which is not Bounded. Also, apparently Float isn't necessarily Signed, but having never personally encountered unsigned floating point in the wild, I don't have strong feelings about retaining support for it. And since Relate doesn't current support non-floats, this would be a cost with no benefit. If that changes, we could reconsider this decision, or perhaps add the required behavior to some derivative type, like one of the HasKernel implementations.
207226b
to
68a4c40
Compare
Thank you for catching this @lnicola. I've amended the commit to remove it. |
I think @urschrei's proposal for some kind of pre-indexed geometry makes sense for this. I'm happy to work on that unless someone else is planning on it. I'd prefer it to be a followup PR though, as I feel this one stands as a net win on its own. |
Yeah, of course, no need to block this one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
bors r=rmanoka |
Build succeeded: |
CHANGES.md
if knowledge of this change could be valuable to users.Fixes #649 (includes some good context too)
Please note that I'm changing the bounds on
GeoFloat
to be compatible withRTreeNum
. I think it's unlikely that anyone is using unbounded or unsigned Floats, but let me know if you think otherwise.Perf Highlights:
Larger overlapping geometries can hugely benefit over the naive O(n2)version — for example an ~80x speedup in these two:
But not everything is faster. In particular, small geometries don't benefit much from the lower
O(nlg(n))
, while still paying the tax of loading the RTree.The JTS test suite has a bunch of tests, but they're almost all comparing small geometries with other small geometries:
Perhaps the worst case is comparing a big geometry to a small one. These used to be pretty fast due to a small
n
, but now, because we have to pay the tax of loading the big geometry into the RTree, but only perform a small number of queries against it, it results in a ~5x loss:Despite some of the regression with small geometries, I think this change is likely to be a big win for the kinds of operations people are likely to do in the real world. But I wanted to include these benched regressions to show some of the tradeoffs and opportunities for future work.
Full bench output