Add `PreparedGeometry` to speed up repeated `Relate` operations. #1197
This looks very promising. I'm trying this out in https://github.com/gauteh/roaring-landmask/blob/georust-prepared/src/shapes.rs#L22, but I'm not able to share the `PreparedGeometry` instances between threads (through pyclass). I've also put the Polygons of the MultiPolygon in an RTree (not sure whether that is still necessary?).
I don't think you need to do this…
Much of the cost of the `Relate` operation comes from finding all intersections between all segments of the two geometries. To speed this up, the edges of each geometry are put into an R-Tree. We also have to "self node" each geometry, meaning we need to split any segments at the points where they intersect other segments of the same geometry. None of that work is new to this commit, but what is new is that we now cache the self-noding work and the geometry's R-Tree.

    relate prepared polygons    time: [49.036 ms 49.199 ms 49.373 ms]
    relate unprepared polygons  time: [842.91 ms 844.32 ms 845.82 ms]
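The "prepare once, query many times" idea described above can be sketched with a toy example. This is only an illustration under stated assumptions, not the geo crate's implementation: `PreparedLine`, `Bbox`, and `candidate_pairs` are hypothetical names, and a sorted list of bounding boxes stands in for the cached R-Tree.

```rust
// Toy illustration only: NOT the geo crate's implementation.
// A "prepared" value builds its index once and reuses it for every query.

/// Axis-aligned bounding box of one segment.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Bbox {
    min: (f64, f64),
    max: (f64, f64),
}

impl Bbox {
    fn of_segment(a: (f64, f64), b: (f64, f64)) -> Self {
        Bbox {
            min: (a.0.min(b.0), a.1.min(b.1)),
            max: (a.0.max(b.0), a.1.max(b.1)),
        }
    }

    fn overlaps(&self, other: &Bbox) -> bool {
        self.min.0 <= other.max.0
            && other.min.0 <= self.max.0
            && self.min.1 <= other.max.1
            && other.min.1 <= self.max.1
    }
}

/// Hypothetical stand-in for a prepared geometry: it owns a cached index.
/// Here the index is a list of boxes sorted by minimum x; a real
/// implementation would use an R-Tree instead.
struct PreparedLine {
    boxes: Vec<Bbox>,
}

impl PreparedLine {
    /// The expensive step (building and sorting the index) happens once.
    fn new(points: &[(f64, f64)]) -> Self {
        let mut boxes: Vec<Bbox> = points
            .windows(2)
            .map(|w| Bbox::of_segment(w[0], w[1]))
            .collect();
        boxes.sort_by(|a, b| a.min.0.total_cmp(&b.min.0));
        PreparedLine { boxes }
    }

    /// Counts segment pairs whose boxes overlap, reusing both cached indexes.
    fn candidate_pairs(&self, other: &PreparedLine) -> usize {
        let mut count = 0;
        for a in &self.boxes {
            for b in &other.boxes {
                if b.min.0 > a.max.0 {
                    break; // sorted by min x: no later box can overlap `a`
                }
                if a.overlaps(b) {
                    count += 1;
                }
            }
        }
        count
    }
}
```

Constructing a `PreparedLine` once and calling `candidate_pairs` against many other lines pays the indexing cost a single time, which is the same trade-off the benchmark numbers above reflect.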
I've tried to change all the spots with RefCell (which can't just be put in an Arc). Eventually that will require a mut GeometryGraph, which will prevent parallelism anyway. So far I've only found swap_labels that requires a mutable GeometryGraph. Maybe the best solution is to add another Label (https://github.com/georust/geo/pull/1197/files#diff-b64c206f2b566eef304be445dd47e93d22b59dd11b986e715b8152a367e862b1L22) for the other direction? Or some method on labels that swaps/reverses them only for the operation, without changing the object. I assume the latter is not so good, since they are already cached for a reason.
PreparedGeometry is not Send because GeometryGraph is not Send. It seems like something worth addressing, but arguably out of scope for this PR.
Thanks for taking a look at this. swap_labels only happens during construction - so there shouldn't be an issue here with mutating anything since no other thread could have a reference to the mutable thing we've just constructed. Right?
I am totally unfamiliar with the code; however, from the docs it seems that swap_labels is done depending on whether you do a.relate(b) or b.relate(a), which is not known in advance?
On the other hand, it is weird that this worked before this PR? Maybe it is not related.
Correct, but the whole point of PreparedGeometry is to keep that entity around for re-use. With this PR it works in a single-threaded context. I'd propose that we leave that as the bar for now and follow up with "share a prepared geometry across threads" as future work, since this PR is already pretty large. Plus, that way we can have a focused study of the performance tradeoffs needed (e.g. Rc vs Arc).
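For readers following the Rc vs Arc point: `Rc<T>` is never `Send`, and `Arc<T>` is only `Send` when `T` is both `Send` and `Sync`, which rules out `Arc<RefCell<_>>`. The sketch below shows the shape that does work, an immutable cache behind an `Arc` queried through `&self`. It is a minimal illustration, and `CachedIndex` and `count_hits_parallel` are hypothetical names, not types from this PR.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a prepared geometry's cached data.
/// It holds no Rc or RefCell, so it is automatically Send + Sync.
struct CachedIndex {
    sorted_xs: Vec<i64>,
}

impl CachedIndex {
    /// Build the cache once (the expensive step).
    fn new(mut xs: Vec<i64>) -> Self {
        xs.sort_unstable();
        CachedIndex { sorted_xs: xs }
    }

    /// Read-only query: takes &self, so many threads may call it at once.
    fn contains(&self, x: i64) -> bool {
        self.sorted_xs.binary_search(&x).is_ok()
    }
}

/// Runs each chunk of queries on its own thread against one shared cache
/// and returns the total number of hits.
fn count_hits_parallel(index: Arc<CachedIndex>, queries: Vec<Vec<i64>>) -> usize {
    let handles: Vec<_> = queries
        .into_iter()
        .map(|chunk| {
            let index = Arc::clone(&index);
            thread::spawn(move || chunk.iter().filter(|&&x| index.contains(x)).count())
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .sum()
}
```

The atomic reference count in `Arc` costs more than the plain counter in `Rc`, which is the kind of tradeoff the comment above suggests measuring separately.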
whoops - meant to leave these comments to aid review when I first opened.
            nodes: NodeMap::new(),
            isolated_edges: vec![],
            line_intersector: RobustLineIntersector::new(),
        }
    }

    pub(crate) fn compute_intersection_matrix(&mut self) -> IntersectionMatrix {
        let mut intersection_matrix = IntersectionMatrix::empty();
moved this into the `empty_disjoint` constructor.
    @@ -293,44 +285,6 @@ where
        }
    }

    /// If the Geometries are disjoint, we need to enter their dimension and boundary dimension in
    /// the `Outside` rows in the IM
    fn compute_disjoint_intersection_matrix(&self, intersection_matrix: &mut IntersectionMatrix) {
moved this to the `IntersectionMatrix::compute_disjoint` method.
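As a rough illustration of what a disjoint matrix contains, the sketch below fills a DE-9IM-style 3x3 matrix for two geometries that do not touch: only the cells involving an exterior are non-empty. This is a toy under stated assumptions, not the crate's code. `ToyMatrix`, `Dim`, and the `(interior, boundary)` tuple arguments are hypothetical; only the name `empty_disjoint` is borrowed from the comments above.

```rust
/// Dimension of one matrix cell: empty, or a 0-, 1-, or 2-dimensional set.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Dim {
    Empty,
    Zero,
    One,
    Two,
}

const INTERIOR: usize = 0;
const BOUNDARY: usize = 1;
const EXTERIOR: usize = 2;

/// Toy 3x3 matrix; rows index parts of geometry A, columns parts of B.
struct ToyMatrix([[Dim; 3]; 3]);

impl ToyMatrix {
    /// Builds the matrix for two disjoint geometries, each described by its
    /// (interior dimension, boundary dimension).
    fn empty_disjoint(a: (Dim, Dim), b: (Dim, Dim)) -> Self {
        let mut m = [[Dim::Empty; 3]; 3];
        // A lies entirely in B's exterior.
        m[INTERIOR][EXTERIOR] = a.0;
        m[BOUNDARY][EXTERIOR] = a.1;
        // B lies entirely in A's exterior.
        m[EXTERIOR][INTERIOR] = b.0;
        m[EXTERIOR][BOUNDARY] = b.1;
        // The two exteriors always share a 2-D region of the plane.
        m[EXTERIOR][EXTERIOR] = Dim::Two;
        ToyMatrix(m)
    }

    /// Formats the matrix as the usual nine-character DE-9IM string.
    fn to_de9im(&self) -> String {
        self.0
            .iter()
            .flatten()
            .map(|d| match d {
                Dim::Empty => 'F',
                Dim::Zero => '0',
                Dim::One => '1',
                Dim::Two => '2',
            })
            .collect()
    }
}
```

For a polygon (interior 2-D, boundary 1-D) and a point (interior 0-D, empty boundary), `ToyMatrix::empty_disjoint((Dim::Two, Dim::One), (Dim::Zero, Dim::Empty)).to_de9im()` yields `FF2FF10F2`.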
            };
        }

    macro_rules! cartesian_pairs_helper {
I got rid of this awful sophisticated 🧐 macro and now the relate impl is generic - anything that implements `Relate` can now relate to anything else that implements `Relate`.
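The general pattern of replacing a macro that stamps out one impl per pair of types with a single generic method can be sketched as follows. This is a simplified illustration, not the PR's actual trait: `ToyRelate`, `edge_count`, `pair_count`, `ToyLine`, and `ToyTriangle` are hypothetical names.

```rust
/// Toy trait: each type supplies its own intermediate data (here just an
/// edge count), and the pairwise operation is written once as a provided
/// method instead of once per (A, B) combination.
trait ToyRelate {
    /// Required: the per-type piece of work.
    fn edge_count(&self) -> usize;

    /// Provided: works against any other implementor, so no macro is needed
    /// to enumerate the cartesian product of types.
    fn pair_count(&self, other: &impl ToyRelate) -> usize {
        self.edge_count() * other.edge_count()
    }
}

struct ToyLine {
    points: Vec<(f64, f64)>,
}

struct ToyTriangle;

impl ToyRelate for ToyLine {
    fn edge_count(&self) -> usize {
        // n points form n - 1 segments; an empty line has none.
        self.points.len().saturating_sub(1)
    }
}

impl ToyRelate for ToyTriangle {
    fn edge_count(&self) -> usize {
        3
    }
}
```

Adding a new type then takes one `impl ToyRelate` block, and it can immediately be paired with every existing implementor.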
        }
    }

    struct Segment<'a, F: GeoFloat + rstar::RTreeNum> {
I moved `Segment` to its own file.
Absolutely! I've been chipping away at moving to just &mut, don't think we need the Arc. There are a couple of cases where it's not possible to exactly replicate the behavior, but I suspect those cases may be not totally correct (or at least redundant). I'll put it in a separate PR for reference (whether it is usable or not).
I added an entry to `CHANGES.md` if knowledge of this change could be valuable to users.

Fixes #803
Much of the cost of the `Relate` operation comes from finding all intersections between all segments of the two geometries.

To speed this up, the edges of each geometry are put into an R-Tree.

We also have to "self node" each geometry, meaning we need to split any segments at the point that they'd intersect with themselves.

None of that work is new to this commit, but what is new is that we now cache the self-noding work and the geometry's R-Tree.