rewrite evaluable loops using Loop base class #861
989e798 to c9623e6
81dc9c5 to b07f60d
70e7795 to aaf54c1
To distinguish inner subgraphs from outer subgraphs, this patch chooses a background color based on the number of parent subgraphs.
`Evaluable`s like `Eig` and `LoopConcatenateCombined` return tuples of arrays. Dependencies use `ArrayFromTuple` to fetch one item of the tuple. For readability `ArrayFromTuple` is hidden from the graph. Currently this is done by looking for a `_node_tuple` method, implemented by `Eig` and `LoopConcatenateCombined`, which returns a tuple of nodes. This patch introduces a `graph.TupleNode` with publicly accessible item nodes, which can be used to merge `Evaluable._node_tuple` into `Evaluable._node`.
This patch replaces `Evaluable._node_tuple` with `Evaluable._node` returning a `TupleNode`.
154b968 to 007ecc3
Computing the integer bounds can be expensive. For example, a scalar array that does not depend on arguments will be simplified and evaluated, regardless of how expensive this evaluation is and disregarding the `_intbounds_impl`. Currently the `ArrayFromTuple` class takes integer bounds as arguments. When combining loops, the integer bounds for the loops are evaluated, the loops are combined and the evaluated integer bounds are passed to `ArrayFromTuple`, even if the bounds are never used. To avoid evaluating the bounds unnecessarily, this patch introduces the `_intbounds_tuple` property for `Evaluable`s that return tuples and replaces the integer bounds argument of `ArrayFromTuple` with an `_intbounds_impl` that queries the `_intbounds_tuple`.
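The idea of deferring the bounds can be illustrated with a minimal sketch. This is not the nutils implementation: `TupleProducer`, `ItemView`, `intbounds_tuple` and `ncalls` are hypothetical names that only mimic the roles of a tuple-returning `Evaluable` and of `ArrayFromTuple`.

```python
import functools


class TupleProducer:
    """Hypothetical stand-in for an evaluable that returns a tuple of arrays."""

    def __init__(self, bound_fns):
        # One zero-argument callable per tuple item; each may be expensive.
        self._bound_fns = tuple(bound_fns)
        self.ncalls = 0  # counts how often the bounds are actually computed

    @functools.cached_property
    def intbounds_tuple(self):
        # Computed at most once, and only when somebody asks for it.
        self.ncalls += 1
        return tuple(fn() for fn in self._bound_fns)


class ItemView:
    """Hypothetical stand-in for ArrayFromTuple: selects one tuple item."""

    def __init__(self, producer, index):
        self.producer = producer
        self.index = index

    @property
    def intbounds(self):
        # Delegate to the producer instead of storing precomputed bounds.
        return self.producer.intbounds_tuple[self.index]


producer = TupleProducer([lambda: (0, 9), lambda: (-3, 3)])
views = [ItemView(producer, i) for i in range(2)]
calls_before = producer.ncalls  # constructing the views computes nothing
bounds = views[1].intbounds     # first access triggers the computation
```

Because the view holds only a reference to the producer and an index, constructing it costs nothing; the bounds are computed on first access and shared by all views afterwards.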
007ecc3 to 9f18db3
In a follow-up commit the `LoopSum` will gain support for parallel evaluation, which makes the order of summation non-deterministic. This causes a unit test to fail. This patch relaxes the tolerance for the test.
9f18db3 to d93e6a4
nutils/evaluable.py
Outdated
for i, loop in enumerate(loops):
kwargs = {}
replacements[loop] = ArrayFromTuple(combined, i, loop.shape, loop.dtype)
self = util.shallow_replace(replacements.get, self)
Can't we wait till the end to apply all replacements at once?
No, I don't think so. `shallow_replace` stops at the shallowest replacement, so we'd miss replacements that are a dependency of the shallowest replacement.
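A toy illustration of this behaviour, assuming expressions are nested tuples. The `shallow_replace` below is a simplified stand-in written for this example, not `nutils.util.shallow_replace`, and the `inner`/`outer` expressions are made up.

```python
def shallow_replace(replace, node):
    """Toy version: once a node is replaced, do not descend any further."""
    new = replace(node)
    if new is not None:
        return new
    if isinstance(node, tuple):
        return tuple(shallow_replace(replace, child) for child in node)
    return node


inner = ('loop', 'x')
outer = ('loop', inner)
expr = ('add', outer, inner)

# The replacement for `outer` still depends on the original `inner`.
replacements = {outer: ('item', inner), inner: 'INNER'}

once = shallow_replace(replacements.get, expr)
twice = shallow_replace(replacements.get, once)
```

After a single pass the `inner` nested below the replaced `outer` is still present in `once`; only the second pass catches it.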
nutils/evaluable.py
Outdated
loop, = loops
combined = loop._combine_loops(loop.body_arg._loop_deps - loop._loop_deps)
if combined != loop:
self = util.shallow_replace(lambda func: combined if func == loop else None, self)
Same here via `replacements`?
d93e6a4 to 9adefa7
Currently there are 2.5 implementations for evaluable loops: `LoopSum`, `LoopConcatenate` and `LoopConcatenateCombined`. The first loop is the most common one (before sparsification). The second appears after sparsification. The last loop is created only during the 'optimize for numpy' stage and is a combination of several concatenates in a single loop for performance reasons. This patch introduces a base class for evaluable loops and rewrites `LoopSum` and `LoopConcatenate` as implementations of the base class. The base class requires two methods to be implemented: one for initializing the output value and one for updating the output value for each iteration. Due to the generic nature of the base class, the method for updating the output value is guarded with a multiprocessing lock if the loop is evaluated in parallel, even when this is not necessary, e.g. for `LoopConcatenate`. To minimize the impact of locking, the lock used to increment the loop index is reused for updating the output value. In addition, this patch replaces `LoopConcatenateCombined` with `_LoopTuple`, which supports any combination of loop evaluables, not just `LoopConcatenate`.
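The structure described above can be sketched roughly as follows. This is a simplified illustration, not the code in `nutils/evaluable.py`: it uses threads and plain lists instead of processes and arrays, and the names `init_output`, `update_output`, `evaluate` and `nworkers` are assumptions made for this example.

```python
import threading


class Loop:
    """Hypothetical base class: subclasses supply init_output and update_output."""

    def __init__(self, length, body):
        self.length = length  # number of iterations
        self.body = body      # callable mapping a loop index to a value

    def init_output(self):
        raise NotImplementedError

    def update_output(self, output, index, value):
        raise NotImplementedError

    def evaluate(self, nworkers=1):
        output = self.init_output()
        indices = iter(range(self.length))
        lock = threading.Lock()

        def worker():
            index = value = None
            while True:
                # One lock acquisition both stores the previous result and
                # fetches the next loop index.
                with lock:
                    if index is not None:
                        self.update_output(output, index, value)
                    index = next(indices, None)
                if index is None:
                    return
                value = self.body(index)  # evaluated outside the lock

        threads = [threading.Thread(target=worker) for _ in range(nworkers)]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        return output


class LoopSum(Loop):
    def init_output(self):
        return [0]  # one-element list so the total can be updated in place

    def update_output(self, output, index, value):
        output[0] += value


class LoopConcatenate(Loop):
    def __init__(self, length, body, offsets):
        super().__init__(length, body)
        self.offsets = offsets  # offsets[i]:offsets[i+1] is the slot of iteration i

    def init_output(self):
        return [None] * self.offsets[-1]

    def update_output(self, output, index, value):
        output[self.offsets[index]:self.offsets[index + 1]] = value


total = LoopSum(5, lambda i: i * i).evaluate(nworkers=3)[0]
concat = LoopConcatenate(3, lambda i: [i] * (i + 1), [0, 1, 3, 6]).evaluate(nworkers=2)
```

In this sketch the concatenation writes to disjoint slices and would not strictly need the lock, which mirrors the trade-off mentioned in the commit message: the generic base class locks every update, but only once per iteration.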
The shape of `LoopConcatenate` is used in the loop initialization to allocate the output array. The length of the concatenation axis is the result of concatenating and summing (`_SizesToOffset`) the lengths of the same axis of the concatenation value. Under certain conditions the concatenation length can be simplified. For this to work, the concatenation length must be a constructor argument of `LoopConcatenate`, otherwise the length is not picked up by `deep_replace_property`. For the same reason the entire shape of `LoopConcatenate` was passed as a constructor argument. Since the shape of an array is very likely already optimized (the same is true for `LoopConcatenate.shape`), this patch removes the shape from the constructor of `LoopConcatenate`, except for the concatenation axis.
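As a plain-Python illustration of how the concatenation length follows from the per-iteration lengths, consider the sketch below. The helper `sizes_to_offsets` is a hypothetical name and only mirrors the idea behind `_SizesToOffset`; the constant-size case is one assumed example of a condition under which the length simplifies.

```python
from itertools import accumulate


def sizes_to_offsets(sizes):
    """Return the start offset of every block plus the total length as last item."""
    return [0, *accumulate(sizes)]


offsets = sizes_to_offsets([2, 3, 1])
total_length = offsets[-1]  # length of the concatenation axis

# If every iteration contributes the same number of items, the total
# reduces to a simple product and no cumulative sum is needed.
constant_total = sizes_to_offsets([4] * 3)[-1]
```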
9adefa7 to de07c59
This PR merges the common parts of the evaluable loops into a base class.