-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat[next][dace]: Dace fieldview transformations #1594
feat[next][dace]: Dace fieldview transformations #1594
Conversation
sdfg=sdfg, | ||
) | ||
|
||
def _rewire_map_scope( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this function doesn't need to be a method because it doesn't use self
, it should be declared as a module-level function outside of the class. Check here: https://google.github.io/styleguide/pyguide.html#217-function-and-method-decorators
Never use staticmethod unless forced to in order to integrate with an API defined in an existing library. Write a module-level function instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right it does not need to be a method. However, making it a free function feels wrong to me, as the functionality is now split, as some parts are implemented in the class and one (super important one) is a free function. For me this does not make any sense.
For me a class group stuff together and now you tear it apart. It would be different, if the free function could potentially be used somewhere else, but the function is super specific.
|
||
return (independent_nodes, dependent_nodes) | ||
|
||
def classify_node( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this function doesn't need to be a method because it doesn't use self
, it should be declared as a module-level function outside of the class. Check here: https://google.github.io/styleguide/pyguide.html#217-function-and-method-decorators
Never use staticmethod unless forced to in order to integrate with an API defined in an existing library. Write a module-level function instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my point above regarding _rewire_map_scope()
. However, there is an inconsistency it should be private.
src/gt4py/next/program_processors/runners/dace_fieldview/transformations/loop_blocking.py
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/transformations/util.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Final round of comments. Please merge main
back in the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed the ADR page.
- [Note 2]: It is allowed for an _intrastate_ transformation to act in a way that allows state fusion by later intrastate transformations. | ||
- [Note 3]: The DaCe simplification pass violates this rule, for that reason this pass must always be called on its own, see also rule 2. | ||
|
||
2. It is invalid to call the simplification pass directly, i.e. the usage of `SDFG.simplify()` is not allowed. The only valid way to call _simplify()_ is to call the `gt_simplify()` function provided by GT4Py. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose to apply this rule in the current PR; I mean you can replace the call to simplify
, currently used in build_sdfg_from_gtir
, with the gt_simplify
method. Alternatively, hopefully soon when we merge #1623, we can remove the call to simplify
and instead call the optimization workflow from DaCeTranslator
class (see src/gt4py/next/program_processors/runners/dace_fieldview/workflow.py)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The file you mentioned does not seem to exists. But I have removed all occurrences of simplify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It exists in the open PR. Ok, we can call the optimization in the next PR.
docs/development/ADRs/0018-Canonical_SDFG_in_GT4Py_Transformations.md
Outdated
Show resolved
Hide resolved
- [Note]: To prevent some issues caused by the violation of rule 4 by _simplify()_, this set is extended with the transient sink nodes and all scalars. | ||
Excess interstate transients, that will be kept alive that way, will be removed by later calls to _simplify()_. | ||
|
||
10. Every AccessNode within a map scope must refer to a data descriptor whose lifetime must be `dace.dtypes.AllocationLifetime.Scope` and its storage class should _preferably_ be `dace.dtypes.StorageType.Register`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the storage_type
: I always use default storage while building the SDFG. I believe it is responsibility of the transformations to set the optimal storage_type
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is why the word preferably is used.
If you set it to default then it will be turned to register anyway.
The main point of this rule, it to ensure that no temporary variable that is used inside a map is stored in global memory.
I changed the wording to also allow default storage location.
docs/development/ADRs/0018-Canonical_SDFG_in_GT4Py_Transformations.md
Outdated
Show resolved
Hide resolved
docs/development/ADRs/0018-Canonical_SDFG_in_GT4Py_Transformations.md
Outdated
Show resolved
Hide resolved
docs/development/ADRs/0018-Canonical_SDFG_in_GT4Py_Transformations.md
Outdated
Show resolved
Hide resolved
…ions.md Co-authored-by: Enrique González Paredes <[email protected]>
cscs-ci run default |
It looks like the tests you were trying to fix are still failing... |
The error is the DaCe loading code. Let's use only general names to see what is happening.
Also added a helper function.
Initial PR for the Optimization pipeline of the DaCe backend in GT4Py. The PR is very basic, but contains the basis for later advancement. Essentially, it defines the `gt_auto_optimize()` function which performs optimization on a basic heuristic. There are some parts missing, such as: - There is no way to guide the optimization, i.e. a decision making process that determine if a map should be fused or not. - There is also no transformation which fuses parallel maps (was not needed for this kernel). --------- Co-authored-by: Edoardo Paone <[email protected]> Co-authored-by: Philip Mueller <[email protected]> Co-authored-by: Enrique González Paredes <[email protected]>
The [initial version](#1594) of the optimization pipeline only contained a rough draft. Currently this PR contains a copy of the map fusion transformations from DaCe that are currently under [review](spcl/dace#1629). As soon as that PR is merged and DaCe was updated in GT4Py these files will be deleted. This PR collects some general improvements: - [x] More liberal `LoopBlocking` transformation (with tests). - [x] Incorporate `MapFusionParallel` - [x] Using of `can_be_applied_to()` as soon as DaCe is updated (`TrivialGPUMapElimination`, `SerialMapPromoter`). - [x] Looking at strides that the Lowering generates. (Partly done) However, it still uses MapFusion implementation that ships with GT4Py and not the one in DaCe. Note: Because of commit 60e4226 this PR must be merged after [PR1768](#1768).
Initial PR for the Optimization pipeline of the DaCe backend in GT4Py.
The PR is very basic, but contains the basis for later advancement.
Essentially, it defines the
gt_auto_optimize()
function which performs optimization on a basic heuristic.There are some parts missing, such as: