Complete Transition to Control Flow Regions (#1676)
This PR completes the transition to hierarchical control flow regions in
DaCe. Because that transition touches large parts of the codebase, this PR
is substantial. An exhaustive listing of all adaptations is not feasible,
but the most important changes and adaptations are listed below:

- [x] Change the default of the Python frontend to generate SDFGs using
experimental CFG blocks. A subsequent PR will remove the option to _not_
use experimental CFG blocks entirely; this was left to a separate PR to
avoid growing this one even further.
- [x] The option to write a pass or transformation that is _not_
compatible with experimental blocks has been removed, forcing new
transformations and passes to consider them in their design.
- [x] Simplify loop-related transformations by adopting explicit loop
regions.
- [x] Add a new pass base type, `ControlFlowRegionPass`: This pass works
like `StatePass` or `ScopePass`, and can be extended to write a pass
that applies recursively to each control flow region of an SDFG. An
option can be set to either apply bottom-up or top-down.
- [x] A pass has been added to dead code elimination to prune empty or
falsy conditional branches.
- [x] Include a control flow raising pass in the simplification
pipeline, ensuring that even SDFGs generated without the explicit use of
experimental blocks are raised to the new style SDFGs.
- [x] Adapt all passes and transformations currently in main DaCe to
work with SDFGs containing experimental CFG blocks.
- [x] Almost all transformations and analyses now _expect_ that
experimental blocks are used for regular / reducible control flow. As a
result, some control flow analyses have been removed, which improves the
overall performance and reliability of DaCe and eliminates redundancy.
- [x] Ensure all compiler backends correctly handle experimental blocks.
- [x] Move state propagation into a separate pass that uses experimental
blocks. Legacy state propagation has been left in for now, including the
tests that ensure it works as intended, to avoid making this PR even
larger. It is planned for removal in a subsequent PR.
- [x] A block fusion pass has been added to the simplification pipeline.
This operates similarly to `StateFusion`, but fuses no-op general control
flow blocks (empty states or control flow blocks) with other control
flow blocks. This further reduces the number of nodes and edges in CFGs.
- [x] Numerous bugfixes with respect to experimental blocks and analyses
around them, thanks to the ability to now run the entire CI pipeline
with them.
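
The bottom-up versus top-down option described for `ControlFlowRegionPass` can be illustrated with a minimal, self-contained sketch. This is not the DaCe implementation: `Region` and `apply_to_regions` are hypothetical names used only for illustration, and the real pass base class may order or filter regions differently.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Region:
    """Hypothetical stand-in for a control flow region; not a DaCe class."""
    name: str
    children: List["Region"] = field(default_factory=list)


def apply_to_regions(region: Region, visit: Callable[[Region], None], top_down: bool = False) -> None:
    """Call ``visit`` on ``region`` and on every region nested inside it.

    With ``top_down=True`` a region is visited before its children;
    otherwise the children are visited first (bottom-up).
    """
    if top_down:
        visit(region)
    for child in region.children:
        apply_to_regions(child, visit, top_down)
    if not top_down:
        visit(region)


# Toy hierarchy: a root graph holding a loop (which holds a branch) and a tail region.
root = Region("sdfg", [Region("loop", [Region("branch")]), Region("tail")])

bottom_up: List[str] = []
apply_to_regions(root, lambda r: bottom_up.append(r.name))

top_down: List[str] = []
apply_to_regions(root, lambda r: top_down.append(r.name), top_down=True)
```

Bottom-up order suits rewrites that should see already-processed inner regions (for example, removing an inner region before deciding whether the outer one is empty), while top-down order suits rewrites where an outer decision makes inner work unnecessary.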

Note: The FV3 integration test fails and will continue to fail with this
PR, since GT4Py cartesian, which is used by PyFV3, does not consider
experimental blocks in its design. Since DaCe v1.0.0 will be released
_without_ this PR in it, my suggestion is to limit the FV3 integration
and regression tests to PRs made against a specific v1.0.0 maintenance
branch, which is used for fixes to v1.0.0.

---------

Co-authored-by: Philip Mueller <[email protected]>
Co-authored-by: Tal Ben-Nun <[email protected]>
3 people authored Dec 12, 2024
1 parent 4976e16 commit 896a1e1
Showing 131 changed files with 4,288 additions and 2,745 deletions.
11 changes: 8 additions & 3 deletions .github/workflows/pyFV3-ci.yml
@@ -1,12 +1,17 @@
name: NASA/NOAA pyFV3 repository build test

# Temporarily disabled for main, and instead applied to a specific DaCe v1 maintenance branch (v1/maintenance). Once
# the FV3 bridge has been adapted to DaCe v1, this will need to be reverted back to apply to main.
on:
push:
branches: [ main, ci-fix ]
#branches: [ main, ci-fix ]
branches: [ v1/maintenance, ci-fix ]
pull_request:
branches: [ main, ci-fix ]
#branches: [ main, ci-fix ]
branches: [ v1/maintenance, ci-fix ]
merge_group:
branches: [ main, ci-fix ]
#branches: [ main, ci-fix ]
branches: [ v1/maintenance, ci-fix ]

defaults:
run:
2 changes: 2 additions & 0 deletions .gitignore
@@ -141,6 +141,8 @@ src.VC.VC.opendb

# DaCe
.dacecache/
# Ignore dacecache if added as a symlink
.dacecache
out.sdfg
*.out
results.log
67 changes: 42 additions & 25 deletions dace/codegen/control_flow.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Copyright 2019-2021 ETH Zurich and the DaCe authors. All rights reserved.
# Copyright 2019-2024 ETH Zurich and the DaCe authors. All rights reserved.
"""
Various classes to facilitate the code generation of structured control
flow elements (e.g., ``for``, ``if``, ``while``) from state machines in SDFGs.
@@ -62,8 +62,8 @@
import sympy as sp
from dace import dtypes
from dace.sdfg.analysis import cfg as cfg_analysis
from dace.sdfg.state import (BreakBlock, ConditionalBlock, ContinueBlock, ControlFlowBlock, ControlFlowRegion, LoopRegion,
ReturnBlock, SDFGState)
from dace.sdfg.state import (BreakBlock, ConditionalBlock, ContinueBlock, ControlFlowBlock, ControlFlowRegion,
LoopRegion, ReturnBlock, SDFGState)
from dace.sdfg.sdfg import SDFG, InterstateEdge
from dace.sdfg.graph import Edge
from dace.properties import CodeBlock
@@ -200,7 +200,10 @@ class BreakCFBlock(ControlFlow):
block: BreakBlock

def as_cpp(self, codegen, symbols) -> str:
return 'break;\n'
cfg = self.block.parent_graph
expr = '__state_{}_{}:;\n'.format(cfg.cfg_id, self.block.label)
expr += 'break;\n'
return expr

@property
def first_block(self) -> BreakBlock:
@@ -214,7 +217,10 @@ class ContinueCFBlock(ControlFlow):
block: ContinueBlock

def as_cpp(self, codegen, symbols) -> str:
return 'continue;\n'
cfg = self.block.parent_graph
expr = '__state_{}_{}:;\n'.format(cfg.cfg_id, self.block.label)
expr += 'continue;\n'
return expr

@property
def first_block(self) -> ContinueBlock:
@@ -228,7 +234,10 @@ class ReturnCFBlock(ControlFlow):
block: ReturnBlock

def as_cpp(self, codegen, symbols) -> str:
return 'return;\n'
cfg = self.block.parent_graph
expr = '__state_{}_{}:;\n'.format(cfg.cfg_id, self.block.label)
expr += 'return;\n'
return expr

@property
def first_block(self) -> ReturnBlock:
@@ -316,7 +325,13 @@ def as_cpp(self, codegen, symbols) -> str:
# One unconditional edge
if (len(out_edges) == 1 and out_edges[0].data.is_unconditional()):
continue
expr += f'goto __state_exit_{sdfg.cfg_id};\n'
if self.region:
expr += f'goto __state_exit_{self.region.cfg_id};\n'
else:
expr += f'goto __state_exit_{sdfg.cfg_id};\n'

if self.region and not isinstance(self.region, SDFG):
expr += f'__state_exit_{self.region.cfg_id}:;\n'

return expr

@@ -536,10 +551,14 @@ def as_cpp(self, codegen, symbols) -> str:
expr = ''

if self.loop.update_statement and self.loop.init_statement and self.loop.loop_variable:
init = unparse_interstate_edge(self.loop.init_statement.code[0], sdfg, codegen=codegen, symbols=symbols)
lsyms = {}
lsyms.update(symbols)
if codegen.dispatcher.defined_vars.has(self.loop.loop_variable) and not self.loop.loop_variable in lsyms:
lsyms[self.loop.loop_variable] = codegen.dispatcher.defined_vars.get(self.loop.loop_variable)[1]
init = unparse_interstate_edge(self.loop.init_statement.code[0], sdfg, codegen=codegen, symbols=lsyms)
init = init.strip(';')

update = unparse_interstate_edge(self.loop.update_statement.code[0], sdfg, codegen=codegen, symbols=symbols)
update = unparse_interstate_edge(self.loop.update_statement.code[0], sdfg, codegen=codegen, symbols=lsyms)
update = update.strip(';')

if self.loop.inverted:
@@ -571,6 +590,8 @@ def as_cpp(self, codegen, symbols) -> str:
expr += _clean_loop_body(self.body.as_cpp(codegen, symbols))
expr += '\n}\n'

expr += f'__state_exit_{self.loop.cfg_id}:;\n'

return expr

@property
@@ -1018,21 +1039,16 @@ def _structured_control_flow_traversal_with_regions(cfg: ControlFlowRegion,
start: Optional[ControlFlowBlock] = None,
stop: Optional[ControlFlowBlock] = None,
generate_children_of: Optional[ControlFlowBlock] = None,
branch_merges: Optional[Dict[ControlFlowBlock,
ControlFlowBlock]] = None,
ptree: Optional[Dict[ControlFlowBlock, ControlFlowBlock]] = None,
visited: Optional[Set[ControlFlowBlock]] = None):
if branch_merges is None:
branch_merges = cfg_analysis.branch_merges(cfg)

if ptree is None:
ptree = cfg_analysis.block_parent_tree(cfg, with_loops=False)

start = start if start is not None else cfg.start_block

def make_empty_block():
def make_empty_block(region):
return GeneralBlock(dispatch_state, parent_block,
last_block=False, region=None, elements=[], gotos_to_ignore=[],
last_block=False, region=region, elements=[], gotos_to_ignore=[],
gotos_to_break=[], gotos_to_continue=[], assignments_to_ignore=[], sequential=True)

# Traverse states in custom order
@@ -1059,18 +1075,18 @@ def make_empty_block():
cfg_block = GeneralConditionalScope(dispatch_state, parent_block, False, node, [])
for cond, branch in node.branches:
if branch is not None:
body = make_empty_block()
body = make_empty_block(branch)
body.parent = cfg_block
_structured_control_flow_traversal_with_regions(branch, dispatch_state, body)
cfg_block.branch_bodies.append((cond, body))
elif isinstance(node, ControlFlowRegion):
if isinstance(node, LoopRegion):
body = make_empty_block()
body = make_empty_block(node)
cfg_block = GeneralLoopScope(dispatch_state, parent_block, False, node, body)
body.parent = cfg_block
_structured_control_flow_traversal_with_regions(node, dispatch_state, body)
else:
cfg_block = make_empty_block()
cfg_block = make_empty_block(node)
cfg_block.region = node
_structured_control_flow_traversal_with_regions(node, dispatch_state, cfg_block)

@@ -1095,13 +1111,14 @@ def make_empty_block():
return visited - {stop}


def structured_control_flow_tree_with_regions(sdfg: SDFG, dispatch_state: Callable[[SDFGState], str]) -> ControlFlow:
def structured_control_flow_tree_with_regions(cfg: ControlFlowRegion,
dispatch_state: Callable[[SDFGState], str]) -> ControlFlow:
"""
Returns a structured control-flow tree (i.e., with constructs such as branches and loops) from an SDFG based on the
Returns a structured control-flow tree (i.e., with constructs such as branches and loops) from a CFG based on the
control flow regions it contains.
:param sdfg: The SDFG to iterate over.
:return: Control-flow block representing the entire SDFG.
:param cfg: The graph to iterate over.
:return: Control-flow block representing the entire graph.
"""
root_block = GeneralBlock(dispatch_state=dispatch_state,
parent=None,
@@ -1113,7 +1130,7 @@ def structured_control_flow_tree_with_regions(sdfg: SDFG, dispatch_state: Callab
gotos_to_break=[],
assignments_to_ignore=[],
sequential=True)
_structured_control_flow_traversal_with_regions(sdfg, dispatch_state, root_block)
_structured_control_flow_traversal_with_regions(cfg, dispatch_state, root_block)
_reset_block_parents(root_block)
return root_block

@@ -1127,7 +1144,7 @@ def structured_control_flow_tree(sdfg: SDFG, dispatch_state: Callable[[SDFGState
:param sdfg: The SDFG to iterate over.
:return: Control-flow block representing the entire SDFG.
"""
if sdfg.root_sdfg.using_experimental_blocks:
if sdfg.root_sdfg.using_explicit_control_flow:
return structured_control_flow_tree_with_regions(sdfg, dispatch_state)

# Avoid import loops
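
The `BreakCFBlock`, `ContinueCFBlock`, and `ReturnCFBlock` hunks above all follow the same pattern: emit a `__state_<cfg_id>_<label>:;` label and then the jump statement, presumably so that a generated `goto` targeting the block has a label to land on. A minimal sketch of that string construction, assuming only what the diff shows (the helper name `labeled_jump` is hypothetical and not part of DaCe):

```python
def labeled_jump(cfg_id: int, block_label: str, keyword: str) -> str:
    """Return C++ text consisting of a goto label followed by a jump statement.

    Mirrors the string built in the as_cpp methods of the diff; `keyword`
    is one of 'break', 'continue', or 'return'.
    """
    if keyword not in ("break", "continue", "return"):
        raise ValueError(f"unsupported jump keyword: {keyword!r}")
    # Label first, so any goto that targets this block resolves to a valid location.
    expr = '__state_{}_{}:;\n'.format(cfg_id, block_label)
    expr += keyword + ';\n'
    return expr


# Example with made-up identifiers: region id 3, block labeled "exit_early".
snippet = labeled_jump(3, "exit_early", "break")
```

The trailing `;` after the label forms an empty statement, so the label stays valid regardless of what the following generated code starts with.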
66 changes: 34 additions & 32 deletions dace/codegen/instrumentation/data/data_dump.py
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
# Copyright 2019-2022 ETH Zurich and the DaCe authors. All rights reserved.
from dace import config, data as dt, dtypes, registry, SDFG
# Copyright 2019-2024 ETH Zurich and the DaCe authors. All rights reserved.
from dace import data as dt, dtypes, registry, SDFG
from dace.sdfg import nodes, is_devicelevel_gpu
from dace.codegen.prettycode import CodeIOStream
from dace.codegen.instrumentation.provider import InstrumentationProvider
from dace.sdfg.scope import is_devicelevel_fpga
from dace.sdfg.state import SDFGState
from dace.sdfg.state import ControlFlowRegion, SDFGState
from dace.codegen import common
from dace.codegen import cppunparse
from dace.codegen.targets import cpp
@@ -101,7 +101,8 @@ def on_sdfg_end(self, sdfg: SDFG, local_stream: CodeIOStream, global_stream: Cod
if sdfg.parent is None:
sdfg.append_exit_code('delete __state->serializer;\n')

def on_state_begin(self, sdfg: SDFG, state: SDFGState, local_stream: CodeIOStream, global_stream: CodeIOStream):
def on_state_begin(self, sdfg: SDFG, cfg: ControlFlowRegion, state: SDFGState, local_stream: CodeIOStream,
global_stream: CodeIOStream):
if state.symbol_instrument == dtypes.DataInstrumentationType.No_Instrumentation:
return

@@ -119,17 +120,17 @@ def on_state_begin(self, sdfg: SDFG, state: SDFGState, local_stream: CodeIOStrea
condition_preamble = f'if ({cond_string})' + ' {'
condition_postamble = '}'

state_id = sdfg.node_id(state)
local_stream.write(condition_preamble, sdfg, state_id)
state_id = cfg.node_id(state)
local_stream.write(condition_preamble, cfg, state_id)
defined_symbols = state.defined_symbols()
for sym, _ in defined_symbols.items():
local_stream.write(
f'__state->serializer->save_symbol("{sym}", "{state_id}", {cpp.sym2cpp(sym)});\n', sdfg, state_id
f'__state->serializer->save_symbol("{sym}", "{state_id}", {cpp.sym2cpp(sym)});\n', cfg, state_id
)
local_stream.write(condition_postamble, sdfg, state_id)
local_stream.write(condition_postamble, cfg, state_id)

def on_node_end(self, sdfg: SDFG, state: SDFGState, node: nodes.AccessNode, outer_stream: CodeIOStream,
inner_stream: CodeIOStream, global_stream: CodeIOStream):
def on_node_end(self, sdfg: SDFG, cfg: ControlFlowRegion, state: SDFGState, node: nodes.AccessNode,
outer_stream: CodeIOStream, inner_stream: CodeIOStream, global_stream: CodeIOStream):
from dace.codegen.dispatcher import DefinedType # Avoid import loop

if is_devicelevel_gpu(sdfg, state, node) or is_devicelevel_fpga(sdfg, state, node):
@@ -159,9 +160,9 @@ def on_node_end(self, sdfg: SDFG, state: SDFGState, node: nodes.AccessNode, oute
ptrname = '&' + ptrname

# Create UUID
state_id = sdfg.node_id(state)
state_id = cfg.node_id(state)
node_id = state.node_id(node)
uuid = f'{sdfg.cfg_id}_{state_id}_{node_id}'
uuid = f'{cfg.cfg_id}_{state_id}_{node_id}'

# Get optional pre/postamble for instrumenting device data
preamble, postamble = '', ''
@@ -174,13 +175,13 @@ def on_node_end(self, sdfg: SDFG, state: SDFGState, node: nodes.AccessNode, oute
strides = ', '.join(cpp.sym2cpp(s) for s in desc.strides)

# Write code
inner_stream.write(condition_preamble, sdfg, state_id, node_id)
inner_stream.write(preamble, sdfg, state_id, node_id)
inner_stream.write(condition_preamble, cfg, state_id, node_id)
inner_stream.write(preamble, cfg, state_id, node_id)
inner_stream.write(
f'__state->serializer->save({ptrname}, {cpp.sym2cpp(desc.total_size - desc.start_offset)}, '
f'"{node.data}", "{uuid}", {shape}, {strides});\n', sdfg, state_id, node_id)
inner_stream.write(postamble, sdfg, state_id, node_id)
inner_stream.write(condition_postamble, sdfg, state_id, node_id)
f'"{node.data}", "{uuid}", {shape}, {strides});\n', cfg, state_id, node_id)
inner_stream.write(postamble, cfg, state_id, node_id)
inner_stream.write(condition_postamble, cfg, state_id, node_id)


@registry.autoregister_params(type=dtypes.DataInstrumentationType.Restore)
@@ -216,7 +217,8 @@ def on_sdfg_end(self, sdfg: SDFG, local_stream: CodeIOStream, global_stream: Cod
if sdfg.parent is None:
sdfg.append_exit_code('delete __state->serializer;\n')

def on_state_begin(self, sdfg: SDFG, state: SDFGState, local_stream: CodeIOStream, global_stream: CodeIOStream):
def on_state_begin(self, sdfg: SDFG, cfg: ControlFlowRegion, state: SDFGState, local_stream: CodeIOStream,
global_stream: CodeIOStream):
if state.symbol_instrument == dtypes.DataInstrumentationType.No_Instrumentation:
return

@@ -234,18 +236,18 @@ def on_state_begin(self, sdfg: SDFG, state: SDFGState, local_stream: CodeIOStrea
condition_preamble = f'if ({cond_string})' + ' {'
condition_postamble = '}'

state_id = sdfg.node_id(state)
local_stream.write(condition_preamble, sdfg, state_id)
state_id = state.block_id
local_stream.write(condition_preamble, cfg, state_id)
defined_symbols = state.defined_symbols()
for sym, sym_type in defined_symbols.items():
local_stream.write(
f'{cpp.sym2cpp(sym)} = __state->serializer->restore_symbol<{sym_type.ctype}>("{sym}", "{state_id}");\n',
sdfg, state_id
cfg, state_id
)
local_stream.write(condition_postamble, sdfg, state_id)
local_stream.write(condition_postamble, cfg, state_id)

def on_node_begin(self, sdfg: SDFG, state: SDFGState, node: nodes.AccessNode, outer_stream: CodeIOStream,
inner_stream: CodeIOStream, global_stream: CodeIOStream):
def on_node_begin(self, sdfg: SDFG, cfg: ControlFlowRegion, state: SDFGState, node: nodes.AccessNode,
outer_stream: CodeIOStream, inner_stream: CodeIOStream, global_stream: CodeIOStream):
from dace.codegen.dispatcher import DefinedType # Avoid import loop

if is_devicelevel_gpu(sdfg, state, node) or is_devicelevel_fpga(sdfg, state, node):
@@ -275,21 +277,21 @@ def on_node_begin(self, sdfg: SDFG, state: SDFGState, node: nodes.AccessNode, ou
ptrname = '&' + ptrname

# Create UUID
state_id = sdfg.node_id(state)
state_id = cfg.node_id(state)
node_id = state.node_id(node)
uuid = f'{sdfg.cfg_id}_{state_id}_{node_id}'
uuid = f'{cfg.cfg_id}_{state_id}_{node_id}'

# Get optional pre/postamble for instrumenting device data
preamble, postamble = '', ''
if desc.storage == dtypes.StorageType.GPU_Global:
self._setup_gpu_runtime(sdfg, global_stream)
self._setup_gpu_runtime(cfg, global_stream)
preamble, postamble, ptrname = self._generate_copy_to_device(node, desc, ptrname)

# Write code
inner_stream.write(condition_preamble, sdfg, state_id, node_id)
inner_stream.write(preamble, sdfg, state_id, node_id)
inner_stream.write(condition_preamble, cfg, state_id, node_id)
inner_stream.write(preamble, cfg, state_id, node_id)
inner_stream.write(
f'__state->serializer->restore({ptrname}, {cpp.sym2cpp(desc.total_size - desc.start_offset)}, '
f'"{node.data}", "{uuid}");\n', sdfg, state_id, node_id)
inner_stream.write(postamble, sdfg, state_id, node_id)
inner_stream.write(condition_postamble, sdfg, state_id, node_id)
f'"{node.data}", "{uuid}");\n', cfg, state_id, node_id)
inner_stream.write(postamble, cfg, state_id, node_id)
inner_stream.write(condition_postamble, cfg, state_id, node_id)