[Roadmap] TVMScript Frontend #471

Open
7 tasks
Hzfengsy opened this issue Sep 10, 2021 · 12 comments

@Hzfengsy
Member

Hzfengsy commented Sep 10, 2021

  1. Support all kinds of nodes
    • WhileNode
    • BufferRealizeNode
    • ProducerLoadNode
    • ProducerStoreNode
    • ProducerRealizeNode
    • BlockNode (without BlockRealize)
    • AnyNode
  2. Support fragment printing
@junrushao
Member

junrushao commented Sep 10, 2021

Namespace and Tooling-Friendliness

This subsection is based on @yzh119's proposal #420 #426.

Pain points

  • P1. No python auto-completion support
  • P2. Usually conflicts with pylint
  • P3. APIs are scattered across namespaces like tvm.script, tvm.tir, tvm.script.ty
  • P4. Somewhat non-trivial to understand at first glance what the decorator generates

Here is an example of how pylint complains about the things above:
image

Proposal

  • A1. Use tvm.script as the “root” namespace for all TVM script related stuff
  • A2. Use tvm.script.tir for TIR, and idiomatically import it as T, like Keras is usually imported as K
  • A3. Use tvm.script.relax for Relax, and idiomatically import it as R
  • A4. To be consistent with the names of their resulting types, use
    • tvm.script.IRModule for IRModule
    • T.PrimFunc for tir.PrimFunc
    • R.Function for relax.Function

With the proposal above, we are able to provide type stubs, so that TVM scripts work well with linting and auto-completion.

Here is an example of the proposed syntax:

import tvm
from tvm.script import tir as T
# ^ there is a broadly accepted precedent for this in the Python community: from keras import backend as K

@tvm.script.IRModule                                                   # so it generates an IRModule
class Module:
  @T.PrimFunc                                                          # it generates a PrimFunc
  def func(a: T.handle, b: T.handle, c: T.handle) -> None:
    A = T.match_buffer(a, [128, 128], dtype="float32")                 # stub provided for tvm.script.tir.match_buffer
    B = T.match_buffer(b, [128, 128], dtype="float32")
    C = T.match_buffer(c, [128, 128], dtype="float32")
    with T.block([128, 128, T.reduce_axis(0, 128)], "C") as [i, j, k]: # stub provided for tvm.script.tir.block
        C[i, j] = T.if_then_else(                                      # stub provided for tvm.script.tir.if_then_else
            i == 0 and j == 0 and k == 0,
            0.0,
            C[i, j] + A[i, k] * B[k, j],
            dtype="float32",
        )

>>> print(type(Module))
<class 'tvm.ir.module.IRModule'>

>>> print(type(Module["func"]))
<class 'tvm.tir.function.PrimFunc'>
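
To make point A4 concrete, here is a minimal sketch of why naming a decorator after its result type helps readers: the decorated name is rebound to an instance of that type. The classes `IRModule` and `PrimFunc` and the decorators `ir_module` and `prim_func` below are hypothetical stand-ins written for this illustration; they are not the TVM implementations and do no parsing.

```python
from typing import Callable, Dict


class PrimFunc:
    """Stand-in for a compiled function (hypothetical, not tvm.tir.PrimFunc)."""

    def __init__(self, name: str, source: Callable) -> None:
        self.name = name
        self.source = source


class IRModule:
    """Stand-in for a module of named functions (hypothetical, not tvm.IRModule)."""

    def __init__(self, functions: Dict[str, PrimFunc]) -> None:
        self.functions = functions

    def __getitem__(self, name: str) -> PrimFunc:
        return self.functions[name]


def prim_func(fn: Callable) -> PrimFunc:
    # The decorated name is rebound to a PrimFunc, not to a Python function.
    return PrimFunc(fn.__name__, fn)


def ir_module(cls: type) -> IRModule:
    # Collect every PrimFunc defined in the class body into one IRModule.
    funcs = {k: v for k, v in vars(cls).items() if isinstance(v, PrimFunc)}
    return IRModule(funcs)


@ir_module
class Module:
    @prim_func
    def func(a, b, c):
        pass


print(type(Module).__name__)
print(type(Module["func"]).__name__)
```

Because the return annotations of the decorators name the result types, a type checker or IDE can report that `Module` is an `IRModule` rather than a class, which is the tooling benefit the proposal is after.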

@junrushao
Member

junrushao commented Sep 10, 2021

Block and block bindings: Proposal B0

Pain points

  • B1. Trivial bindings
  • B2. Block's iter domain duplicates with outer loops' loop domain
  • B3. Auto-complete is "too" automatic

Proposal

Here is the philosophy behind the proposed design

  • F1. Focus on the concept "iteration domain" of a block
  • F2. Minimize repetitive declaration for any trivial bindings
  • F3. Reduce the line width to improve readability

G1. The complete form

for i, j, k in T.grid(512, 512, 512):
  with T.block("C", iter_dom_ndim=3) as [vi, vj, vk]:
    T.iter_dom_dim(var=vi, type='S', dom=512, bind=i)
    T.iter_dom_dim(var=vj, type='S', dom=512, bind=j)
    T.iter_dom_dim(var=vk, type='R', dom=512, bind=k)
    T.reads(...)
    T.writes(...)

G2. With full trivial bindings

for i, j, k in T.grid(512, 512, 512):
  with T.block("C", iter_dom_ndim=3, trivial_bind="SSR") as [i, j, k]: # <= redefinition treated as binding
    T.reads(...)
    T.writes(...)

G3. With partial trivial bindings

for i, j, k in T.grid(512, 512, 512):
  with T.block("C", iter_dom_ndim=3, trivial_bind=".SR") as [vi, j, k]:
    T.iter_dom_dim(var=vi, type='S', dom=512, bind=i)
    T.reads(...)
    T.writes(...)
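
One way to read the `trivial_bind` string in G2 and G3 is as a per-axis pattern: `S` or `R` binds the block var at that position to the loop var at the same position, and `.` leaves that position for an explicit `T.iter_dom_dim` declaration. The helper below is a sketch of that reading only; `expand_trivial_bind` is a hypothetical name and is not part of any TVM API.

```python
from typing import List, Optional, Sequence, Tuple

Binding = Tuple[Optional[str], Optional[str]]


def expand_trivial_bind(pattern: str, loop_vars: Sequence[str]) -> List[Binding]:
    """Return one (iter_type, bound_loop_var) pair per block axis.

    'S' / 'R': trivially bound to the loop var at the same position.
    '.':       not bound here; (None, None) marks an explicit declaration.
    """
    if len(pattern) != len(loop_vars):
        raise ValueError("pattern length must equal the number of loop vars")
    bindings: List[Binding] = []
    for kind, var in zip(pattern, loop_vars):
        if kind in ("S", "R"):
            bindings.append((kind, var))
        elif kind == ".":
            bindings.append((None, None))
        else:
            raise ValueError(f"unknown axis kind: {kind!r}")
    return bindings


full = expand_trivial_bind("SSR", ["i", "j", "k"])     # G2: all trivial
partial = expand_trivial_bind(".SR", ["i", "j", "k"])  # G3: first axis explicit
print(full)
print(partial)
```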

G4. No automatic loop induction

Generating loops on top of blocks looks a bit odd semantically, even though it could be explained with extra documentation. With our binding design, we don't actually need this powerful tool.

@junrushao
Member

#410

@tqchen
Contributor

tqchen commented Sep 11, 2021

It would be great to discuss a few candidates for blocks and block bindings. I labeled @junrushao1994's proposal as B0; let us also list the current definition and new proposals, so we have a clear basis for discussion.

@tqchen
Contributor

tqchen commented Sep 11, 2021

Block and block bindings: Proposal B1

Note that this form discards the desire of putting iterators on the block, but instead focuses
on getting some information right in the block body.

Complete Form

for i, j, k in T.grid(512, 512, 512):
  with T.block("C"):
    # the API name is subject to change
    vi = T.axis.S(512, i)
    vj = T.axis.S(512, j)
    vk = T.axis.R(512, k)
    T.reads(...)
    T.writes(...)

Note that the API name can change.

  • B1a: Do not mark the axes in the block, since we are breaking the assumption that the with block is relatively self-contained.
  • B1b: Use block_var = match_axis_pattern(domain, value) to represent the value mapping; this is consistent with our use of match_buffer.
  • B1c: The naming of match_axis_pattern is subject to change; there are a few choices here:
    • Simply encode the type in the function name
    • Encode the type in keyword arguments
    • Use a namespace to emphasize the kind (T.axis)
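
The three naming choices in B1c can be compared side by side. The sketch below is illustrative only: `BlockAxis`, `spatial_axis`, `reduce_axis` and the local `axis` namespace are hypothetical names invented here, and loop variables are represented as plain strings. Only `match_axis_pattern` and the `T.axis.S` / `T.axis.R` spellings come from the discussion above.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any


@dataclass(frozen=True)
class BlockAxis:
    kind: str   # "S" for spatial, "R" for reduction
    dom: int    # extent of the block iterator
    value: Any  # expression the iterator is bound to


# Choice 1: encode the type in the function name.
def spatial_axis(dom: int, value: Any) -> BlockAxis:
    return BlockAxis("S", dom, value)


def reduce_axis(dom: int, value: Any) -> BlockAxis:
    return BlockAxis("R", dom, value)


# Choice 2: encode the type in a keyword argument.
def match_axis_pattern(dom: int, value: Any, kind: str = "S") -> BlockAxis:
    return BlockAxis(kind, dom, value)


# Choice 3: use a namespace to emphasize the kind, as in T.axis.S / T.axis.R.
axis = SimpleNamespace(S=spatial_axis, R=reduce_axis)

vi = axis.S(512, "i")
vk = axis.R(512, "k")

# All three spellings describe the same binding.
print(vk == reduce_axis(512, "k") == match_axis_pattern(512, "k", kind="R"))
```

The namespace form keeps each call short while still grouping the axis kinds under one name that auto-completion can list.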

Allow auto-binding some vars

for i, j, k in T.grid(512, 512, 512):
  with T.block("C", map_axis=[i, j, k]):
       C[i, j] += A[i, k] * B[j, k]

Key design points:

  • B1d: The block contains a list of iterators that are passed as block_vars; they can be used directly in the body.
  • Naming is subject to change:
    • map_axis: inspired from memmap
    • auto_bind: automatically bind iterators

Another alternative (add mapping property declarations)

for i, j, k in T.grid(512, 512, 512):
  with T.block("C", map_spatial_axis=[i, j], map_reduce_axis=[k]):
       C[i, j] += A[i, k] * B[j, k]

Note on advanced constraints

As we extend to future iteration patterns, we might want to introduce additional constraints under which the iterators can no longer be declared separately. As a mock-up example, we might introduce a concept of axis group to declare a non-trivial interaction among three axes, which then need to be declared together. We need to think about how our convention extends to this case.

for i, j, k in T.grid(512, 512, 512):
  with T.block("C"):
    vi, vj, vk = T.sparse.axis_group(
        [512, 512, 512], "Dense,Sparse,Dense", [value0, value1, value2]
    )
    T.reads(...)
    T.writes(...)

@tqchen
Contributor

tqchen commented Sep 11, 2021

Block and block bindings: Proposal B2

This is the current form

Complete Form

for i, j, k in T.grid(512, 512, 512):
  with T.block("C", [512, 512, T.reduce_axis(512)]) as [vi, vj, vk]:
    # the API name is subject to change
    T.bind(vi, i)
    T.bind(vj, j)
    T.bind(vk, k)
    T.reads(...)
    T.writes(...)

Autobinding is implicit

  with T.block([512, 512, T.reduce_axis(512)]) as [vi, vj, vk]:
    C[vi, vj] += A[vi, vk] * B[vj, vk]

@Hzfengsy
Member Author

Thanks for the great discussion and proposals. Here are the two major points, in my opinion.

  1. Let users know there are block vars and bindings
  2. It would be great to keep the number of lines small, since one block may have more than 5 block vars in a conv2d workload.

Block and block bindings: Proposal B3

Complete Form

for i, j, k in T.grid(512, 512, 512):
    with T.block("C"):
        vi = T.axis.S(i, 512)
        vj = T.axis.S(j, 512)
        vk = T.axis.R(k, 512)
        T.reads(...)
        T.writes(...)

A Sugar for Complete Form

for i, j, k in T.grid(512, 512, 512):
    with T.block("C"):
        vi, vj, vk = T.iter([i, j, k], "SSR")
        T.reads(...)
        T.writes(...)

Auto binding

Not needed in this format

@junrushao
Member

junrushao commented Sep 13, 2021

Thanks @tqchen and @Hzfengsy for the proposals!

First of all, we seem to converge on the point that we don't want the with statement to carry all the block information, which can be overwhelming: imagine a conv2d with 3 spatial axes and 4 reduction axes, where it is unrealistic to put them all on a single line without causing confusion.

Block binding

On the syntax of a block binding, I list proposals B0, B1 and B3 below for detailed comparison:

# Syntax in B0
T.iter_dom_dim(var=vi, type='S', dom=512, bind=i)
# Syntax in B1
vi = T.axis.S(i, 512)
# Syntax in B3
vi = T.axis.S(512, i)

Both B1 and B3 treat bindings as assignments, which from my PoV is not a big problem and looks cleaner (PL folks might disagree). Also, both B1 and B3 seem to use standalone scoping for these bindings, which I feel is better than B0.

The difference between B1 and B3 is the order of arguments. I would prefer B3, which makes it easier for users to write a fragment where a Block exists without a BlockRealize.

One thing I am not so sure about is naming. As @Hzfengsy said, we would love the syntax itself to convey the design philosophy (let users know there are block vars and bindings), so I feel strongly that we should emphasize the concept "block domain", or "iteration domain of the block". Therefore I would like to propose the following:

# B4. The new proposal
vi = T.block_domain.S(domain=512, bind=i)

# In the doc, which pops up almost instantly in users' vscode/vim/other IDEs
# we can say this is shortcut for `T.block_domain.spatial_axis`

Auto-binding for Trivial Bindings

Looks like we have 3 different proposals here:

# Syntax in B0
for i, j, k in T.grid(512, 512, 512):
  with T.block("C", iter_dom_ndim=3, trivial_bind=".SR") as [vi, j, k]:  # <= redefinition treated as binding
    T.iter_dom_dim(var=vi, type='S', dom=512, bind=i)
    T.reads(...)
    T.writes(...)

# Syntax in B1
for i, j, k in T.grid(512, 512, 512):
  with T.block("C", map_axis=[i, j, k]):
       C[i, j] += A[i, k] * B[j, k]

# Syntax in B3
for i, j, k in T.grid(512, 512, 512):
    with T.block("C"):
        vi, vj, vk = T.iter([i, j, k], "SSR")
        T.reads(...)
        T.writes(...)

Below is my understanding:

  • The redefinition-as-trivial-binding semantics on B0 is admittedly sort of confusing and unpythonic;
  • B1 seems to introduce some interesting semantics which takes me quite a while to understand (specifically "map" in a "block");
  • B3 is the most natural way from my PoV which doesn't deviate from our previous binding definition.

Therefore, I would love to go with B3, with some minor naming changes to make sure our definition always focuses on one and only one concept: "block domain". Here is my new proposal, which focuses B3 on "block domain" and generalizes it a little bit:

# B4. The new proposal
for i, j, k in T.grid(512, 512, 512):
    with T.block("C"):
        vi, vj, vk = T.block_domain.many("SSR", [i, j, k])
        T.reads(...)
        T.writes(...)

for i, j, k in T.grid(512, 512, 512):
    with T.block("C"):
        vi, vj = T.block_domain.many(types="SS", binds=[i, j])
        vk = T.block_domain.many(types="R", binds=k + 1)  # <= can write arbitrary expression in binds
        T.reads(...)
        T.writes(...)
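
The two B4 examples above imply that `many` accepts either a list of bindings or a single expression, and returns either several block vars or just one. Below is a minimal sketch of that behavior under those assumptions. The free function `many` and the `(type, bind)` tuples are hypothetical stand-ins for `T.block_domain.many` and real block vars, and loop expressions are represented as strings.

```python
from typing import Any, List, Sequence, Tuple, Union

Axis = Tuple[str, Any]  # (iter type, bound expression); hypothetical record


def many(types: str, binds: Union[Any, Sequence[Any]]) -> Union[Axis, Tuple[Axis, ...]]:
    """Create one block iterator per character in `types`."""
    # Accept a bare expression as shorthand for a one-element list.
    if not isinstance(binds, (list, tuple)):
        binds = [binds]
    if len(types) != len(binds):
        raise ValueError("need exactly one binding per iterator type")
    for kind in types:
        if kind not in ("S", "R"):
            raise ValueError(f"unknown iterator type: {kind!r}")
    axes: List[Axis] = list(zip(types, binds))
    # A single iterator is returned unwrapped so that `vk = many(...)` works.
    return axes[0] if len(axes) == 1 else tuple(axes)


vi, vj = many(types="SS", binds=["i", "j"])
vk = many(types="R", binds="k + 1")  # arbitrary expression as the binding
print(vi, vj, vk)
```

Returning an unwrapped value for the single-iterator case matches the `vk = ...` spelling in the proposal, at the cost of a return type that depends on the length of `types`.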

@tqchen
Contributor

tqchen commented Sep 13, 2021

A bit more about naming: we do need to convey the concept of axis or iter var in some way.

To explain one possible confusion here.

block_domain.S can be interpreted as one kind of "domain", as if there were many block domains in a block. What we really want to say is that it is one iterator in the domain, and that all of the iterators together form a domain.

Another possible way to highlight the block could be (although I am not attached to it)

  • with block() as b: vi = b.axis.S

Refer to Block name explicitly: Proposal B5

for i, j, k in T.grid(512, 512, 512):
    # block is named as blockC
    with T.block() as blockC:
        vi = blockC.axis.S(512, i)
        vj = blockC.axis.S(512, j)
        vk = blockC.axis.R(512, k)
        blockC.reads(...)
        blockC.writes(...)

for i, j, k in T.grid(512, 512, 512):
    with T.block() as blockC:
        vi, vj, vk = blockC.axis.reuse("SSR", [i, j, k])
        blockC.reads(...)
        blockC.writes(...)

One potential drawback here is that the block name can be confused with the buffer name (if you want to name the block C directly).
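
For readers less familiar with context managers, the B5 form relies on `with ... as blockC` binding an object that then owns the axis declarations. The sketch below shows that mechanism only. The `Block` class, its `iter_vars` list and the tuple records are hypothetical illustrations, not TVM's implementation, and `reuse` stores `None` for the domain because this sketch has no loop extents to infer it from.

```python
from typing import Any, List, Optional, Sequence, Tuple

IterVar = Tuple[str, Optional[int], Any]  # (kind, dom, bound value); hypothetical


class _AxisNamespace:
    """Provides block.axis.S / block.axis.R / block.axis.reuse."""

    def __init__(self, block: "Block") -> None:
        self._block = block

    def S(self, dom: int, value: Any) -> IterVar:
        return self._block._add("S", dom, value)

    def R(self, dom: int, value: Any) -> IterVar:
        return self._block._add("R", dom, value)

    def reuse(self, types: str, binds: Sequence[Any]) -> Tuple[IterVar, ...]:
        if len(types) != len(binds):
            raise ValueError("need exactly one binding per iterator type")
        return tuple(self._block._add(t, None, b) for t, b in zip(types, binds))


class Block:
    def __init__(self) -> None:
        self.iter_vars: List[IterVar] = []
        self.axis = _AxisNamespace(self)

    def __enter__(self) -> "Block":
        return self  # this is the object bound by `as blockC`

    def __exit__(self, exc_type, exc, tb) -> bool:
        return False  # do not swallow exceptions

    def _add(self, kind: str, dom: Optional[int], value: Any) -> IterVar:
        record = (kind, dom, value)
        self.iter_vars.append(record)
        return record


with Block() as blockC:
    vi = blockC.axis.S(512, "i")
    vj = blockC.axis.S(512, "j")
    vk = blockC.axis.R(512, "k")

print([kind for kind, _, _ in blockC.iter_vars])
```

The sketch also makes the drawback visible: `blockC` is an ordinary Python variable, so naming the block `C` would rebind `C` and hide a buffer of the same name inside the block body.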

@junrushao
Member

junrushao commented Sep 13, 2021

The new with statement looks pretty good to me, thanks for this proposal!

On the naming: what about using “blockC.domain_axis.S” instead of “blockC.axis.S”? Because a block doesn’t have axes, but its iteration domain does

@junrushao
Member

CC: @zxybazh @shingjan

@tqchen
Contributor

tqchen commented Sep 14, 2021

The main limitation of B5 is that the block name can no longer be the same as the buffer name (which can be a common requirement). Considering this, we might still want to bring back the old style but keep the name block_axis.

for i, j, k in T.grid(512, 512, 512):
    # block is named "C"
    with T.block("C"):
        vi = T.block_axis.S(512, i)
        vj = T.block_axis.S(512, j)
        vk = T.block_axis.R(512, k)
        T.reads(...)
        T.writes(...)

for i, j, k in T.grid(512, 512, 512):
    with T.block():
        vi, vj, vk = T.block_axis.reuse("SSR", [i, j, k])
        T.reads(...)
        T.writes(...)
