-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SM68] Initial proposal for Wave Matrix #61
base: main
Are you sure you want to change the base?
Conversation
This proposal adds new WaveMatrix data types that support lane cooperative matrix multiplication. A preview implementation of this proposal is available as part of the DXC 1.8.2306 preview release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the slew of comments 😕
proposals/xxxx-wave-matrix.md
Outdated
|
||
These higher throughput matrix operations are required for optimal performance | ||
of many machine learning and image processing workloads. Adding native support | ||
to HLSL will enable high-performance matrix operations across all supported |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's minor, but I fear the notion of "supported" is vague here. It may be interpreted as all hardware that supports SM 6.8, which is probably not the case. My feeble attempt to recraft the sentence:
Adding support to HLSL will enable the high-performance of native matrix operations across all hardware with such support through Shader Model 6.8 drivers.
proposals/xxxx-wave-matrix.md
Outdated
WaveMatrix introduces new matrix templates to facilitate wave cooperative | ||
operations: | ||
|
||
```c++ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem to make a lick of difference in this case, but ```hlsl is an option in github:
WaveMatrixLeft <TYPE_IN, M, N> ; // M x K
WaveMatrixLeft <TYPE_IN, M, N> ; // M x K
Maybe someday we'll get proper syntax highlighting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI if you check the 3rd party grammars supported by Github's Linguist integration here, HLSL is listed as being supported by Tim's textmate grammar (so hlsl
is a valid fenced code block and we just need to update that grammar to update it as the language changes). I would actually suggest having an "official" Microsoft-maintained grammar file that you can update as the compiler updates if it isn't too much trouble. It'd be super helpful for HLSL tool maintainers to have a more "official" syntax highlighter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this code block the HLSL highlighting is probably okay, but GitHub’s HLSL syntax highlighting doesn’t handle HLSL 2021 particularly well. The C++ mode syntax highlighting does much better IMO.
As an example here’s HLSL highlighted:
namespace detail {
template <typename ElTy, int NRows, int NCols>
class WaveMatrixBase {
};
} // namespace detail
And the same code C++ highlighted:
namespace detail {
template <typename ElTy, int NRows, int NCols>
class WaveMatrixBase {
};
} // namespace detail
I’m fine changing the simpler code samples to be HLSL highlighted, but I’d greatly prefer to keep the ones that have templates C++-mode. I had used C++ everywhere for consistency, but that isn’t strictly necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I think the thing to do is to fork the C++ tmLanguage file as a base, adding HLSL specific constructs/keywords, and then swap it as the official highlighting grammar for github (instructions for doing that are here. It'd be a shame for this to be fixed later and then have a bunch of HLSL snippets throughout the github ecosystem be tagged as C++ snippets into perpetuity I think!
proposals/xxxx-wave-matrix.md
Outdated
All WaveMatrix objects have a `Fill` method of the form `void Fill(ElTy Value)` | ||
where `ElTy` is the element type. | ||
|
||
The `Fill` method fills the matrix or matrix fragment with the provided value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a more technical description would be "Assigns the given Value
to every element in the matrix or matrix fragment".
proposals/xxxx-wave-matrix.md
Outdated
All wave threads must provide the same value or the result is undefined. All | ||
WaveMatrix objects have the same `Fill` method with the same behavior. | ||
|
||
### WaveMatrix Matrix Objects |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like the explanation of Fill
above would be a bit more logical after this section
proposals/xxxx-wave-matrix.md
Outdated
### WaveMatrix Matrix Objects | ||
|
||
The code below approximately declares the base interface that WaveMatrix matrix | ||
objects implement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add "the following sections will explain in detail what these methods do and what the parameters represent."
proposals/xxxx-wave-matrix.md
Outdated
The `WaveMatrixAccumulator::MultiplyAccumulate` method performs multiplication | ||
of the left and right arguments and adds the result back into the | ||
`WaveMatrixAccumulator`. This is a wave-level operation and cannot be used | ||
inside divergent control flow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as above, what if I do?
proposals/xxxx-wave-matrix.md
Outdated
WaveMatrix intrinsics are defined to support quantization calculations. | ||
Including calculating a sum for the rows of the left matrix and a sum of the | ||
columns of the right matrix. The `WaveMatrixRightRowAcc` and | ||
`WaveMatrixLeftColAcc` fragment accumulators perform this operation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should they have "Fragment" in the name? I was initially confused why they took full matrices as parameters, but I see through the inheritance that they are fragment accumulators
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love the names of those types, but I think if we add "Fragment" to the name we're going to be getting dangerously close to a type name that can't fit on one line of code with reasonable line wrapping rules.
proposals/xxxx-wave-matrix.md
Outdated
#### Zero Point | ||
|
||
The following is the equation for matrix multiplication with zero point | ||
adjustment included: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section feels out of place in between the definition of WaveMatrix Fragment Acuumulators and the description of their SumAccumulate method. I recognize that it is referenced in the section just after it, but if this is defining the multiply operations, perhaps it can go after the multiply method description and the next section can link back to it?
proposals/xxxx-wave-matrix.md
Outdated
|
||
$Z_*$ are constant zero points values | ||
|
||
#### Wave Matrix SumAccumulate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe "WaveMatrix Fragment SumAccumulate" to be consistent with the title of that section?
proposals/xxxx-wave-matrix.md
Outdated
#### Wave Matrix SumAccumulate | ||
|
||
The `SumAccumulate` methods accumulate the values of the argument matrix into | ||
the WaveMatrix fragment accumulator. The fragment WaveMatrix must have the same |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading this and the description above the class, it's not clear to me what actually gets placed into the accumulator. Is it the sum of all elements in each row of a row-major matrix and the sum of all elements in each column of a column-major matrix?
microsoft/DirectXShaderCompiler#6807 I assume that this proposal has been denied ? |
Denied is probably the wrong phrasing... In need of significant reworking. This is definitely a feature we want, but the design proposed here has some pretty big limitations that we haven't had time to address. That resulted in us pausing development on it and not shipping it with the Shader Model 6.8 release. We hope to refine this feature (in public) and include an updated version of it in the future. |
This proposal adds new WaveMatrix data types that support lane cooperative matrix multiplication.
A preview implementation of this proposal is available as part of the DXC 1.8.2306 preview release.