Microsoft Contrib Operator TODO #3538
Replies: 7 comments 1 reply
-
Took a look at the SD 1.5 Olive set of models (UNet, VAE encoder/decoder): GroupNorm is everywhere, consistently before multiple convolutions. The UNet also has the MultiHeadAttention operator.
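For reference, the math behind the GroupNorm op these models lean on can be sketched in a few lines of NumPy. This is a hand-rolled illustration of the standard group-normalization formula over an NCHW tensor, not the contrib-op or MIGraphX implementation:

```python
import numpy as np

def group_norm(x, gamma, beta, num_groups, eps=1e-5):
    """Normalize each group of channels by its own mean/variance,
    then apply a per-channel scale (gamma) and shift (beta).
    x: (N, C, H, W); C must be divisible by num_groups."""
    n, c, h, w = x.shape
    g = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    g = (g - mean) / np.sqrt(var + eps)
    x = g.reshape(n, c, h, w)
    return x * gamma.reshape(1, c, 1, 1) + beta.reshape(1, c, 1, 1)
```

Because the whole thing is reshapes plus reductions, it maps naturally onto existing MIGraphX primitives.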
-
Looking at the SD 3 Medium model: the UNet and Text Encoders 1/2/3 need LayerNormalization. We have this in MIGraphX; it just needs support in Onnxruntime (ROCm/onnxruntime#73). Tokenizers 1/2: seeing CLIPTokenizer.
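The LayerNormalization needed here is the standard last-axis normalization from the ONNX spec. A minimal NumPy sketch of the semantics (illustrative only, not MIGraphX code):

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize over the last axis, then scale/shift per feature.
    x: (..., hidden); gamma, beta: (hidden,)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps) * gamma + beta
```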
-
The SD XL UNet needs MultiHeadAttention and GroupNorm.
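The core of MultiHeadAttention is multi-head scaled dot-product attention. The sketch below shows just that math (no masks, bias, or past-key/value caching, all of which the com.microsoft.MultiHeadAttention contrib op also supports):

```python
import numpy as np

def multi_head_attention(q, k, v, num_heads):
    """Minimal multi-head scaled dot-product attention.
    q, k, v: (batch, seq, hidden); hidden must divide by num_heads."""
    b, s, hidden = q.shape
    d = hidden // num_heads

    def split(t):  # (b, seq, hidden) -> (b, heads, seq, d)
        return t.reshape(b, -1, num_heads, d).transpose(0, 2, 1, 3)

    qh, kh, vh = split(q), split(k), split(v)
    scores = qh @ kh.transpose(0, 1, 3, 2) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    probs = np.exp(scores)
    probs /= probs.sum(axis=-1, keepdims=True)
    out = probs @ vh                              # (b, heads, seq, d)
    return out.transpose(0, 2, 1, 3).reshape(b, s, hidden)
```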
-
SD Turbo ControlNet/UNet/VAE encoder/VAE decoder need GroupNorm.
-
Flux1: we should be able to run this without any additional code modifications (MIGraphX + MIGraphX EP).
-
MatMulIntegerToFloat seems to be used by optimized BERT models that have been int8-quantized by Onnxruntime, which emits DynamicQuantizeLinear + MatMulInteger + a Cast to perform the dequantization.
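To make the fusion concrete, here is a NumPy sketch of what MatMulIntegerToFloat computes: the integer matmul (with zero points subtracted) followed immediately by the cast and scale multiply that would otherwise appear as separate Cast/Mul nodes. Illustrative only; the real op also supports per-column scales and an optional bias:

```python
import numpy as np

def matmul_integer_to_float(a_q, b_q, a_scale, b_scale, a_zp=0, b_zp=0):
    """Fused equivalent of Cast(MatMulInteger(a_q, b_q)) * a_scale * b_scale.
    a_q, b_q: int8/uint8 quantized inputs; a_zp, b_zp: zero points."""
    acc = (a_q.astype(np.int32) - a_zp) @ (b_q.astype(np.int32) - b_zp)
    return acc.astype(np.float32) * (a_scale * b_scale)
```

Since the accumulation is in int32 and the scales commute with the matmul, fusing the cast/multiply into the matmul changes no results, only the graph shape.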
-
Llama V2 requires that we also have the RotaryEmbedding operator. Adding it to this list.
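The rotary embedding math is small: each consecutive (even, odd) feature pair is rotated by a position- and frequency-dependent angle. A sketch using the interleaved-pair convention (the contrib op has more layout and cos/sin-cache options than shown here):

```python
import numpy as np

def rotary_embedding(x, base=10000.0):
    """Apply rotary position embedding to x of shape (seq, dim), dim even.
    Pair (x[2i], x[2i+1]) at position p is rotated by angle p * base**(-2i/dim)."""
    seq, dim = x.shape
    pos = np.arange(seq)[:, None]                  # (seq, 1)
    freqs = base ** (-np.arange(0, dim, 2) / dim)  # (dim/2,)
    angles = pos * freqs                           # (seq, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out
```

Being pure rotations, the op preserves vector norms and leaves position 0 unchanged, which makes it cheap to test.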
-
As noted in a few meetings now, the various quantizations and optimizations applied when models are converted to ONNX format incorporate many operators from the Microsoft contrib operator set.
This set is a superset of the ONNX specification: https://github.com/onnx/onnx/blob/main/docs/Operators.md
The full list of Microsoft contrib operators is found here: https://github.com/microsoft/onnxruntime/blob/main/docs/ContribOperators.md#com.microsoft.MatMulIntegerToFloat
Eventually the goal is to support the Microsoft operators in MIGraphX to the same degree we support the ONNX spec. Luckily, many of these operators are composites of a subset of MIGraphX ops, or variants of existing ones, so we can leverage existing parsing, optimizations, and functionality in the codebase.
Below are the operators that have recently surfaced from some of our models and the various quantizations we want supported sooner rather than later.
Please add others as you see fit or as you come across them in your runs. To help with planning, I've formatted entries as follows:
Model - Toolchain (Quark/Olive/Onnxruntime) + Model Name - Status (linked to a PR in MIGraphX, or TBD)