This is my scenario. It is not clear to me from the documentation whether Triton is unsuitable when the batch and length dimensions keep changing. I saw some references to kernels having to be recompiled every time the sizes change?
Answered by binarman, Mar 28, 2023
Answer selected by RuABraun
@RuABraun
Kernels are compiled only once for each new size.
Compiled kernels are stored in a cache (see `$HOME/.triton/cache/`).
If your data has a finite number of shapes, Triton will eventually have compiled all code variants.
You can make several "warmup" runs before the actual workload to get predictable latencies.
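
A rough sketch of what such a warmup pass could look like, in case it helps. `run_kernel` here is a hypothetical stand-in that only models "compile once per new shape, then reuse", and `warm_up` is a made-up helper name; neither is part of Triton. In real code you would launch your actual kernel in place of the stand-in, and the batch sizes and lengths are assumed values for illustration.

```python
from itertools import product

# Hypothetical stand-in for a JIT-compiled kernel launcher (not Triton API).
# It "compiles" once per new (batch, length) pair and reuses the cached
# variant afterwards, which is the behavior described above.
compiled_variants = {}
compile_count = 0


def run_kernel(batch, length):
    global compile_count
    key = (batch, length)
    if key not in compiled_variants:
        compile_count += 1  # slow path: first call with this shape
        compiled_variants[key] = f"variant_{batch}x{length}"
    return compiled_variants[key]


def warm_up(launch, batch_sizes, lengths):
    """Call `launch` once for every expected (batch, length) pair."""
    for batch, length in product(batch_sizes, lengths):
        launch(batch, length)


# Warm up with every shape the workload is expected to see (assumed values).
warm_up(run_kernel, batch_sizes=[1, 8, 32], lengths=[128, 256])
compiles_after_warm_up = compile_count

# Later calls with already-seen shapes hit the cache and do not compile again.
run_kernel(8, 256)
run_kernel(32, 128)
```

This only pays off if the set of shapes is finite and known up front; a shape that was not part of the warmup still pays the compile cost on its first call.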