Compute input min/max with a single vectorized pass in DynamicQuantizeLinear #531
Combine the separate passes over the input to compute the min/max in DynamicQuantizeLinear with a single vectorized pass.
One caveat: the new implementation does not guarantee the same handling of NaNs in the input as before, and the behavior will vary by architecture. The ReduceMin / ReduceMax ops always propagate NaNs, whereas this implementation just uses the obvious min/max intrinsic (e.g. `_mm256_min_ps`), which may do something else. On x86, for instance, that instruction returns its second operand whenever either operand is NaN, so whether a NaN survives depends on operand order.
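To make the NaN caveat concrete, here is a minimal scalar sketch in Rust. It is not the code from this PR: the function names `min_max_ignoring_nan` and `min_max_propagating_nan` are hypothetical, and plain `f32::min` / `f32::max` stand in for the SIMD intrinsics. Both functions compute the min and max in one pass over the input. They differ only in what happens when a NaN is present.

```rust
/// Single pass computing (min, max) together.
/// `f32::min` / `f32::max` return the non-NaN operand when one input is NaN,
/// so NaNs are silently dropped unless every element is NaN.
pub fn min_max_ignoring_nan(xs: &[f32]) -> Option<(f32, f32)> {
    let (&first, rest) = xs.split_first()?;
    Some(
        rest.iter()
            .fold((first, first), |(lo, hi), &x| (lo.min(x), hi.max(x))),
    )
}

/// Single pass that mimics NaN-propagating reductions:
/// if any element is NaN, both results are NaN.
pub fn min_max_propagating_nan(xs: &[f32]) -> Option<(f32, f32)> {
    let (&first, rest) = xs.split_first()?;
    let (mut lo, mut hi) = (first, first);
    let mut saw_nan = first.is_nan();
    for &x in rest {
        // Track NaNs separately, because min/max below would discard them.
        saw_nan |= x.is_nan();
        lo = lo.min(x);
        hi = hi.max(x);
    }
    if saw_nan {
        Some((f32::NAN, f32::NAN))
    } else {
        Some((lo, hi))
    }
}
```

For the input `[1.0, NaN, -2.0, 3.0]`, the first function returns `(-2.0, 3.0)` while the second returns `(NaN, NaN)`. A vectorized version built on a hardware min/max instruction may match either result, or neither, depending on the instruction's NaN rules and the operand order.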