make smoothquant more PT2 friendly #1639

Open
vkuzo opened this issue Jan 29, 2025 · 1 comment

Comments

vkuzo commented Jan 29, 2025

torchao's smoothquant recently broke after a change to PyTorch core: pytorch/pytorch#145733. We should apply the updates suggested by @anijain2305 in that issue to our code. I actually think we should go a bit further and go with something like:

#
# before
#
class _ActQuantizer:
    def __init__(self, target_dtype, quant_min=-127):
        self.target_dtype = target_dtype
        self.quant_min = quant_min

    def dynamic_quantize(self, input):
        return to_affine_quantized_intx(
            input,
            MappingType.SYMMETRIC,
            _get_per_token_block_size(input),
            self.target_dtype,
            self.quant_min,
        )

    def static_quantize(self, input, scale, zero_point):
        return to_affine_quantized_intx_static(
            input,
            scale,
            zero_point,
            list(input.shape),
            self.target_dtype,
            self.quant_min,
        )

#
# after
#
from dataclasses import dataclass

import torch

@dataclass
class _ActQuantConfig:
    target_dtype: torch.dtype
    quant_min: int = -127

# then, logic elsewhere chooses whether to call static or dynamic quant based on the contents of an instance of `_ActQuantConfig`

My feedback here is similar in spirit to #1595: IMO it's simpler and safer to pass around dumb config objects and use them to choose which function to call, instead of encoding the "which function to call" information in the config as a callable object.
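To make the "dumb config plus external dispatch" idea concrete, here is a minimal sketch. It is not torchao's actual code: `ActQuantConfigSketch`, `quantize_activation`, and the two `*_stub` functions are hypothetical names, the stubs only stand in for `to_affine_quantized_intx` and `to_affine_quantized_intx_static`, and the rule "use static quant when a scale is supplied" is an assumption, since the issue does not say how the choice would be made.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ActQuantConfigSketch:
    """Plain-data config: holds values only, no callables (hypothetical name)."""

    target_dtype: Any  # would be a torch.dtype in the real code
    quant_min: int = -127


def dynamic_quantize_stub(x, config):
    # Stand-in for the dynamic path (to_affine_quantized_intx in the issue).
    return ("dynamic", config.target_dtype, config.quant_min)


def static_quantize_stub(x, scale, zero_point, config):
    # Stand-in for the static path (to_affine_quantized_intx_static in the issue).
    return ("static", scale, zero_point, config.target_dtype, config.quant_min)


def quantize_activation(x, config, scale=None, zero_point=None):
    """Choose the quantization function outside the config.

    Assumption: the static path is taken when a precomputed scale is given.
    """
    if scale is None:
        return dynamic_quantize_stub(x, config)
    return static_quantize_stub(x, scale, zero_point, config)


cfg = ActQuantConfigSketch(target_dtype="int8")
dynamic_result = quantize_activation([1.0, 2.0], cfg)
static_result = quantize_activation([1.0, 2.0], cfg, scale=0.5, zero_point=0)
```

Because the config carries only data, the "which function to call" decision lives in one ordinary function instead of in a callable object stored on the config.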

vkuzo commented Jan 29, 2025

We should also unpin the nightlies (undo most of #1608) once this is fixed.
