why llmc do not support asymmetric quantization? #258

Open
hensiesp32 opened this issue Dec 13, 2024 · 1 comment
Comments

@hensiesp32

Hi, thanks for your wonderful work. Why doesn't llmc support asymmetric quantization when I set save_vllm: True?
My config is below:

base:
    seed: &seed 42
model:
    type: Qwen2
    path: /mnt/public/lingo-engine/model_info/multidoc_masterthemev11_qwen2_7b_bf16_240904
    tokenizer_mode: slow
    torch_dtype: auto
calib:
    name: pileval
    download: False
    path: /mnt/public/lingo-engine/data/pileval_dataset/
    n_samples: 128
    bs: -1
    seq_len: 512
    preproc: general
    seed: *seed
quant:
    method: Awq
    weight:
        bit: 8
        symmetric: False
        granularity: per_channel
        group_size: -1
    act:
        bit: 8
        symmetric: False
        granularity: per_token
    special:
        trans: True
        trans_version: v2
        weight_clip: True
    quant_out: True
save:
    save_vllm: True
    save_path: /mnt/public/daixin/masterthemev11_qwen2_7b_awq_w8a8_unsym

Then I get this error:

[rank0]: Traceback (most recent call last):
[rank0]:   File "/mnt/user/daixin/llmc/llmc/__main__.py", line 317, in <module>
[rank0]:     main(config)
[rank0]:   File "/mnt/user/daixin/llmc/llmc/__main__.py", line 194, in main
[rank0]:     assert w.symmetric, 'Only symmetric quant is supported.'

assert w.symmetric, 'Only symmetric quant is supported.'
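
The assertion fires only after llmc has already started, so it can help to check the config before launching. The following is a minimal sketch of such a pre-flight check, not part of llmc: the function name `find_vllm_export_problems` is hypothetical, the config is passed as a plain dict that mirrors the YAML above, and checking the `act` section as well as `weight` is an assumption (the traceback only confirms the weight check).

```python
# Hypothetical pre-flight check; this is NOT llmc code.
def find_vllm_export_problems(config: dict) -> list:
    """Return messages for settings that would likely block a vLLM export."""
    problems = []
    # The restriction in the traceback only matters when save_vllm is on.
    if not config.get("save", {}).get("save_vllm", False):
        return problems
    quant = config.get("quant", {})
    # Assumption: both sections are checked; the traceback only shows weights.
    for section in ("weight", "act"):
        settings = quant.get(section)
        # Assumption: a missing `symmetric` key is treated as not symmetric.
        if settings is not None and not settings.get("symmetric", False):
            problems.append(
                f"quant.{section}.symmetric must be True when save.save_vllm is True"
            )
    return problems


# Mirrors the relevant parts of the config posted above.
config = {
    "quant": {
        "weight": {"bit": 8, "symmetric": False, "granularity": "per_channel"},
        "act": {"bit": 8, "symmetric": False, "granularity": "per_token"},
    },
    "save": {"save_vllm": True},
}

for problem in find_vllm_export_problems(config):
    print(problem)
```

With the config from this issue, the sketch reports both the weight and the activation section; setting `symmetric: True` in each makes the list empty.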

@hensiesp32 changed the title from "why llmc not support asymmetric quantization?" to "why llmc do not support asymmetric quantization?" on Dec 13, 2024
@gushiqiao
Contributor

This is because the vLLM backend only supports symmetric quantized inference, so llmc asserts symmetric quantization when save_vllm is enabled.
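
For readers unfamiliar with the distinction, here is a rough, generic sketch of the two schemes in plain Python. It is not llmc's quantizer or vLLM's kernel code; the function names are made up for illustration. The point is that symmetric quantization needs only a scale, while asymmetric quantization also carries a zero-point that the inference backend has to handle.

```python
def quantize_symmetric(values, bits=8):
    """Signed range centered on zero; only a scale is stored."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def quantize_asymmetric(values, bits=8):
    """Unsigned range fitted to [min, max]; stores a scale and a zero-point."""
    qmin, qmax = 0, 2 ** bits - 1                   # 0..255 for 8 bits
    lo = min(min(values), 0.0)                      # keep 0.0 representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point=0):
    """Map integers back to floats; zero_point is 0 in the symmetric case."""
    return [(x - zero_point) * scale for x in q]


values = [-0.5, 0.0, 1.0, 3.5]                      # skewed toward positives

q_sym, s_sym = quantize_symmetric(values)
q_asym, s_asym, zp = quantize_asymmetric(values)

print("symmetric :", q_sym, "scale", s_sym)
print("asymmetric:", q_asym, "scale", s_asym, "zero_point", zp)
print("sym  round trip:", dequantize(q_sym, s_sym))
print("asym round trip:", dequantize(q_asym, s_asym, zp))
```

On skewed data like this, the asymmetric scheme uses a finer step because it fits the actual range, which is why it is attractive; the cost is the extra zero-point term at inference time.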
