
Commit

fix typos
wimh966 committed Oct 21, 2023
1 parent 4876c3a commit 981fb7a
Showing 2 changed files with 5 additions and 5 deletions.
README.md: 6 changes (3 additions & 3 deletions)
@@ -4,15 +4,15 @@ Official PyTorch implementation of <a href="https://arxiv.org/abs/2304.09145">O

## Overview

-The Outlier Suppression+ (OS+) effectively suppresses outliers in large language models for better quantization performance without extra inference burden. It first identifies the outlier asymmetric shape across channels and proposes a channel-wise shifting technique with a migration pattern to eliminate it. It then focuses on the outlier concentration phenomenon and proposes to scale down outlier channels toward a dedicated objective.
+The Outlier Suppression+ (OS+) effectively suppresses outliers in large language models for better quantization performance without extra inference burden. It first identifies the outlier asymmetric shape across channels and proposes a channel-wise shifting technique with a migration pattern to eliminate it. It then focuses on the outlier concentration phenomenon and proposes to scale down outlier channels toward an elaborate objective.

<p align="center">
<img src="figure/outlier_suppression_plus.png">
</p>

We assess the efficacy of our approach under both standard and fine-grained quantization settings. On standard one, OS+ achieves near-floating-point performance on 8-bit and 6-bit BERT, OPTs, BLOOM, and BLOOMZ. On fine-grained one, OS+ can surpass others by 9.41\% on 4-bit LLaMA with per-token quantization and obtain lossless results on 4-bit OPT with per-group quantization.

-In the following sections, [Support](#support) gives supported models and quantization schemes, [Getting Started](#getting started) introduces the whole procedure to run this project including data preparation, quantization, evaluation to updated model export. [Evaluation](#evaluation) lists configs for each table in the paper for others to reproduce.
+In the following sections, [Support](#support) gives supported models and quantization schemes, [Getting Started](#getting started) introduces the whole procedure to run this project including data preparation, quantization, evaluation and updated model export. [Evaluation](#evaluation) lists configs for each table in the paper for other researchers to reproduce.

## Support

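The channel-wise shifting and scaling with migration described in the README excerpt above can be sketched in a few lines of PyTorch. This is only a generic illustration of the idea (subtract a per-channel shift, divide by a per-channel scale, and fold the inverse transform into the following `nn.Linear` so the network output is unchanged); the function name and signature are invented for this sketch and are not the repository's API.

```python
import torch

def migrate_shift_scale(x, linear, shift, scale):
    """Illustrative channel-wise shifting/scaling with migration.

    x:      activation of shape (..., C)
    linear: torch.nn.Linear (with bias) that consumes x
    shift:  per-channel shift z, shape (C,)
    scale:  per-channel scale s, shape (C,)
    """
    # Quantization-friendly activation: asymmetry removed, outliers damped.
    x_q = (x - shift) / scale
    # Fold the inverse into the next layer so that W x + b == W' x_q + b':
    #   b' = b + W z   (uses the original W, so update the bias first)
    #   W' = W * s     (scale each input channel, i.e. each weight column)
    with torch.no_grad():
        linear.bias += linear.weight @ shift
        linear.weight *= scale
    return x_q, linear
```

One simple choice for the shift is the per-channel midpoint, e.g. `(x.amax(dim=0) + x.amin(dim=0)) / 2`, which removes the asymmetric shape across channels; the scale is what the paper tunes toward its objective.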
@@ -146,7 +146,7 @@ quant:
symmetric: True # True: symmetric quantization, False: asymmetric one
ch_axis: -1 # 0: per-channel quantization, -1: per-layer one
calibrate: 128 # calibration size
-calibrate_path: /mnt/lustre/weixiuying.vendor/datasets/nlp_datasets/pile_cali # calibration path, make sure there is _cali in the name
+calibrate_path: /mnt/lustre/weixiuying.vendor/datasets/nlp_datasets/pile_cali # calibration dataset path, make sure there is _cali in the name
except_quantizer: null
is_remove_padding: True # True: remove [PAD] during calibration
migrate: True # True: shifting and scaling operations, False: no shifting and scaling operations.
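As a reading aid for the fields that appear in the config excerpts in this commit (`symmetric`, `ch_axis`, `bit`, the MinMax-style observers), here is a generic per-layer (`ch_axis: -1`) MinMax fake quantizer. It is a sketch with an invented function name, not the project's observer/quantizer classes.

```python
import torch

def minmax_fake_quant(x, bit=8, symmetric=True):
    """Generic per-layer MinMax fake quantization (quantize, then dequantize)."""
    if symmetric:
        qmin, qmax = -(2 ** (bit - 1)), 2 ** (bit - 1) - 1
        scale = x.abs().max() / qmax          # range centered at zero
        zero_point = 0.0
    else:
        qmin, qmax = 0, 2 ** bit - 1
        x_min, x_max = x.min(), x.max()
        scale = (x_max - x_min).clamp(min=1e-8) / qmax
        zero_point = torch.round(-x_min / scale)
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale           # "fake": values stay in floating point
```

Per-channel quantization (`ch_axis: 0`) computes the min/max, and hence scale and zero point, separately along the channel dimension instead of over the whole tensor.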
exp/opt/int4_group.yaml: 4 changes (2 additions & 2 deletions)
@@ -1,14 +1,14 @@
quant:
a_qconfig:
quantizer: GroupFixedFakeQuantize
-group_size: 512
+group_size: 1024
observer: MinMaxObserver # EMAMSEObserver EMAMinMaxObserver EMAQuantileObserver EMAPruneMinMaxObserver
bit: 4
symmetric: False
ch_axis: 0
w_qconfig:
quantizer: GroupFixedQuantize
-group_size: 512
+group_size: 1024
observer: MinMaxObserver
bit: 4
symmetric: False
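The `group_size` change above (512 → 1024) only changes how many consecutive values share one quantization range. A rough sketch of asymmetric per-group fake quantization is given below; it is a generic illustration, not the repository's `GroupFixedFakeQuantize`/`GroupFixedQuantize`, and it assumes the tensor's element count is divisible by `group_size`.

```python
import torch

def per_group_fake_quant(x, group_size=1024, bit=4):
    """Illustrative asymmetric per-group fake quantization (qmin = 0)."""
    orig_shape = x.shape
    g = x.reshape(-1, group_size)                    # one (scale, zero_point) per group
    qmax = 2 ** bit - 1
    g_min = g.min(dim=1, keepdim=True).values
    g_max = g.max(dim=1, keepdim=True).values
    scale = (g_max - g_min).clamp(min=1e-8) / qmax
    zero_point = torch.round(-g_min / scale)
    q = torch.clamp(torch.round(g / scale) + zero_point, 0, qmax)
    return ((q - zero_point) * scale).reshape(orig_shape)
```

A larger `group_size` stores fewer scale/zero-point pairs but fits each group's value range more coarsely, so it trades metadata overhead against quantization error.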
