Add MultiPruner results and improvements to install and readme
Co-authored-by: Yuan, Jinjie <[email protected]>
jpablomch and Yuan0320 committed Dec 12, 2024
1 parent 0b81417 commit ea31bab
Showing 19 changed files with 798 additions and 31 deletions.
60 changes: 35 additions & 25 deletions MultiPruner/README.md
@@ -3,22 +3,22 @@
Official implementation of [Fine-Grained Training-Free Structure Removal in Foundation Models]().

This repo contains the code for **MultiPruner**, a novel pruning approach that surpasses recent training-free pruning
-methods by adopting a multidimensional, iterative, fine-grained pruning strategy.
+methods, e.g., BlockPruner (Zhong et al., 2024) and ShortGPT (Men et al., 2024), by adopting a multidimensional, iterative, fine-grained pruning strategy.
Please refer to our paper for more details.

## News
-- **[2025.xx.xx]** Release the code for **MultiPruner**. :tada:
+- **[2024.12.14]** Release the code for **MultiPruner**. :tada:

## Supported Models

- Llama: [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)
- Qwen: [Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B), [Qwen/Qwen1.5-7B](https://huggingface.co/Qwen/Qwen1.5-7B)

## Setup

-Here is an installation script developed from scratch.
+Use the following instructions to create a virtual environment with the required dependencies.

```
pip install virtualenv
virtualenv multipruner-env
source multipruner-env/bin/activate
pip install torch==2.3.1
# install dependencies
bash install.sh
```
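
Once the script completes, a quick way to confirm that the pinned versions took effect is shown below; the expected values come from `install.sh` and `requirements.txt` (a minimal sanity check, run inside the activated environment):

```python
# Sanity check for the MultiPruner environment. Expected versions come from
# the pins in install.sh and requirements.txt.
import torch
import transformers

print(torch.__version__)         # expected: 2.3.1
print(transformers.__version__)  # expected: 4.42.4 (patched source install)
```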
@@ -115,32 +115,42 @@ This investigation may facilitate practical applications. The results of Llama-2
| MultiPruner w/ finetune | 18% | 66.16 | -2.80% | 95.94% |


-## Released Pruned Models 🤗
+## Released Pruned Models and Configurations 🤗

-We have released several compressed models by MultiPruner:
+We have released several compressed models and pruning configurations to reproduce the results in the paper:

-| Source Model | Pruning Ratio | Recovery Tuning | Pruned Model |
-|---|---|---|---|
-| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 7% | ✘ | [IntelLabs/MultiPruner-Llama-2-6.3b](https://huggingface.co/IntelLabs/MultiPruner-Llama-2-6.3b) |
-| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 10% | ✘ | [IntelLabs/MultiPruner-Llama-2-6.1b](https://huggingface.co/IntelLabs/MultiPruner-Llama-2-6.1b) |
-| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 12% | ✘ | [IntelLabs/MultiPruner-Llama-2-5.9b](https://huggingface.co/IntelLabs/MultiPruner-Llama-2-5.9b) |
-| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 12% | ✔ | [IntelLabs/MultiPruner-Llama-2-5.9b-alpaca](https://huggingface.co/IntelLabs/MultiPruner-Llama-2-5.9b-alpaca) |
-| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 14% | ✘ | [IntelLabs/MultiPruner-Llama-2-5.8b](https://huggingface.co/IntelLabs/MultiPruner-Llama-2-5.8b) |
-| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 15% | ✘ | [IntelLabs/MultiPruner-Llama-2-5.7b](https://huggingface.co/IntelLabs/MultiPruner-Llama-2-5.7b) |
-| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 15% | ✔ | [IntelLabs/MultiPruner-Llama-2-5.7b-alpaca](https://huggingface.co/IntelLabs/MultiPruner-Llama-2-5.7b-alpaca) |
-| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 18% | ✘ | [IntelLabs/MultiPruner-Llama-2-5.5b](https://huggingface.co/IntelLabs/MultiPruner-Llama-2-5.5b) |
-| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 18% | ✔ | [IntelLabs/MultiPruner-Llama-2-5.5b-alpaca](https://huggingface.co/IntelLabs/MultiPruner-Llama-2-5.5b-alpaca) |
-| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 22% | ✘ | [IntelLabs/MultiPruner-Llama-2-5.3b](https://huggingface.co/IntelLabs/MultiPruner-Llama-2-5.3b) |
-| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 22% | ✔ | [IntelLabs/MultiPruner-Llama-2-5.3b-alpaca](https://huggingface.co/IntelLabs/MultiPruner-Llama-2-5.3b-alpaca) |
-| [Qwen/Qwen1.5-7B](https://huggingface.co/Qwen/Qwen1.5-7B) | 22% | ✘ | [IntelLabs/MultiPruner-Qwen1.5-6b](https://huggingface.co/IntelLabs/MultiPruner-Qwen1.5-6b) |
-| [baichuan-inc/Baichuan2-7B-Base](https://huggingface.co/baichuan-inc/Baichuan2-7B-Base) | 22% | ✘ | [IntelLabs/MultiPruner-Baichuan2-5.8b](https://huggingface.co/IntelLabs/MultiPruner-Baichuan2-5.8b) |
+| Source Model | Pruning Ratio | Pruned Model Configuration / HF link |
+|---|---|---|
+| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 7% | [MultiPruner-Llama-2-6.3b Config File](./results/Llama-2-7B/ratio_7) |
+| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 10% | [MultiPruner-Llama-2-6.1b Config File](./results/Llama-2-7B/ratio_10) |
+| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 12% | [MultiPruner-Llama-2-5.9b Config File](./results/Llama-2-7B/ratio_12) |
+| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 14% | [MultiPruner-Llama-2-5.8b Config File](./results/Llama-2-7B/ratio_14) |
+| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 15% | [MultiPruner-Llama-2-5.7b Config File](./results/Llama-2-7B/ratio_15) |
+| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 18% | [MultiPruner-Llama-2-5.5b Config File](./results/Llama-2-7B/ratio_18) |
+| [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 22% | [MultiPruner-Llama-2-5.3b Config File](./results/Llama-2-7B/ratio_22) |
+| [Qwen/Qwen1.5-7B](https://huggingface.co/Qwen/Qwen1.5-7B) | 22% | [IntelLabs/MultiPruner-Qwen1.5-6b](https://huggingface.co/IntelLabs/MultiPruner-Qwen1.5-6b) |
+| [baichuan-inc/Baichuan2-7B-Base](https://huggingface.co/baichuan-inc/Baichuan2-7B-Base) | 22% | [IntelLabs/MultiPruner-Baichuan2-5.8b](https://huggingface.co/IntelLabs/MultiPruner-Baichuan2-5.8b) |
<sup>*</sup> *For Llama models, we provide the pruning configuration files to reproduce the results in the paper.*

### Loading the compressed model for evaluation

```bash
python eval.py --model_path <path to compressed model> --output_path <path to evaluation results>
```
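
Beyond `eval.py`, the released checkpoints on the Hugging Face Hub can be loaded like standard causal LMs for a quick smoke test. A minimal sketch (the model id comes from the table above; the dtype and device placement are illustrative assumptions):

```python
# Load a released MultiPruner checkpoint and generate a few tokens.
# Assumes the checkpoint behaves as a standard Hugging Face causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "IntelLabs/MultiPruner-Qwen1.5-6b"  # any released model from the table
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("Structured pruning of large language models", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```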

## Acknowledgement

MultiPruner benefits from the following work:

```bibtex
@article{zhong2024blockpruner,
  title={BlockPruner: Fine-grained Pruning for Large Language Models},
  author={Zhong, Longguang and Wan, Fanqi and Chen, Ruijun and Quan, Xiaojun and Li, Liangzhi},
  journal={arXiv preprint arXiv:2406.10594},
  year={2024}
}
```

## Citation
If you find MultiPruner's code and paper helpful, please cite:
```bibtex
21 changes: 15 additions & 6 deletions MultiPruner/install.sh
@@ -3,13 +3,22 @@
set -e
set -x

MULTIPRUNER_PATH=$PWD
-mkdir third_party && cd third_party

-pip install 'numpy<2.0.0' setuptools==69.5.1
+python3.10 -m venv venv
+source venv/bin/activate

+mkdir -pv third_party
+pushd third_party

git clone https://github.com/huggingface/transformers.git
-cd transformers && git checkout v4.42.4 && git apply --ignore-space-change --ignore-whitespace ${MULTIPRUNER_PATH}/patches/transformers-v4.42.4.patch && pip install -e . && cd ..
+pushd transformers
+git checkout v4.42.4
+git apply --ignore-space-change --ignore-whitespace ${MULTIPRUNER_PATH}/patches/transformers-v4.42.4.patch
+pip install -e .
+
+pushd ${MULTIPRUNER_PATH}
+
+pip install -r requirements.txt
+
+echo "Environment ready. Execute 'source venv/bin/activate' to run."

-pip install datasets accelerate sentencepiece protobuf bitsandbytes
-pip install lm-eval==0.4.2
9 changes: 9 additions & 0 deletions MultiPruner/requirements.txt
@@ -0,0 +1,9 @@
numpy<2.0.0
setuptools==69.5.1
datasets
accelerate
sentencepiece
protobuf
bitsandbytes
lm-eval==0.4.2
torch==2.3.1
12 changes: 12 additions & 0 deletions MultiPruner/results/Llama-2-7B/ratio_10/eval.res.json
@@ -0,0 +1,12 @@
{
"total_params": 6738415616,
"pruned_params": 6063132672,
"ratio": 10.02139052385812,
"ppl_wikitext2": 6.55,
"5cs_acc_avg": 67.02,
"arc_challenge": 44.45,
"arc_easy": 71.0,
"hellaswag": 74.07000000000001,
"winogrande": 68.19,
"piqa": 77.37
}
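
The derived fields in these result files are self-consistent: `ratio` is the percentage of parameters removed (so `pruned_params` counts the parameters that remain), and `5cs_acc_avg` is the mean of the five commonsense-task accuracies. A quick check using the numbers above:

```python
# Recompute the derived fields of eval.res.json from its raw values.
total_params = 6738415616
pruned_params = 6063132672  # parameters remaining after pruning

print(100 * (1 - pruned_params / total_params))  # 10.0213... -> "ratio"

# arc_challenge, arc_easy, hellaswag, winogrande, piqa
tasks = [44.45, 71.0, 74.07, 68.19, 77.37]
print(sum(tasks) / len(tasks))  # 67.016 -> "5cs_acc_avg" (67.02)
```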
78 changes: 78 additions & 0 deletions MultiPruner/results/Llama-2-7B/ratio_10/pruning_config.json
@@ -0,0 +1,78 @@
{
"pruned_attn_idx": [
25,
27,
21,
23,
24
],
"pruned_mlp_idx": [],
"pruned_attn_width": {
"0": 4096,
"1": 3840,
"2": 3840,
"3": 4096,
"4": 4096,
"5": 3968,
"6": 4096,
"7": 4096,
"8": 3968,
"9": 4096,
"10": 4096,
"11": 4096,
"12": 4096,
"13": 4096,
"14": 4096,
"15": 4096,
"16": 4096,
"17": 3968,
"18": 4096,
"19": 3968,
"20": 3968,
"21": 4096,
"22": 3968,
"23": 4096,
"24": 4096,
"25": 4096,
"26": 4096,
"27": 4096,
"28": 3968,
"29": 4096,
"30": 3968,
"31": 4096
},
"pruned_mlp_width": {
"0": 11008,
"1": 11008,
"2": 5888,
"3": 11008,
"4": 11008,
"5": 11008,
"6": 11008,
"7": 9984,
"8": 11008,
"9": 11008,
"10": 11008,
"11": 9984,
"12": 11008,
"13": 11008,
"14": 11008,
"15": 11008,
"16": 11008,
"17": 11008,
"18": 11008,
"19": 11008,
"20": 11008,
"21": 11008,
"22": 11008,
"23": 1792,
"24": 11008,
"25": 11008,
"26": 11008,
"27": 1792,
"28": 11008,
"29": 11008,
"30": 11008,
"31": 11008
}
}
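
The configuration above can be summarized with a short script. A minimal sketch, assuming Llama-2-7B's geometry (32 layers, 32 attention heads of dimension 128, MLP intermediate size 11008) and the field names shown above:

```python
# Summarize a MultiPruner pruning_config.json: which attention/MLP blocks are
# removed entirely, and how much width remains in the blocks that were kept.
import json

with open("MultiPruner/results/Llama-2-7B/ratio_10/pruning_config.json") as f:
    cfg = json.load(f)

HEAD_DIM = 128     # Llama-2-7B: hidden size 4096 / 32 heads
ATTN_WIDTH = 4096
MLP_WIDTH = 11008

print("attention blocks removed:", sorted(cfg["pruned_attn_idx"]))
print("MLP blocks removed:", sorted(cfg["pruned_mlp_idx"]))

# Attention widths are multiples of the head dimension, i.e. whole heads are
# pruned; MLP widths are pruned at channel granularity.
for layer, width in cfg["pruned_attn_width"].items():
    if width < ATTN_WIDTH:
        print(f"layer {layer}: {width // HEAD_DIM} of 32 attention heads kept")

for layer, width in cfg["pruned_mlp_width"].items():
    if width < MLP_WIDTH:
        print(f"layer {layer}: MLP intermediate width {width} of {MLP_WIDTH} kept")
```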
12 changes: 12 additions & 0 deletions MultiPruner/results/Llama-2-7B/ratio_12/eval.res.json
@@ -0,0 +1,12 @@
{
"total_params": 6738415616,
"pruned_params": 5931012096,
"ratio": 11.982097365482536,
"ppl_wikitext2": 7.1,
"5cs_acc_avg": 66.47999999999999,
"arc_challenge": 44.03,
"arc_easy": 69.82000000000001,
"hellaswag": 73.77,
"winogrande": 68.43,
"piqa": 76.33
}
79 changes: 79 additions & 0 deletions MultiPruner/results/Llama-2-7B/ratio_12/pruning_config.json
@@ -0,0 +1,79 @@
{
"pruned_attn_idx": [
25,
27,
21,
23,
24,
29
],
"pruned_mlp_idx": [],
"pruned_attn_width": {
"0": 4096,
"1": 4096,
"2": 3840,
"3": 3968,
"4": 4096,
"5": 4096,
"6": 4096,
"7": 4096,
"8": 3968,
"9": 4096,
"10": 4096,
"11": 4096,
"12": 4096,
"13": 4096,
"14": 4096,
"15": 4096,
"16": 3968,
"17": 3968,
"18": 4096,
"19": 3968,
"20": 3968,
"21": 4096,
"22": 3968,
"23": 4096,
"24": 4096,
"25": 4096,
"26": 4096,
"27": 4096,
"28": 3712,
"29": 4096,
"30": 3968,
"31": 4096
},
"pruned_mlp_width": {
"0": 11008,
"1": 11008,
"2": 5888,
"3": 11008,
"4": 11008,
"5": 11008,
"6": 11008,
"7": 9984,
"8": 11008,
"9": 11008,
"10": 11008,
"11": 11008,
"12": 11008,
"13": 11008,
"14": 11008,
"15": 11008,
"16": 11008,
"17": 11008,
"18": 11008,
"19": 11008,
"20": 11008,
"21": 11008,
"22": 11008,
"23": 1792,
"24": 11008,
"25": 1792,
"26": 11008,
"27": 4864,
"28": 11008,
"29": 11008,
"30": 11008,
"31": 11008
}
}
12 changes: 12 additions & 0 deletions MultiPruner/results/Llama-2-7B/ratio_14/eval.res.json
@@ -0,0 +1,12 @@
{
"total_params": 6738415616,
"pruned_params": 5796794368,
"ratio": 13.973926537926385,
"ppl_wikitext2": 7.56,
"5cs_acc_avg": 65.93,
"arc_challenge": 43.519999999999996,
"arc_easy": 68.64,
"hellaswag": 72.27,
"winogrande": 67.96,
"piqa": 77.25999999999999
}