IPEX Speculative Support for Baichuan2 7B (intel-analytics#10112)
* IPEX Speculative Support for Baichuan2 7B

* fix license problems

* refine
Uxito-Ada authored Feb 19, 2024
1 parent d12242c commit 5553f43
Showing 4 changed files with 1,061 additions and 2 deletions.
16 changes: 14 additions & 2 deletions python/llm/example/CPU/Speculative-Decoding/baichuan2/README.md
@@ -63,7 +63,7 @@ First token latency x.xxxxs

### 4. Accelerate with BIGDL_OPT_IPEX

To accelerate speculative decoding on CPU, you can install our validated version of [IPEX 2.3.0+git0c63936](https://github.com/intel/intel-extension-for-pytorch/tree/0c63936d7a6740679987920367ae2e0cdb375b2e) by following steps: (Other versions of IPEX may have some conflicts and can not accelerate speculative decoding correctly.)
To accelerate speculative decoding on CPU, you can optionally install our validated version of [IPEX 2.3.0+git0c63936](https://github.com/intel/intel-extension-for-pytorch/tree/0c63936d7a6740679987920367ae2e0cdb375b2e) by following the steps below. (Other versions of IPEX may have conflicts and may not accelerate speculative decoding correctly.)

#### 4.1 Download IPEX installation script
```bash
@@ -89,7 +89,19 @@ bash compile_bundle.sh
pip install -r requirements.txt
```
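After the install completes, a quick sanity check can confirm which IPEX build Python actually picks up (a sketch; the exact version string depends on your build, but it should resemble `2.3.0+git0c63936` for the validated version):

```bash
# Optional: print the version of the IPEX build that Python imports,
# or a hint to re-check the build step if the import fails.
python -c "import intel_extension_for_pytorch as ipex; print(ipex.__version__)" \
  || echo "IPEX import failed - re-check the compile_bundle.sh step"
```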

After installed IPEX, you can set `BIGDL_OPT_IPEX=true` to get target model acceleration. Currently only `Baichuan2 13b` is supported.
#### 4.5 Run Baichuan2 Models with IPEX

After installing IPEX, **if your Baichuan2 model is the 7B size**, replace the `modeling_baichuan.py` file under your model directory with `./baichaun2_7b_opt_ipex/modeling_baichuan.ipex`, for example:

```bash
cp ./baichaun2_7b_opt_ipex/modeling_baichuan.ipex your_model_path/modeling_baichuan.py
```

Also replace the `tokenization_baichuan.py` file under your model directory with `./baichaun2_7b_opt_ipex/tokenization_baichuan.py`.
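As a command, this replacement mirrors the one above (paths assumed relative to this example directory; `your_model_path` stands in for your local model directory):

```bash
cp ./baichaun2_7b_opt_ipex/tokenization_baichuan.py your_model_path/tokenization_baichuan.py
```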

**The 13B model does not need the steps above; please skip them.**

Then you can set `BIGDL_OPT_IPEX=true` to enable target model acceleration:

```bash
source bigdl-llm-init -t
```
