Commit

* torch 2.3 inference docker
* Update README.md
* add convert code
* rename image
* remove 2.1 and add graph example
* Update README.md
Showing 4 changed files with 352 additions and 6 deletions.
python/llm/example/GPU/GraphMode/README.md
# Torch Graph Mode

This example shows how to run [torch graph mode](https://pytorch.org/blog/optimizing-production-pytorch-performance-with-graph-transformations/) on Intel Arc™ A-Series Graphics with ipex-llm, using [gpt2-medium](https://huggingface.co/openai-community/gpt2-medium) for a classification task as the illustration.
### 1. Install

```bash
conda create -n ipex-llm python=3.11
conda activate ipex-llm
pip install --pre --upgrade ipex-llm[xpu_arc] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
pip install --pre pytorch-triton-xpu==3.0.0+1b2f15840e --index-url https://download.pytorch.org/whl/nightly/xpu
conda install -c conda-forge libstdcxx-ng
unset OCL_ICD_VENDORS
```
### 2. Configure oneAPI environment variables

> [!NOTE]
> Skip this step if you are running on Windows.

This step is required on Linux when oneAPI was installed through APT or an offline installer. Skip it if oneAPI was installed through pip.

```bash
source /opt/intel/oneapi/setvars.sh
```
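Before moving on, it can help to confirm that PyTorch can actually see the Intel GPU. Below is a minimal sanity check; this snippet is an illustration rather than part of the commit, and on some torch builds the `intel_extension_for_pytorch` import is needed to register the XPU backend:

```python
# Sanity check: is the Intel GPU visible to PyTorch as an XPU device?
# Illustrative only; not part of this commit.
import torch
try:
    import intel_extension_for_pytorch as ipex  # registers torch.xpu on older builds
except ImportError:
    pass

print(torch.__version__)          # should report an XPU-enabled build
print(torch.xpu.is_available())   # True if an Intel GPU is usable
print(torch.xpu.device_count())   # number of visible XPU devices
```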
### 3. Run

Convert the text-generation GPT2-Medium model into a sequence-classification model:
```bash
# The convert step needs to access the internet
export http_proxy=http://your_proxy_url
export https_proxy=http://your_proxy_url

# This will yield gpt2-medium-classification under /llm/models in the container
python convert-model-textgen-to-classfication.py --model_path MODEL_PATH
```
This will yield a model directory ending with '-classification' next to your input model path.

Benchmark GPT2-Medium's performance with the IPEX-LLM engine:
```bash
ipexrun xpu gpt2-graph-mode-benchmark.py --device xpu --engine ipex-llm --batch 16 --model-path MODEL_PATH
# You will see key output like:
# Average time taken (excluding the first two loops): xxxx seconds, Classification per seconds is xxxx
```
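The benchmark script itself is not shown in this commit. Conceptually, graph mode wraps the eager model with `torch.compile` so PyTorch captures the forward pass as a graph once and replays the optimized graph on later calls. Here is a minimal sketch of that idea, assuming a converted `gpt2-medium-classification` directory; the path and batch contents are assumptions, and this is not the actual `gpt2-graph-mode-benchmark.py`:

```python
# Illustrative graph-mode classification on XPU; not the committed benchmark script.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_path = "gpt2-medium-classification"  # assumed output of the convert step
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(
    model_path, torch_dtype=torch.bfloat16,
    pad_token_id=tokenizer.eos_token_id).to("xpu").eval()

# torch.compile traces the forward pass on the first call and reuses
# the optimized graph afterwards.
model = torch.compile(model)

inputs = tokenizer(["This movie was great!"] * 16,
                   return_tensors="pt", padding=True).to("xpu")
with torch.no_grad():
    logits = model(**inputs).logits
torch.xpu.synchronize()  # wait for the device before timing or reading results
print(logits.argmax(dim=-1))
```

The first call is slow because of graph capture, which is why the benchmark output above excludes the first loops from its average.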
python/llm/example/GPU/GraphMode/convert-model-textgen-to-classfication.py (57 additions, 0 deletions)
#
# Copyright 2016 The BigDL Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# This is modified from https://github.com/intel-sandbox/customer-ai-test-code/blob/main/convert-model-textgen-to-classfication.py
#
import argparse

import torch
from transformers import (AutoConfig, AutoModelForCausalLM,
                          AutoModelForSequenceClassification, AutoTokenizer)

parser = argparse.ArgumentParser(
    description='Convert a text-generation model into a sequence-classification model.')
parser.add_argument('--model_path', type=str, required=True,
                    help='path to the Hugging Face model directory')
args = parser.parse_args()
model_path = args.model_path

dtype = torch.bfloat16
num_labels = 5

model_name = model_path
save_directory = model_name + "-classification"

# Initialize the tokenizer.
# GPT2 has no pad token, so reuse the EOS token for padding.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
# tokenizer.padding_side = "left"
tokenizer.pad_token = tokenizer.eos_token
tokenizer.save_pretrained(save_directory)

# Load the original text-generation model and print its architecture.
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=dtype,
                                             pad_token_id=tokenizer.eos_token_id)
config = AutoConfig.from_pretrained(model_name)
print("text generation model")
print(model)
print(config)

# Reload the same weights with a sequence-classification head
# (the new head is randomly initialized) and save the result.
model = AutoModelForSequenceClassification.from_pretrained(model_name,
                                                           num_labels=num_labels,
                                                           torch_dtype=dtype)
model.save_pretrained(save_directory)

# Reload the converted model to verify that it loads cleanly.
model = AutoModelForSequenceClassification.from_pretrained(save_directory,
                                                           torch_dtype=dtype,
                                                           pad_token_id=tokenizer.eos_token_id)
config = AutoConfig.from_pretrained(save_directory)
print("text classification model")
print(model)
print(config)
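As a quick smoke test of the converted model (a hypothetical snippet, not part of this commit), you can load the '-classification' directory and classify a sentence. Note that the new classification head is randomly initialized, so predictions are meaningless until the model is fine-tuned:

```python
# Hypothetical smoke test for the converted model; not part of this commit.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

save_directory = "gpt2-medium-classification"  # assumed output of the convert script
tokenizer = AutoTokenizer.from_pretrained(save_directory)
model = AutoModelForSequenceClassification.from_pretrained(
    save_directory, torch_dtype=torch.bfloat16,
    pad_token_id=tokenizer.eos_token_id).eval()

inputs = tokenizer("An easy five-star experience.",
                   return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits  # shape: [1, num_labels]
# The head is untrained after conversion, so this label is arbitrary.
print(logits.argmax(dim=-1).item())
```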