Commit
Add Yi-6B and StableLM to iGPU perf test (#11546)
* Add transformers 4.38.2 test to igpu benchmark (#11529)
  * add transformers 4.38.1 test to igpu benchmark
  * use transformers 4.38.2 & fix csv name error in 4.38 workflow
  * add model Yi-6B-Chat & temporarily remove most models
  Co-authored-by: ATMxsp01 <[email protected]>
* Filter some errorlevel (#11541)
  Co-authored-by: ATMxsp01 <[email protected]>
* Restore the temporarily removed models in iGPU-perf (#11544)
  * filter some errorlevel
  * restore the temporarily removed models in iGPU-perf
  Co-authored-by: ATMxsp01 <[email protected]>

Co-authored-by: Xu, Shuo <[email protected]>
Co-authored-by: ATMxsp01 <[email protected]>
Parent: 7dc6756
Commit: 8982ab7
11 changed files with 194 additions and 4 deletions.
@@ -0,0 +1,14 @@
repo_id:
  - 'stabilityai/stablelm-zephyr-3b'
  #- 'google/gemma-7b-it'
local_model_hub: 'path to your local model hub'
warm_up: 1
num_trials: 3
num_beams: 1 # default to greedy search
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4)
batch_size: 1 # default to 1
in_out_pairs:
  - '1024-128'
test_api:
  - "transformer_int4_gpu_win" # on Intel GPU for Windows (catch GPU peak memory)
cpu_embedding: True # whether put embedding to CPU (only avaiable now for gpu win related test_api)
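For reference, these YAML files are plain PyYAML-readable configs. The sketch below is only an illustrative way to consume the fields above; it is not the project's actual benchmark runner, and the file name is a placeholder.

import yaml

with open("config.yaml") as f:                      # placeholder name, not the real igpu-perf path
    cfg = yaml.safe_load(f)

for repo_id in cfg["repo_id"]:                      # e.g. 'stabilityai/stablelm-zephyr-3b'
    for pair in cfg["in_out_pairs"]:                # e.g. '1024-128' -> 1024 input tokens, 128 output tokens
        in_len, out_len = (int(x) for x in pair.split("-"))
        for api in cfg["test_api"]:                 # e.g. 'transformer_int4_gpu_win'
            print(repo_id, api, in_len, out_len,
                  cfg["low_bit"], cfg["batch_size"], cfg["num_beams"])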
python/llm/test/benchmark/igpu-perf/1024-128_int4_fp16_438.yaml: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
repo_id:
  - 'stabilityai/stablelm-zephyr-3b'
  #- 'google/gemma-7b-it'
local_model_hub: 'path to your local model hub'
warm_up: 1
num_trials: 3
num_beams: 1 # default to greedy search
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4)
batch_size: 1 # default to 1
in_out_pairs:
  - '1024-128'
test_api:
  - "transformer_int4_fp16_gpu_win" # on Intel GPU for Windows, use fp16 for non-linear layer
cpu_embedding: True # whether put embedding to CPU (only avaiable now for gpu win related test_api)
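The *_int4_fp16_* variants keep the quantized sym_int4 weights but run the rest of the model in fp16 on the GPU. A rough load path, hedged and based on public ipex-llm GPU examples rather than the exact harness code (the model path is an assumption):

import intel_extension_for_pytorch as ipex           # registers the 'xpu' device for Intel GPUs
from ipex_llm.transformers import AutoModelForCausalLM

model_path = "path/to/stablelm-zephyr-3b"            # assumption: a local copy under local_model_hub
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_low_bit="sym_int4",                      # weights quantized to symmetric int4 (low_bit above)
    cpu_embedding=True,                              # mirrors cpu_embedding: True in the config
    trust_remote_code=True,
)
model = model.half().to("xpu")                       # fp16 for the non-quantized layers on the Intel GPU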
python/llm/test/benchmark/igpu-perf/1024-128_int4_fp16_loadlowbit_438.yaml: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
repo_id:
  - 'stabilityai/stablelm-zephyr-3b'
  #- 'google/gemma-7b-it'
local_model_hub: 'path to your local model hub'
warm_up: 1
num_trials: 3
num_beams: 1 # default to greedy search
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4)
batch_size: 1 # default to 1
in_out_pairs:
  - '1024-128'
test_api:
  - "transformer_int4_fp16_loadlowbit_gpu_win" # on Intel GPU for Windows (catch GPU peak memory)
cpu_embedding: True # whether put embedding to CPU (only avaiable now for gpu win related test_api)
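The loadlowbit variant measures loading an already-converted checkpoint rather than quantizing on the fly. A hedged sketch of that two-step flow using ipex-llm's save_low_bit/load_low_bit helpers (both paths are assumptions):

import intel_extension_for_pytorch as ipex           # registers the 'xpu' device
from ipex_llm.transformers import AutoModelForCausalLM

src = "path/to/stablelm-zephyr-3b"                   # assumption: original checkpoint under local_model_hub
dst = "path/to/stablelm-zephyr-3b-sym_int4"          # assumption: where the low-bit copy is kept

# One-time conversion: quantize and persist the sym_int4 weights.
model = AutoModelForCausalLM.from_pretrained(src, load_in_low_bit="sym_int4", trust_remote_code=True)
model.save_low_bit(dst)

# What this test_api times: reloading the saved low-bit checkpoint directly,
# skipping the full-precision load and the on-the-fly quantization.
model = AutoModelForCausalLM.load_low_bit(dst, trust_remote_code=True)
model = model.half().to("xpu")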
python/llm/test/benchmark/igpu-perf/2048-256_int4_fp16_438.yaml: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
repo_id:
  - 'stabilityai/stablelm-zephyr-3b'
  #- 'google/gemma-7b-it'
local_model_hub: 'path to your local model hub'
warm_up: 1
num_trials: 3
num_beams: 1 # default to greedy search
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4)
batch_size: 1 # default to 1
in_out_pairs:
  - '2048-256'
test_api:
  - "transformer_int4_fp16_gpu_win" # on Intel GPU for Windows (catch GPU peak memory)
cpu_embedding: True # whether put embedding to CPU (only avaiable now for gpu win related test_api)
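The only difference from the 1024-128 file is the in_out_pair: a 2048-token prompt with 256 generated tokens. One hedged way to pin the prompt length is shown below; the real harness builds prompts from its own test data, so the text tiling here is purely illustrative and the model path is an assumption.

import intel_extension_for_pytorch as ipex           # registers the 'xpu' device
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM

model_path = "path/to/stablelm-zephyr-3b"            # assumption: local copy under local_model_hub
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_low_bit="sym_int4",
                                             cpu_embedding=True, trust_remote_code=True)
model = model.half().to("xpu")

in_len, out_len = (int(x) for x in "2048-256".split("-"))
# Tile a base sentence and truncate to exactly in_len tokens to fix the prompt length.
base_text = "Once upon a time, there existed a little girl who liked adventures. " * 500
input_ids = tokenizer(base_text, return_tensors="pt").input_ids[:, :in_len].to("xpu")
output = model.generate(input_ids, max_new_tokens=out_len, num_beams=1, do_sample=False)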
python/llm/test/benchmark/igpu-perf/32-32_int4_fp16_438.yaml: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
repo_id:
  - 'stabilityai/stablelm-zephyr-3b'
  #- 'google/gemma-7b-it'
local_model_hub: 'path to your local model hub'
warm_up: 3
num_trials: 5
num_beams: 1 # default to greedy search
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4)
batch_size: 1 # default to 1
in_out_pairs:
  - '32-32'
test_api:
  - "transformer_int4_fp16_gpu_win" # on Intel GPU for Windows (catch GPU peak memory)
cpu_embedding: True # whether put embedding to CPU (only avaiable now for gpu win related test_api)
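This 32-32 file also raises warm_up to 3 and num_trials to 5. A minimal sketch of those semantics, assuming warm-up runs are executed but excluded from the reported average (the actual harness may count or aggregate them differently):

import time

def benchmark(generate_fn, warm_up=3, num_trials=5):
    for _ in range(warm_up):
        generate_fn()                                # kernel compilation and cache effects land here
    timings = []
    for _ in range(num_trials):
        start = time.perf_counter()
        generate_fn()
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)               # mean end-to-end latency in seconds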