Modify the check_results.py to support batch 2&4 (#11133)
* add batch 2&4 and exclude to perf_test
* modify the perf-test&437 yaml
* modify llm_performance_test.yml
* remove batch 4
* modify check_results.py to support batch 2&4
* change the batch_size format
* remove genxir
* add str(batch_size)
* change actual_test_casese in check_results file to support batch_size
* change html highlight
* less models to test html and html_path
* delete the moe model
* split batch html
* split
* use installing from pypi
* use installing from pypi - batch2
* revert cpp
* merge two jobs into one, test batch_size in one job
* change file directory in workflow
* try catch deal with odd file without batch_size
* modify pandas version
* change the dir
* organize the code
* remove Qwen-MOE
* modify based on feedback
* modify based on second round of feedback
* modify based on second round of feedback + change run-arc.sh mode
* modify based on second round of feedback + revert config
* modify based on second round of feedback + remove comments
* modify based on second round of feedback + revert arc-perf-test
* modify based on third round of feedback
* change error type
* modify check_results.html
* split batch into two folders
* add all models
* move csv_name
* revert pr test

Co-authored-by: Yishuo Wang <[email protected]>
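One of the items above ("try catch deal with odd file without batch_size") handles older results files whose names carry no batch-size segment. A minimal sketch of that idea, assuming a hypothetical `...-batchN.csv` naming scheme (the actual file-name format in check_results.py is not shown here):

```python
import re

def parse_batch_size(csv_name: str) -> int:
    """Extract a batch size from a results file name such as
    'llama2-7b-batch2.csv' (hypothetical naming scheme); older
    files without a 'batchN' segment default to batch size 1."""
    match = re.search(r"batch(\d+)", csv_name)
    try:
        return int(match.group(1))
    except AttributeError:  # match is None: no batch segment in the name
        return 1
```

The try/except mirrors the commit's approach: rather than pre-checking every file name, the parser assumes the new format and falls back gracefully when an old-style file appears.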
1 parent dc4fea7 · commit 231b968
Showing 6 changed files with 152 additions and 32 deletions.
@@ -0,0 +1,38 @@
repo_id:
  - 'meta-llama/Llama-2-7b-chat-hf'
  - 'meta-llama/Llama-2-13b-chat-hf'
  - 'THUDM/chatglm2-6b'
  - 'THUDM/chatglm3-6b-4bit'
  - 'tiiuae/falcon-7b-instruct-with-patch'
  - 'mosaicml/mpt-7b-chat'
  - 'redpajama/gptneox-7b-redpajama-bf16'
  - 'bigcode/starcoder-15.5b-4bit'
  - 'databricks/dolly-v1-6b'
  - 'databricks/dolly-v2-7b'
  - 'databricks/dolly-v2-12b'
  - 'internlm/internlm-chat-7b'
  - 'Qwen/Qwen-7B-Chat'
  - 'BAAI/AquilaChat-7B'
  - 'baichuan-inc/Baichuan2-7B-Chat'
  - 'baichuan-inc/Baichuan2-13B-Chat-4bit'
  - 'bigscience/bloomz-7b1'
  # - 'fnlp/moss-moon-003-sft-4bit' # moss-moon-003-sft cannot work on transformers 4.34+
  - 'mistralai/Mistral-7B-v0.1'
local_model_hub: '/mnt/disk1/models'
warm_up: 1
num_trials: 3
num_beams: 1 # default to greedy search
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4)
batch_size: 2 # default to 1
in_out_pairs:
  - '32-32'
  - '1024-128'
  - '2048-256'
test_api:
  - "transformer_int4_gpu" # on Intel GPU
cpu_embedding: False # whether to put embeddings on the CPU (currently only available for the GPU Windows test_api)
exclude:
  - 'bigcode/starcoder-15.5b-4bit:2048'
  - 'databricks/dolly-v2-12b:2048'
  - 'baichuan-inc/Baichuan2-13B-Chat-4bit:2048'
  - 'bigscience/bloomz-7b1:2048'
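The `exclude` entries above appear to use a `repo_id:input_length` form, where the input length is the first half of an in_out pair. A small sketch of how a runner might honor them (`is_excluded` is a hypothetical helper, not the actual function from this PR):

```python
def is_excluded(repo_id: str, in_out_pair: str, exclude: list) -> bool:
    """Return True if this model/in_out-pair combination should be
    skipped. Assumes exclude entries use the 'repo_id:input_length'
    form seen in the config above, where the input length is the part
    of the in_out pair before the dash."""
    input_len = in_out_pair.split("-")[0]
    return f"{repo_id}:{input_len}" in exclude
```

This lets the config skip only the longest-prompt case for the largest models (for example `starcoder-15.5b` at 2048 tokens) while still benchmarking them at the shorter lengths.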
python/llm/test/benchmark/arc-perf-transformers-437-batch2.yaml (19 additions, 0 deletions)
@@ -0,0 +1,19 @@
# For the models that require transformers 4.37.0
repo_id:
  - 'Qwen/Qwen1.5-7B-Chat'
  - 'microsoft/phi-2'
  - 'microsoft/Phi-3-mini-4k-instruct'
  - 'meta-llama/Meta-Llama-3-8B-Instruct'
local_model_hub: '/mnt/disk1/models'
warm_up: 1
num_trials: 3
num_beams: 1 # default to greedy search
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4)
batch_size: 2 # default to 1
in_out_pairs:
  - '32-32'
  - '1024-128'
  - '2048-256'
test_api:
  - "transformer_int4_gpu" # on Intel GPU
cpu_embedding: False # whether to put embeddings on the CPU (currently only available for the GPU Windows test_api)