I see this issue with chatglm2-6b.
It runs successfully with numactl -m 0 -C 0-23.
It fails with numactl -m 0 -C 0-31, 0-47, or 0-55.
It can be reproduced with INT8_ASYM or 4BIT_MAXIMUM quantization.
Here is the command used for quantization:
python3 convert.py --model_id $model_path -c $DATA_TYPE --output_dir $target_path
Here is the command line used for inference:
numactl -m 0 -C 0-47 python benchmark.py -m /app/savedmodels/THUDM/chatglm2-6b/pytorch/dldt/compressed_weights/OV_FP32-INT8_ASYM/ -d CPU -n 3 -p "It is done, and submitted. You can play 'Survival of the Tastiest' on Android, and on the web. Playing on the web works, but you have to simulate multiple touch for table moving and that can be a bit confusing. There is a lot I'd like to talk about. I will go through every topic, insted of making the typical what went right/wrong list. Concept Working over the theme was probably one of the hardest tasks which I had to face. Originally, I had an idea of what kind of game I wanted to develop, gameplay wise - something with a lot of enemies/actors, simple graphics, maybe set in space, controlled from a top-down view. I was confident that I could fit any theme around it. In the end, the problem with a theme like 'Evolution' in a game is that evolution is unassisted. It happens through several seemingly random mutations over time, with the most apt permutation surviving. This genetic car simulator is, in my opinion, a great example of actual evolution of a species facing a challenge. But is it a game? In a game, you need to control something to reach an objective. That control goes against what evolution is" -r /app/output/chatglm2-6b-4BIT_MAXIMUM-16-256-256.1.csv -ic 256 -mc 2 -bs 16 --torch_compile_backend openvino --fuse_decoding_strategy -od /app/output --genai
[ INFO ] ==SUCCESS FOUND==: use_case: text_gen, model_type: chatglm2-6b
[ INFO ] OV Config={'CACHE_DIR': ''}
[ INFO ] OPENVINO_TORCH_BACKEND_DEVICE=CPU
[ INFO ] Model path=/app/savedmodels/THUDM/chatglm2-6b/pytorch/dldt/compressed_weights/OV_FP32-INT8_ASYM, openvino runtime version: 2024.4.0-16579-c3152d32c9c-releases/2024/4
[ INFO ] Pipeline initialization time: 0.98s
[ INFO ] Numbeams: 1, benchmarking iter nums(exclude warm-up): 3, prompt nums: 1, prompt idx: [0]
[ INFO ] [warm-up] Input text: It is done, and submitted. You can play 'Survival of the Tastiest' on Android, and on the web. Playing on the web works, but you have to simulate multiple touch for table moving and that can be a bit confusing. There is a lot I'd like to talk about. I will go through every topic, insted of making the typical what went right/wrong list. Concept Working over the theme was probably one of the hardest tasks which I had to face. Originally, I had an idea of what kind of game I wanted to develop, gameplay wise - something with a lot of enemies/actors, simple graphics, maybe set in space, controlled from a top-down view. I was confident that I could fit any theme around it. In the end, the problem with a theme like 'Evolution' in a game is that evolution is unassisted. It happens through several seemingly random mutations over time, with the most apt permutation surviving. This genetic car simulator is, in my opinion, a great example of actual evolution of a species facing a challenge. But is it a game? In a game, you need to control something to reach an objective. That control goes against what evolution is
[ INFO ] [warm-up] Batch_size=16, all input token size after padding: 256 * 16, all max_output_token_size: 256 * 16
[ ERROR ] An exception occurred
[ INFO ] Traceback (most recent call last):
File "/app/benchmark.py", line 856, in main
iter_data_list, pretrain_time = CASE_TO_BENCH[model_args['use_case']](model_path, framework, args.device, model_args, args.num_iters)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/benchmark.py", line 462, in run_text_generation_benchmark
text_gen_fn(input_text, num, model, tokenizer, args, iter_data_list, md5_list, prompt_idx_list[idx], bench_hook, model_precision, proc_id)
File "/app/benchmark.py", line 348, in run_text_generation_genai
generation_result = model.generate(input_text_list, max_new_tokens=max_gen_tokens, num_beams=args["num_beams"])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Exception from src/inference/src/cpp/infer_request.cpp:223:
Exception from src/plugins/intel_cpu/src/graph.cpp:1243:
Node __module.transformer.encoder.layers.0.self_attention.core_attention/aten::scaled_dot_product_attention/ScaledDotProductAttention of type ScaledDotProductAttentionWithKVCache
Check 'B == B_state' failed at src/plugins/intel_cpu/src/nodes/scaled_attn.cpp:1393:
beam idx batch: 14 is not equal to batch of state: 16
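To narrow down which core ranges trigger the failure, a small sweep script might help. The sketch below only builds and prints one command per core range, using the flags from the inference command above. The helper name build_command, the shortened PROMPT placeholder, and the per-range CSV file name are assumptions for illustration and are not part of benchmark.py; substitute the full prompt from the report for a faithful reproduction.

```python
import shlex

# Core ranges from the report: 0-23 passes; 0-31, 0-47 and 0-55 fail.
CORE_RANGES = ["0-23", "0-31", "0-47", "0-55"]

# Model directory as given in the report.
MODEL_DIR = (
    "/app/savedmodels/THUDM/chatglm2-6b/pytorch/dldt/"
    "compressed_weights/OV_FP32-INT8_ASYM/"
)

# Assumption: shortened placeholder; the report uses a much longer prompt.
PROMPT = "It is done, and submitted."


def build_command(cores, model_dir, prompt, output_dir="/app/output", batch_size=16):
    """Return the numactl + benchmark.py argument list for one core range.

    The flags mirror the command in the report; the CSV name is hypothetical.
    """
    return [
        "numactl", "-m", "0", "-C", cores,
        "python", "benchmark.py",
        "-m", model_dir,
        "-d", "CPU",
        "-n", "3",
        "-p", prompt,
        "-r", f"{output_dir}/sweep-cores-{cores}.csv",  # hypothetical file name
        "-ic", "256",
        "-mc", "2",
        "-bs", str(batch_size),
        "--torch_compile_backend", "openvino",
        "--fuse_decoding_strategy",
        "-od", output_dir,
        "--genai",
    ]


if __name__ == "__main__":
    for cores in CORE_RANGES:
        cmd = build_command(cores, MODEL_DIR, PROMPT)
        # Print a shell-safe command line; to execute it instead, pass cmd
        # to subprocess.run(cmd, check=False) and inspect the return code.
        print(shlex.join(cmd))
```

Running each printed command and recording which ones hit the 'B == B_state' check would show whether the failure depends on the core count alone. Passing a different batch_size to build_command would show whether it also depends on the batch size.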