fix lm_eval issue of llama #1606

Open
wants to merge 4 commits into main
Conversation

sywangyi
Collaborator

This fixes the following command:

QUANT_CONFIG=./quantization_config/maxabs_measure.json python run_lm_eval.py -o acc_llama-2_bs1_measure.txt  --model_name_or_path meta-llama/Llama-2-7b-hf --use_hpu_graphs --use_kv_cache --max_new_tokens 100 --batch_size 1 --trim_logits --reuse_cache --bf16

which currently crashes with a traceback like:

Traceback (most recent call last):
  File "/workspace/wangyi/optimum-habana/examples/text-generation/run_lm_eval.py", line 239, in <module>
    main()
  File "/workspace/wangyi/optimum-habana/examples/text-generation/run_lm_eval.py", line 209, in main
    lm = HabanaModelAdapter(tokenizer, model, args, generation_config)
  File "/workspace/wangyi/optimum-habana/examples/text-generation/run_lm_eval.py", line 136, in __init__
    self.warm_up()
  File "/workspace/wangyi/optimum-habana/examples/text-generation/run_lm_eval.py", line 141, in warm_up
    self._model_call(inps)
  File "/workspace/wangyi/optimum-habana/examples/text-generation/run_lm_eval.py", line 187, in _model_call
    logits = self.model(inps.to(self._device), **self.model_inputs)["logits"].cpu()
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1556, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1565, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/hpu/graphs.py", line 726, in forward
    return wrapped_hpugraph_forward(
  File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/hpu/graphs.py", line 599, in wrapped_hpugraph_forward
    outputs = orig_fwd(*args, **kwargs)
  File "/workspace/wangyi/optimum-habana/optimum/habana/transformers/models/llama/modeling_llama.py", line 1341, in forward
    outputs = self.model(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1556, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1606, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/workspace/wangyi/optimum-habana/optimum/habana/transformers/models/llama/modeling_llama.py", line 1231, in forward
    layer_outputs = decoder_layer(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1556, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1606, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/workspace/wangyi/optimum-habana/optimum/habana/transformers/models/llama/modeling_llama.py", line 920, in forward
    hidden_states, self_attn_weights, present_key_value = self.pre_attn(
  File "/workspace/wangyi/optimum-habana/optimum/habana/transformers/models/llama/modeling_llama.py", line 977, in pre_attn
    hidden_states, attn_weights, present_key_value = self.self_attn.pre_attn_forward(
  File "/workspace/wangyi/optimum-habana/optimum/habana/transformers/models/llama/modeling_llama.py", line 719, in pre_attn_forward
    attn_weights = attn_weights + causal_mask
RuntimeError: Incompatible input shapes, broadcast not possible. Tensor1 Size: 769 384 32 1 Tensor2 Size: 384 384 1 1
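For reference, here is a minimal CPU-side sketch of the broadcast failure reported above. The tensor layout and concrete sizes are assumptions read off the size dump in the error (the HPU bridge appears to print sizes innermost-first); they are illustrative only, not taken from the Habana code.

import torch

# Assumed shapes: attention scores span the padded kv length (769), while the
# causal mask was built for the current bucket length (384). The trailing
# dimensions (769 vs. 384) differ and neither is 1, so broadcasting fails.
attn_weights = torch.zeros(1, 32, 384, 769)  # assumed (batch, heads, q_len, kv_len)
causal_mask = torch.zeros(1, 1, 384, 384)    # assumed (batch, 1, q_len, kv_len)

try:
    attn_weights + causal_mask
except RuntimeError as e:
    print(e)  # size mismatch at the last dimension, mirroring the HPU error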

Signed-off-by: Wang, Yi A <[email protected]>

The code quality check failed, please run make style.

Signed-off-by: Wang, Yi A <[email protected]>
@sywangyi
Collaborator Author

@jiminha please help review.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.


The code quality check failed, please run make style.

Signed-off-by: Wang, Yi A <[email protected]>