fix npu llama2 (intel-analytics#11471)
MeouSker77 committed Jul 19, 2024
1 parent b3a7fe3 commit 10fd10d
Showing 2 changed files with 3 additions and 1 deletion.
@@ -21,6 +21,8 @@ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-exte
 
 # below command will install intel_npu_acceleration_library
 pip install intel-npu-acceleration-library==1.3
+
+pip install transformers==4.40
 ```
 
 ### 2. Runtime Configurations
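The README hunk above pins `transformers==4.40` alongside the NPU acceleration library. A small stdlib-only sketch for sanity-checking the pins after installation (the package names are taken from the commands above; `installed_version` is a hypothetical helper, not part of ipex-llm):

```python
# Check which versions of the pinned packages are present, using only
# the standard library (importlib.metadata).
from importlib.metadata import version, PackageNotFoundError

def installed_version(pkg):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return None

for pkg in ("transformers", "intel-npu-acceleration-library"):
    print(pkg, installed_version(pkg) or "not installed")
```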
2 changes: 1 addition & 1 deletion python/llm/src/ipex_llm/transformers/npu_models/llama.py
@@ -106,7 +106,7 @@ def llama_model_forward(
     from ipex_llm.transformers.kv import DynamicNormalCache
     if use_cache and not isinstance(past_key_values, DynamicNormalCache):
         past_key_values = DynamicNormalCache.from_legacy_cache(past_key_values)
-    past_seen_tokens = past_key_values.set_seq_length()
+    past_seen_tokens = past_key_values.get_seq_length()
 
     if cache_position is None:
         cache_position = torch.arange(past_seen_tokens, past_seen_tokens + inputs_embeds.shape[1],
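The one-line fix replaces `set_seq_length()` (which does not exist on transformers-style cache objects) with `get_seq_length()`, whose return value seeds the `cache_position` range. A minimal dependency-free sketch of that semantics (`ToyDynamicCache` is a hypothetical stand-in for illustration, not ipex-llm's actual `DynamicNormalCache`; lists stand in for key tensors):

```python
# Toy cache illustrating the transformers-style cache API the fix relies on:
# get_seq_length() reports how many tokens are already cached, and the model
# uses that count as the starting offset when building cache_position.

class ToyDynamicCache:
    """Hypothetical stand-in for a per-layer KV cache."""
    def __init__(self):
        self.key_cache = []  # one list of cached "key tensors" per layer

    def update(self, layer_idx, new_keys):
        # Append newly computed key states for a layer.
        while len(self.key_cache) <= layer_idx:
            self.key_cache.append([])
        self.key_cache[layer_idx].extend(new_keys)

    def get_seq_length(self, layer_idx=0):
        # Number of tokens already seen for this layer (0 if layer unseen).
        if len(self.key_cache) <= layer_idx:
            return 0
        return len(self.key_cache[layer_idx])

cache = ToyDynamicCache()
cache.update(0, ["t0", "t1", "t2", "t3"])   # simulate 4 past tokens
past_seen_tokens = cache.get_seq_length()   # 4
new_tokens = 3                              # inputs_embeds.shape[1] in the diff
cache_position = list(range(past_seen_tokens, past_seen_tokens + new_tokens))
print(cache_position)  # [4, 5, 6]
```

This mirrors the patched lines: `past_seen_tokens = past_key_values.get_seq_length()` followed by `torch.arange(past_seen_tokens, past_seen_tokens + inputs_embeds.shape[1], ...)`.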
