
Compatibility with transformers > 4.36: error: AttributeError: 'tuple' object has no attribute 'to_legacy_cache' #137

Open
Dr-Left opened this issue Aug 21, 2024 · 2 comments

Comments


Dr-Left commented Aug 21, 2024

This is a known issue with the transformers library; see https://github.com/huggingface/transformers/issues/28003 and https://github.com/huggingface/transformers/issues/28045.

I used transformers==4.43.3 with tensor_parallel==2.0.0 and loaded the Llama-3.1-8B-Instruct model.
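
For reference, a minimal sketch of the setup (the model ID, device list, and dummy input here are illustrative, not my exact script):

```python
import torch
import tensor_parallel as tp
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # illustrative model ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16)

# Shard the model across the available GPUs with tensor_parallel
model = tp.tensor_parallel(model, ["cuda:0", "cuda:1"])

batch = tokenizer("test input", return_tensors="pt").input_ids.to("cuda:0")
with torch.no_grad():
    _ = model(batch).logits  # this forward pass raises the AttributeError
```

Running the forward pass then fails with: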

```
Traceback (most recent call last):
  File "/home/bc20/jingwei/topk/./exploration/eval/save_vectors.py", line 151, in <module>
    main(args)
  File "/home/bc20/jingwei/topk/./exploration/eval/save_vectors.py", line 120, in main
    evaluator.evaluate(model)
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/bc20/jingwei/topk/./exploration/eval/save_vectors.py", line 57, in evaluate
    _ = model(batch).logits
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/tensor_parallel/pretrained_model.py", line 76, in forward
    return self.wrapped_model(*args, **kwargs)
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/tensor_parallel/tensor_parallel.py", line 159, in forward
    return parallel_apply(self.module_shards, inputs, kwargs_tup, self.devices)[self.output_device_index]
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/torch/nn/parallel/parallel_apply.py", line 108, in parallel_apply
    output.reraise()
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/torch/_utils.py", line 722, in reraise
    raise exception
AttributeError: Caught AttributeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/torch/nn/parallel/parallel_apply.py", line 83, in _worker
    output = module(*input, **kwargs)
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 1141, in forward
    outputs = self.model(
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/bc20/anaconda3/envs/dejavu/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 971, in forward
    next_cache = next_cache.to_legacy_cache()
AttributeError: 'tuple' object has no attribute 'to_legacy_cache'
```

Are there any insights on how to work around this? I don't want to downgrade transformers to 4.35, because I want to use the newest Llama-3.1 model.


lybbill commented Oct 2, 2024

I'm hitting the same error. Can anyone help?


Dr-Left commented Oct 8, 2024

@lybbill I found a workaround and posted it in the other issue: huggingface/transformers#28003 (comment)
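
For anyone else hitting this: the crash happens when LlamaModel.forward tries to convert the KV cache back to the legacy tuple format via next_cache.to_legacy_cache(), and that conversion is only attempted when use_cache is enabled. So for plain forward passes (where no generation cache is needed), disabling the cache sidesteps the failure. A minimal sketch, not necessarily identical to the fix in the linked comment, and assuming the tensor_parallel wrapper forwards the config/kwarg to its shards:

```python
import torch

# Disable the KV cache so LlamaModel.forward never reaches
# next_cache.to_legacy_cache(), which is where the tuple blows up.
model.config.use_cache = False

with torch.no_grad():
    # use_cache=False can also be passed per call instead of via the config
    logits = model(batch, use_cache=False).logits
```

Note this is only suitable for single forward passes (e.g. evaluation of logits); generation with model.generate relies on the KV cache for speed.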
