Error when running AudioWhisper inference #26

Open · setianke opened this issue Aug 12, 2024 · 1 comment

@setianke commented Aug 12, 2024
I fine-tuned and trained an AudioWhisper model, but I get the following error when calling the `/generate` endpoint.

```
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
ERROR:serve_model:Exception on /generate [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/flask/app.py", line 1473, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.10/dist-packages/flask/app.py", line 882, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.10/dist-packages/flask/app.py", line 880, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.10/dist-packages/flask/app.py", line 865, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
  File "/workspace/multi_token/scripts/serve_model.py", line 46, in generate
    output_ids = model.generate(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2024, in generate
    result = self._sample(
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2982, in _sample
    outputs = self(**model_inputs, return_dict=True)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/multi_token/multi_token/language_models/mistral.py", line 85, in forward
    ) = self.prepare_inputs_labels_for_multimodal(
  File "/workspace/multi_token/multi_token/language_models/base_model.py", line 101, in prepare_inputs_labels_for_multimodal
    past_key_values is None
AssertionError: We shouldn't have cached keys if this is the first instruction pass
```
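Possibly related: newer transformers releases appear to initialize `past_key_values` as an empty `DynamicCache` rather than `None` before the first decoding step, which would trip this assertion even though nothing has actually been cached yet. A minimal sketch of what I mean (the version behavior and the relaxed guard are my assumptions, not the repo's code):

```python
from transformers.cache_utils import DynamicCache

# On recent transformers, generate() can hand the model an empty cache object
# on the very first forward pass instead of None.
past_key_values = DynamicCache()
print(past_key_values is None)           # False -> `assert past_key_values is None` fires
print(past_key_values.get_seq_length())  # 0     -> but no keys are cached yet

# A hypothetical, version-tolerant form of the guard in base_model.py:
is_first_pass = past_key_values is None or past_key_values.get_seq_length() == 0
assert is_first_pass, "We shouldn't have cached keys if this is the first instruction pass"
```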

I am using:

```bash
python scripts/serve_model.py \
    --model_name_or_path mistralai/Mistral-7B-Instruct-v0.1 \
    --model_lora_path sshh12/Mistral-7B-LoRA-AudioWhisper \
    --port 7860
```

I have also tried setting the model name to my own pretrained checkpoint, with no change in the error.

@sshh12 (Owner) commented Aug 19, 2024

Hey! I'm not sure what's happening here. Did it work with one of the already-trained models from the model zoo in the README?

Potentially try `use_cache=False` in the serving script.
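In `scripts/serve_model.py` that would look roughly like this (a sketch only: everything except `use_cache` and `attention_mask` is a placeholder, since the exact call may differ):

```python
# Sketch: disable the KV cache so past_key_values never carries state into
# prepare_inputs_labels_for_multimodal; decoding is slower but avoids the assert.
# Passing attention_mask explicitly should also silence the warnings in the log.
output_ids = model.generate(
    input_ids=input_ids,
    attention_mask=attention_mask,
    max_new_tokens=128,  # placeholder value
    use_cache=False,
)
```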
