Inference got error "probability tensor contains either inf, nan or element < 0" while using beams=2 with temperature=0.1 #5

Open
zetavg opened this issue Apr 17, 2023 · 2 comments

@zetavg
Owner

zetavg commented Apr 17, 2023

Using decapoda-research/llama-7b-hf with beams = 2 and temperature = 0.1, inference fails with the error in the title.

Note: the error is not shown when Stream Output is enabled. In that case inference just outputs nothing.
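
For anyone debugging the silent failure with Stream Output: a worker-thread exception never reaches the UI unless it is forwarded explicitly. Below is a minimal sketch of one way to forward it, assuming a callback-style generation function. `stream_tokens` and `generate_fn` are hypothetical names for illustration, not the code in `streaming_generation_utils.py`.

```python
import queue
import threading

_DONE = object()  # sentinel marking normal completion of the worker


def stream_tokens(generate_fn):
    """Yield items that generate_fn(callback) produces on a worker thread.

    Any exception raised inside the worker is re-raised in the consuming
    thread instead of being lost when the worker exits.
    """
    q = queue.Queue()

    def worker():
        try:
            generate_fn(q.put)   # generate_fn calls the callback once per item
            q.put(_DONE)
        except Exception as exc:  # forward the failure to the consumer
            q.put(exc)

    threading.Thread(target=worker, daemon=True).start()

    while True:
        item = q.get()
        if item is _DONE:
            return
        if isinstance(item, Exception):
            raise item
        yield item
```

With this pattern the consumer sees the same `RuntimeError` that non-streaming mode reports, so the UI can display it rather than an empty output.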

beams = 2 with temperature = 0.4 also triggers this error, but beams = 2 with temperature = 0.5 does not.
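
One plausible explanation for the temperature threshold, not confirmed in this thread, is half-precision overflow: dividing scores by a small temperature can push them past the float16 range (about ±65504), and a softmax over a row that is entirely `-inf` yields `nan`. The sketch below emulates float16 rounding in pure Python to show the effect. The score values are made up for illustration and are not taken from the model.

```python
import math
import struct


def to_fp16(x):
    """Round a Python float to the nearest float16 value, overflowing to +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def softmax_fp16(scores, temperature):
    """Divide scores by temperature in emulated float16, then apply softmax."""
    scaled = [to_fp16(s / temperature) for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]  # -inf - (-inf) gives nan
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical accumulated beam scores, chosen only to illustrate the effect.
scores = [-7000.0, -7100.0, -9000.0]

print(softmax_fp16(scores, 0.5))  # finite values that sum to 1
print(softmax_fp16(scores, 0.1))  # every entry is nan
```

If this is what happens here, running generation in float32 or using a higher temperature would avoid the `nan` values, which matches the observation that temperature = 0.5 works.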

(unhelpful-ai-v01-3)

@l0rinc
Contributor

l0rinc commented Apr 21, 2023

Same on the trained LoRA model:

Traceback (most recent call last):
  File "/content/llama_lora/llama_lora/lib/streaming_generation_utils.py", line 47, in gentask
    ret = self.mfunc(callback=_callback, **self.kwargs)
  File "/content/llama_lora/llama_lora/lib/inference.py", line 59, in generate_with_callback
    generation_output = model.generate(**kwargs)
  File "/usr/local/lib/python3.9/dist-packages/peft/peft_model.py", line 631, in generate
    outputs = self.base_model.generate(**kwargs)
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py", line 1562, in generate
    return self.beam_sample(
  File "/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py", line 3187, in beam_sample
    next_tokens = torch.multinomial(probs, num_samples=2 * num_beams)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1020, in postprocess_data
    if predictions[i] is components._Keywords.FINISHED_ITERATING:
IndexError: tuple index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gradio/routes.py", line 393, in run_predict
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1111, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1024, in postprocess_data
    raise ValueError(
ValueError: Number of output components does not match number of values returned from from function do_inference

@zetavg
Owner Author

zetavg commented Apr 25, 2023

@paplorinc I think the Gradio error at the end of your traceback is a separate issue: at some point I forgot to update a return value in the do_inference function, which caused Gradio to error in some cases. I noticed this a few days ago and fixed it on the main branch. Let me know if it still happens for you!
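
To illustrate the kind of mismatch behind that Gradio `ValueError`: a handler wired to N output components has to return N values on every code path, including the error path. Here is a minimal sketch, assuming three outputs (text, token IDs, error message). `run_model` and this shape of `do_inference` are hypothetical stand-ins, not the actual code on the main branch.

```python
def run_model(prompt):
    """Hypothetical stand-in for the real generation call."""
    if not prompt:
        raise RuntimeError(
            "probability tensor contains either `inf`, `nan` or element < 0"
        )
    return prompt.upper(), [1, 2, 3]


def do_inference(prompt):
    """Return (text, token_ids, error_message) on every path.

    Gradio expects one returned value per output component, so the
    error branch must have the same arity as the success branch.
    """
    try:
        text, token_ids = run_model(prompt)
        return text, token_ids, ""
    except RuntimeError as exc:
        # Same number of values as the success path, with empty placeholders.
        return "", [], str(exc)
```

Returning placeholders on failure keeps the number of values constant, so the original sampling error can be shown in the UI instead of being masked by the component-count error.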
