
Bug: phi 3.5 mini produces garbage past 4096 context #9127

Closed
patw opened this issue Aug 22, 2024 · 5 comments
Labels: bug-unconfirmed, low severity, stale

Comments


patw commented Aug 22, 2024

What happened?

Phi 3.5 mini doesn't produce <|end|> or <|endoftext|> when the context is set higher than 4096; it just emits endless garbage tokens. Possibly a RoPE scaling issue?
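
For anyone reproducing this, a minimal command sketch, assuming a local 4-bit GGUF (the model filename and the 0.5 scale factor are hypothetical, purely illustrative values). -c sets the context size, and --rope-scaling / --rope-freq-scale override the RoPE parameters suspected above:

```sh
# Reproduce: garbage output reportedly starts once -c exceeds 4096.
./llama-server -m ./Phi-3.5-mini-instruct-Q4_K_M.gguf -c 8192

# Probe the RoPE hypothesis: pin the scaling method and factor explicitly
# (these values are guesses for testing, not a known fix).
./llama-server -m ./Phi-3.5-mini-instruct-Q4_K_M.gguf -c 8192 \
  --rope-scaling linear --rope-freq-scale 0.5
```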

Name and Version

llama-server, recent compile

What operating system are you seeing the problem on?

No response

Relevant log output

No response

patw added the bug-unconfirmed and low severity labels Aug 22, 2024

themanyone commented

For conversation, the server works fine with phi-3.5 quantized to 4 bits. But after a while it started outputting tons of blank lines and garbage when told to make a simple HTML page. Hitting the [Reset] button on the chat server's Gradio page at localhost:8080 fixed it for now. It makes great web pages.

My only guess is that unusual prompt formats from using other models somehow corrupted the chat history, but I have no way to inspect the (now cleared) history to check. Will keep testing!

bartowski1182 (Contributor) commented

Are you using flash attention or not? I've seen that without flash attention the output is garbage, but with it, the output is coherent.
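
For reference, a sketch of the toggle in question (model path hypothetical); llama-server enables flash attention with the -fa flag and leaves it off otherwise:

```sh
# Without flash attention (the default):
./llama-server -m ./Phi-3.5-mini-instruct-Q4_K_M.gguf -c 8192

# With flash attention enabled:
./llama-server -m ./Phi-3.5-mini-instruct-Q4_K_M.gguf -c 8192 -fa
```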

patw (Author) commented Aug 26, 2024

I found that with -fa turned on it ran super slowly but still output garbage. Right now it's off and stable at 4096.

ThiloteE commented

To do: test whether this is fixed by #9396.

github-actions bot added the stale label Oct 11, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
