llama 3 support #155
Llama 3 models have been uploaded: https://www.reddit.com/r/LocalLLaMA/comments/1c78wqk/4bit_prequantized_llama3_8b_bitsandbytes_uploaded/ Another repo integrated the model using an updated template:
Thanks for the update. To get it to run on Mac, I put the equivalent code from run.sh into run-mac.sh:
and I made sure the paths in ui/types/openai.ts matched the paths in the docker*.yml file (some start with './', some don't). The model downloads and starts OK, but the replies I get contain raw tokens and the AI replies to itself. The following is what I got after typing Hello:
It's as if the text input isn't terminated with the tag the model expects, so it goes into a loop: it replies, feeds its own reply back in as a query, and prints a long conversation. According to this site, the chat models need a specific format: I was using the 7b code model before and that one works fine.
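For reference, that symptom (the model talking to itself) is what typically happens when raw text is sent to a chat-tuned model without its template and stop token. A rough sketch of the Llama 3 instruct format, as published by Meta, is below; this is illustrative only, not this project's actual prompt code, and `format_llama3_prompt` is a hypothetical helper name:

```python
# Illustrative sketch of the Llama 3 *instruct* chat template (assumption:
# the model being run is a chat/instruct variant, not a base model).
# Each turn is wrapped in role headers and terminated with <|eot_id|>;
# generation should also be configured to stop on <|eot_id|>, otherwise
# the model keeps going and "replies to itself".

def format_llama3_prompt(messages):
    """Render a list of {"role", "content"} dicts into the Llama 3 chat format."""
    prompt = "<|begin_of_text|>"
    for msg in messages:
        prompt += (
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Leave the assistant header open so the model writes the next reply.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

print(format_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
]))
```

The older 7b code model working fine would be consistent with this: different model families use different templates, so a prompt format hard-coded for one breaks the other.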
Any news on the PR? :)
The following app works well and doesn't need manual Docker setup; it seems to use less memory overall and makes it really easy to install different models:
Hello everyone!
Just wanted to create an issue so there is something to subscribe to. Hopefully there will be enough interest for it to be implemented.
Thank you for the project!