Hi, I'm trying to use GPTQ or GGUF quantized models with lightllm, but I'm a bit confused. Should I provide --mode int8weight as an argument?

Replies: 1 comment
-
In particular, I'm curious how I might run https://huggingface.co/TheBloke/Spicyboros-70B-2.2-AWQ?not-for-all-audiences=true. Or is the expectation that, instead of loading the pre-quantized AWQ model, lightllm will quantize the model on the fly when --mode is provided? I think the docs could be clearer!
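For reference, this is the kind of launch command I've been experimenting with. It is only a sketch based on my reading of the lightllm README: the model path and --tp value are placeholders, and I'm assuming triton_int8weight is the mode name for on-the-fly weight-only int8 quantization (check python -m lightllm.server.api_server --help for the actual values).

```bash
# Hypothetical launch; flag names follow the lightllm README,
# but the --mode value and tensor-parallel size are my guesses.
python -m lightllm.server.api_server \
    --model_dir /path/to/Spicyboros-70B-2.2-AWQ \
    --host 0.0.0.0 \
    --port 8080 \
    --tp 2 \
    --mode triton_int8weight
```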