Your daily, minimal build of llama.cpp. Also available on Docker Hub.
Source code: https://github.com/ggerganov/llama.cpp
Built from: https://github.com/EZForever/llama.cpp-static
Please refer to the llama.cpp docker guide and server README.
tl;dr: Use `server-ssl-avx2` if you don't know what you're doing.
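For example, a minimal quick start might look like the sketch below. The registry path `ezforever/llama.cpp-static` and the assumption that the image's entrypoint is the llama.cpp server binary are illustrative only; check this repository's tag list for the exact image name, and substitute your own model path.

```sh
# Pull the recommended build (image path is an assumption, see above)
docker pull ezforever/llama.cpp-static:server-ssl-avx2

# Run the server on port 8080 with a local GGUF model mounted into the container;
# everything after the image name is passed to the llama.cpp server.
docker run --rm -p 8080:8080 \
    -v "$PWD/models:/models" \
    ezforever/llama.cpp-static:server-ssl-avx2 \
    -m /models/model.gguf --host 0.0.0.0 --port 8080
```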
Server images are tagged in the format of `server-<ssl>-<avx>`.
`<ssl>` is one of the following:

- `nossl`: Minimal build with no SSL/TLS capability.
- `ssl`: Built with OpenSSL (`LLAMA_SERVER_SSL=ON`), thus supports `--ssl-key-file` and `--ssl-cert-file` (see the sketch after this list).
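As a sketch of how the `ssl` variant's flags might be used (same assumed image path as above; certificate file names and mount paths are placeholders), the key/cert pair is mounted into the container and the two flags are passed straight through to the server:

```sh
# Serve over HTTPS: mount a key/cert pair and point the llama.cpp server's
# --ssl-key-file / --ssl-cert-file options at the mounted paths.
docker run --rm -p 8443:8443 \
    -v "$PWD/models:/models" \
    -v "$PWD/certs:/certs:ro" \
    ezforever/llama.cpp-static:server-ssl-avx2 \
    -m /models/model.gguf --host 0.0.0.0 --port 8443 \
    --ssl-key-file /certs/server.key --ssl-cert-file /certs/server.crt
```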
`<avx>` is one of the following:

- `noavx`: All AVX-related optimizations are disabled. Do not use this build unless you are working around some known bug, or running LLMs on a 10-year-old potato.
- `avx`: Only the AVX instruction set is enabled. Might be useful if you are using an older CPU that doesn't support AVX2.
- `avx2`: AVX2 instruction set is enabled. This build should support most modern/recent CPUs with reasonable performance.
- `avx512`: AVX-512 base instruction set is enabled. Currently only some high-end or server-grade CPUs support this instruction set, so check your hardware specs before using this build (see the check sketched after this list).
- `oneapi`: Experimental build with the Intel oneAPI compiler, inspired by ggerganov/llama.cpp#5067. Offers a ~30% speed increase (~20 tok/s vs. ~15 tok/s) in prompt processing on my machine compared to `avx2` builds. Not updated daily.
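If you are not sure which `<avx>` variant your CPU supports, a quick Linux-only check (a sketch; on other platforms consult the CPU's spec sheet) is to look at the feature flags the kernel reports:

```sh
# List which of the relevant instruction sets this CPU advertises;
# pick the most capable matching build (avx512f -> avx512, avx2 -> avx2, avx -> avx).
grep -owE 'avx|avx2|avx512f' /proc/cpuinfo | sort -u
```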
RPC server images are tagged in the format of `rpc-server-<ssl>-<avx>`. Refer to the rpc README for detailed information.
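A hedged sketch of starting an RPC worker with one of these images (same assumed image path as above; the `--host`/`--port` flags and the entrypoint being the `rpc-server` binary follow the upstream rpc README and should be double-checked against `--help` for your build):

```sh
# Start an RPC backend worker listening on port 50052.
docker run --rm -p 50052:50052 \
    ezforever/llama.cpp-static:rpc-server-nossl-avx2 \
    --host 0.0.0.0 --port 50052

# An RPC-enabled llama.cpp build on another machine can then offload work to it,
# e.g. with: --rpc <worker-host>:50052
```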