A gRPC server for llama.cpp, a port of Facebook's LLaMA model in C/C++

kherud/grpc-llama.cpp

The main goal of llama.cpp is to run the LLaMA model using 4-bit integer quantization on a MacBook. This repository provides a gRPC server for the library and proto files for client generation.

Installation

mkdir -p build
cd build
cmake ..  # pass extra build flags here, e.g. -DLLAMA_CUBLAS=ON for CUDA support
cmake --build . --config Release
cd ..

Run

./grpc-server --help
./grpc-server --host 0.0.0.0 --port 50051 -m /path/to/model.gguf
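Once the server is running, clients can be generated from the repository's proto files in any gRPC-capable language. Below is a minimal Python sketch of how a client might connect to the address used above; the helper names (`target`, `make_channel`) are illustrative, and the actual service and method stubs must come from the proto files, which are not shown here.

```python
# Minimal client-connection sketch (helper names are hypothetical).
# RPC stubs must be generated from this repository's proto files,
# e.g. with grpcio-tools; only channel setup is shown here.

def target(host: str, port: int) -> str:
    """Build the "host:port" target string the server listens on."""
    return f"{host}:{port}"

def make_channel(host: str = "0.0.0.0", port: int = 50051):
    """Open an insecure channel to the server (creation is lazy; no
    connection is attempted until the first RPC). Requires grpcio.
    """
    import grpc  # imported here so the pure helper above has no dependency
    return grpc.insecure_channel(target(host, port))

print(target("0.0.0.0", 50051))  # → 0.0.0.0:50051
```

Generated stubs would then wrap this channel, e.g. `stub = SomeGeneratedStub(make_channel())`, where the stub class name depends on the service defined in the proto files.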
