diff --git a/README.md b/README.md
new file mode 100644
index 0000000..374ea03
--- /dev/null
+++ b/README.md
@@ -0,0 +1,56 @@
+# llama.node
+
+Node binding of [llama.cpp](https://github.com/ggerganov/llama.cpp).
+
+[llama.cpp](https://github.com/ggerganov/llama.cpp): Inference of [LLaMA](https://arxiv.org/abs/2302.13971) model in pure C/C++
+
+## Installation
+
+```sh
+npm install llama.node
+```
+
+## Usage
+
+```js
+import { loadModel } from 'llama.node'
+
+// Initialize a Llama context with the model (may take a while)
+const context = loadModel({
+  model: 'path/to/gguf/model',
+  use_mlock: true,
+  n_ctx: 2048,
+  n_gpu_layers: 1, // > 0: enable Metal on macOS
+  // embedding: true, // use embedding
+})
+
+// Do completion
+const { text, timings } = await context.completion(
+  {
+    prompt: 'This is a conversation between user and llama, a friendly chatbot. respond in simple markdown.\n\nUser: Hello!\nLlama:',
+    n_predict: 100,
+    stop: ['</s>', 'Llama:', 'User:'],
+    // n_threads: 4,
+  },
+  (data) => {
+    // This is a partial completion callback
+    const { token } = data
+  },
)
+console.log('Result:', text)
+```
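+
+The partial completion callback fires once per generated token, so you can stream output as it is produced. A minimal sketch using only the `completion()` API shown above (the `streamed` variable is just illustrative):
+
+```js
+let streamed = ''
+
+const { text } = await context.completion(
+  {
+    prompt: 'User: Hello!\nLlama:',
+    n_predict: 100,
+    stop: ['</s>', 'Llama:', 'User:'],
+  },
+  (data) => {
+    // Append each generated token, e.g. to update a live UI
+    streamed += data.token
+  },
+)
+
+// `text` holds the final result; `streamed` contains the same tokens
+// as they arrived (stop sequences may be trimmed from `text`).
+```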

+## License
+
+MIT
+
+---
+
+Built and maintained by BRICKS.