gguf: update README #663

Merged 2 commits on May 13, 2024
40 changes: 40 additions & 0 deletions packages/gguf/README.md
```bash
npm install @huggingface/gguf
```

## Usage

### Basic usage

```ts
import { GGMLQuantizationType, gguf } from "@huggingface/gguf";

// Parse metadata and tensor info from a remote GGUF file
const { metadata, tensorInfos } = await gguf(URL_MODEL);

console.log(metadata);
console.log(tensorInfos);
```
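Once parsed, `tensorInfos` can be used to derive simple statistics. The sketch below sums parameter counts from tensor shapes; the two entries are made-up examples rather than real parser output, and shapes are shown as `bigint[]` arrays:

```ts
// Minimal sketch: total parameter count from tensor shapes.
// The entries below are hypothetical, not the output of a real model.
type TensorInfoLike = { name: string; shape: bigint[] };

const exampleTensorInfos: TensorInfoLike[] = [
  { name: "token_embd.weight", shape: [4096n, 32000n] },
  { name: "output_norm.weight", shape: [4096n] },
];

// Multiply each tensor's dimensions, then sum across tensors
const totalParams = exampleTensorInfos.reduce(
  (sum, t) => sum + t.shape.reduce((product, dim) => product * dim, 1n),
  0n,
);

console.log(totalParams); // 131076096n
```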

### Reading a local file

```ts
// Reading a local file (not supported in the browser)
const { metadata, tensorInfos } = await gguf(
'./my_model.gguf',
{ allowLocalFile: true },
);
```

### Strictly typed

By default, known fields in `metadata` are typed. This includes various fields found in [llama.cpp](https://github.com/ggerganov/llama.cpp), [whisper.cpp](https://github.com/ggerganov/whisper.cpp) and [ggml](https://github.com/ggerganov/ggml).

```ts
const { metadata, tensorInfos } = await gguf(URL_MODEL);

// Type check for the model architecture at runtime
if (metadata["general.architecture"] === "llama") {

  // "llama.attention.head_count" is a valid key for the llama architecture
  console.log(metadata["llama.attention.head_count"]);

  // "mamba.ssm.conv_kernel" is an invalid key, because it requires the model architecture to be mamba
  console.log(metadata["mamba.ssm.conv_kernel"]); // error
}
```
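The architecture check works because strict typing models `metadata` as a discriminated union keyed on `"general.architecture"`. Here is a self-contained sketch of the same pattern; these minimal types are illustrative only, not the library's actual definitions:

```ts
// Illustrative types only; the real library's metadata types are far richer.
type LlamaMetadata = {
  "general.architecture": "llama";
  "llama.attention.head_count": number;
};
type MambaMetadata = {
  "general.architecture": "mamba";
  "mamba.ssm.conv_kernel": number;
};
type Metadata = LlamaMetadata | MambaMetadata;

function headCount(m: Metadata): number | undefined {
  // Checking the discriminant narrows `m` to LlamaMetadata in this branch
  return m["general.architecture"] === "llama"
    ? m["llama.attention.head_count"]
    : undefined;
}

console.log(headCount({ "general.architecture": "llama", "llama.attention.head_count": 32 })); // 32
```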

### Disabling strict typing

Because the GGUF format can store arbitrary tensors, it can technically be used for other purposes, for example storing [control vectors](https://github.com/ggerganov/llama.cpp/pull/5970) or [lora weights](https://github.com/ggerganov/llama.cpp/pull/2632).

In case you want to use your own GGUF metadata structure, you can disable strict typing by casting the parse output to `GGUFParseOutput<{ strict: false }>`:

```ts
const { metadata, tensorInfos }: GGUFParseOutput<{ strict: false }> = await gguf(URL_LLAMA);
```
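With strict typing disabled, `metadata` behaves like an open key-value record, so application-specific keys can be read without compiler errors. A minimal sketch of that idea, using a hypothetical custom key (the `LooseMetadata` type and the `my_app.*` key are illustrative, not part of the library):

```ts
// `LooseMetadata` stands in for the non-strict metadata shape;
// the "my_app.*" key below is hypothetical.
type LooseMetadata = Record<string, unknown>;

const looseMetadata: LooseMetadata = {
  "general.architecture": "custom-arch",
  "my_app.control_vector.layer_count": 24,
};

// Any string key is accepted; values come back as `unknown`
const layerCount = looseMetadata["my_app.control_vector.layer_count"];
console.log(layerCount); // 24
```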

## Hugging Face Hub

The Hub supports all file formats and has built-in features for the GGUF format.