From c8a9e8887b1cef1d4d0c4e58c613eb911f6da57d Mon Sep 17 00:00:00 2001
From: Eric Buehler
Date: Wed, 2 Oct 2024 05:28:30 -0400
Subject: [PATCH] Add UQFF quant for Mistral Nemo 2407

---
 README.md    | 2 +-
 docs/UQFF.md | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 3b3d34ded..3b164840e 100644
--- a/README.md
+++ b/README.md
@@ -105,7 +105,7 @@ Mistal.rs supports several model categories:
 - [PagedAttention](docs/PAGED_ATTENTION.md) and continuous batching
 - Prefix caching
 - [Topology](docs/TOPOLOGY.md): Configure ISQ and device mapping easily
-- [UQFF](docs/UQFF.md): The uniquely powerful quantized file format
+- [UQFF](docs/UQFF.md): Quantized file format for easy mixing of quants; see some [models](docs/UQFF.md#list-of-models) which have already been converted.
 - Speculative Decoding: Mix supported models as the draft model or the target model
 - Dynamic LoRA adapter activation with adapter preloading: [examples and docs](docs/ADAPTER_MODELS.md#adapter-model-dynamic-adapter-activation)
diff --git a/docs/UQFF.md b/docs/UQFF.md
index aaeb6bf0c..7dfa4a30b 100644
--- a/docs/UQFF.md
+++ b/docs/UQFF.md
@@ -176,3 +176,4 @@ Have you created a UQFF model on Hugging Face? If so, please [create an issue](h
 | -- | -- | -- |
 | Phi 3.5 Mini Instruct | microsoft/Phi-3.5-mini-instruct | [EricB/Phi-3.5-mini-instruct-UQFF](EricB/Phi-3.5-mini-instruct-UQFF) |
 | Llama 3.2 Vision | meta-llama/Llama-3.2-11B-Vision-Instruct | [EricB/Llama-3.2-11B-Vision-Instruct-UQFF](https://huggingface.co/EricB/Llama-3.2-11B-Vision-Instruct-UQFF) |
+| Mistral Nemo 2407 | mistralai/Mistral-Nemo-Instruct-2407 | [EricB/Mistral-Nemo-Instruct-2407-UQFF](https://huggingface.co/EricB/Mistral-Nemo-Instruct-2407-UQFF) |
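
For context, entries in the UQFF model table are consumed by passing the Hugging Face repository as the model ID and pointing mistral.rs at one of the `.uqff` files it contains. Below is a minimal sketch of loading the newly added Mistral Nemo entry, following the loading pattern documented in docs/UQFF.md; the exact `.uqff` filename is an assumption for illustration and should be checked against the files actually published in EricB/Mistral-Nemo-Instruct-2407-UQFF.

```
# Interactive mode (-i) with the plain (text) model loader; --from-uqff selects
# the prequantized UQFF file within the repository.
# NOTE: the filename below is illustrative; substitute one of the .uqff files
# listed in the EricB/Mistral-Nemo-Instruct-2407-UQFF repository.
./mistralrs-server -i plain -m EricB/Mistral-Nemo-Instruct-2407-UQFF --from-uqff mistral-nemo-instruct-2407-q4k.uqff
```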