From 562e1f28a2e5c2a19534afb3796f20a40a648620 Mon Sep 17 00:00:00 2001
From: Artur Paniukov
Date: Tue, 15 Oct 2024 11:58:56 +0400
Subject: [PATCH 1/3] Del Tokenizer Conversion Step

---
 demos/embeddings/README.md | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/demos/embeddings/README.md b/demos/embeddings/README.md
index 153f035722..91d79333f0 100644
--- a/demos/embeddings/README.md
+++ b/demos/embeddings/README.md
@@ -29,10 +29,15 @@ pip3 install optimum-intel@git+https://github.com/huggingface/optimum-intel.git
 Run optimum-cli to download and quantize the model:
 ```bash
 cd demos/embeddings
-convert_tokenizer -o models/gte-large-en-v1.5-tokenizer/1 Alibaba-NLP/gte-large-en-v1.5
-optimum-cli export openvino --disable-convert-tokenizer --model Alibaba-NLP/gte-large-en-v1.5 --task feature-extraction --weight-format int8 --trust-remote-code --library sentence_transformers models/gte-large-en-v1.5-embeddings/1
-rm models/gte-large-en-v1.5-embeddings/1/*.json models/gte-large-en-v1.5-embeddings/1/vocab.txt
+optimum-cli export openvino --model Alibaba-NLP/gte-large-en-v1.5 --task feature-extraction --weight-format int8 --trust-remote-code --library sentence_transformers models/gte-large-en-v1.5-embeddings/1
+rm models/gte-large-en-v1.5-embeddings/1/*.json models/gte-large-en-v1.5-embeddings/1/vocab.txt
 ```
+Move the tokenizer to a separate folder to create an embedding pipeline:
+```bash
+mkdir -p models/gte-large-en-v1.5-tokenizer/1
+mv models/gte-large-en-v1.5-embeddings/*tokenizer.* -t models/gte-large-en-v1.5-tokenizer/1
+```
+
 > **Note** Change the `--weight-format` to quantize the model to `fp16`, `int8` or `int4` precision to reduce memory consumption and improve performance.
 
 You should have a model folder like below:

From a66f1162ba09bade6aee27b4802e981845316279 Mon Sep 17 00:00:00 2001
From: Artur Paniukov
Date: Tue, 15 Oct 2024 12:01:16 +0400
Subject: [PATCH 2/3] Fix dir structure

---
 demos/embeddings/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/demos/embeddings/README.md b/demos/embeddings/README.md
index 91d79333f0..c6d41a717f 100644
--- a/demos/embeddings/README.md
+++ b/demos/embeddings/README.md
@@ -35,7 +35,7 @@ rm models/gte-large-en-v1.5-embeddings/1/*.json models/gte-large-en-v1.5-embeddi
 Move the tokenizer to a separate folder to create an embedding pipeline:
 ```bash
 mkdir -p models/gte-large-en-v1.5-tokenizer/1
-mv models/gte-large-en-v1.5-embeddings/*tokenizer.* -t models/gte-large-en-v1.5-tokenizer/1
+mv models/gte-large-en-v1.5-embeddings/1/*_tokenizer.* -t models/gte-large-en-v1.5-tokenizer/1
 ```
 
 > **Note** Change the `--weight-format` to quantize the model to `fp16`, `int8` or `int4` precision to reduce memory consumption and improve performance.

From 66de6857fab2666b5a6733e23141d01413da7d9f Mon Sep 17 00:00:00 2001
From: Artur Paniukov
Date: Tue, 15 Oct 2024 12:04:35 +0400
Subject: [PATCH 3/3] Move Note

---
 demos/embeddings/README.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/demos/embeddings/README.md b/demos/embeddings/README.md
index c6d41a717f..c4d0f5a6fe 100644
--- a/demos/embeddings/README.md
+++ b/demos/embeddings/README.md
@@ -32,14 +32,15 @@ cd demos/embeddings
 optimum-cli export openvino --model Alibaba-NLP/gte-large-en-v1.5 --task feature-extraction --weight-format int8 --trust-remote-code --library sentence_transformers models/gte-large-en-v1.5-embeddings/1
 rm models/gte-large-en-v1.5-embeddings/1/*.json models/gte-large-en-v1.5-embeddings/1/vocab.txt
 ```
+
+> **Note** Change the `--weight-format` to quantize the model to `fp16`, `int8` or `int4` precision to reduce memory consumption and improve performance.
+
 Move the tokenizer to a separate folder to create an embedding pipeline:
 ```bash
 mkdir -p models/gte-large-en-v1.5-tokenizer/1
 mv models/gte-large-en-v1.5-embeddings/1/*_tokenizer.* -t models/gte-large-en-v1.5-tokenizer/1
 ```
 
-> **Note** Change the `--weight-format` to quantize the model to `fp16`, `int8` or `int4` precision to reduce memory consumption and improve performance.
-
 You should have a model folder like below:
 ```bash
 tree models/
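
The README sequence these patches converge on (export, clean up exporter side-products, then split the tokenizer into its own model-version folder) can be rehearsed on placeholder files before running the real `optimum-cli` export. This is a sketch only: the `openvino_*.xml`/`.bin`, `config.json`, and `vocab.txt` file names below are assumptions for illustration, not taken from these patches, and `optimum-cli` itself is not invoked.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for what the export step would leave behind (names are assumed).
mkdir -p models/gte-large-en-v1.5-embeddings/1
touch models/gte-large-en-v1.5-embeddings/1/openvino_model.xml \
      models/gte-large-en-v1.5-embeddings/1/openvino_model.bin \
      models/gte-large-en-v1.5-embeddings/1/openvino_tokenizer.xml \
      models/gte-large-en-v1.5-embeddings/1/openvino_tokenizer.bin \
      models/gte-large-en-v1.5-embeddings/1/config.json \
      models/gte-large-en-v1.5-embeddings/1/vocab.txt

# Cleanup step from PATCH 1/3: drop config JSONs and the vocab file.
rm models/gte-large-en-v1.5-embeddings/1/*.json models/gte-large-en-v1.5-embeddings/1/vocab.txt

# Tokenizer split, with the `/1/*_tokenizer.*` glob as corrected in PATCH 2/3.
mkdir -p models/gte-large-en-v1.5-tokenizer/1
mv models/gte-large-en-v1.5-embeddings/1/*_tokenizer.* -t models/gte-large-en-v1.5-tokenizer/1

# Inspect the resulting layout (falls back to find if tree is absent).
tree models/ 2>/dev/null || find models -type f | sort
```

After the rehearsal, only the model IR files should remain under `gte-large-en-v1.5-embeddings/1`, with the tokenizer files relocated under `gte-large-en-v1.5-tokenizer/1`.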