Simple pipeline for building an image from a HuggingFace model ID #2

Open: wants to merge 2 commits into base: main
3 changes: 3 additions & 0 deletions .gitignore
@@ -158,3 +158,6 @@ cython_debug/
 # and can be added to the global gitignore or merged into this file. For a more nuclear
 # option (not recommended) you can uncomment the following to ignore the entire idea folder.
 #.idea/
+
+bentofile-copy.yaml
+_embedding_runnable.py
5 changes: 5 additions & 0 deletions README.md
@@ -115,6 +115,11 @@ Possible next steps:
 $ bentoml push sentence-embedding-svc:scyvqxrxlc4rduqj [or bentoml build --push]
 ```

+You can also try the simplified build script
+```bash
+GPU=true HF_MODEL=BAAI/bge-small-zh-v1.5 bash simple_build.sh
+```
+
 # Production Deployment

 BentoML provides a number of [deployment options](https://docs.bentoml.com/en/latest/concepts/deploy.html).
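The `VAR=value` prefixes in the README command set environment variables for that single invocation, and `simple_build.sh` in this PR reads them with `${VAR:-default}`. As a rough Python sketch of that fallback behavior (the `resolve_config` helper and the `DEFAULTS` table are hypothetical names used only for illustration; the default values are copied from the script):

```python
import os
from typing import Dict, Mapping

# Defaults copied from simple_build.sh in this PR.
DEFAULTS: Dict[str, str] = {
    "HF_MODEL": "sentence-transformers/all-MiniLM-L6-v2",
    "CUDA": "11.6.2",
    "GPU": "false",
    "REPO": "ghcr.io",
}


def resolve_config(env: Mapping[str, str]) -> Dict[str, str]:
    """Mimic bash's ${VAR:-default}: fall back when VAR is unset or empty."""
    return {name: env.get(name) or default for name, default in DEFAULTS.items()}


if __name__ == "__main__":
    # Show what the script would see for the current shell environment.
    print(resolve_config(os.environ))
```

With the README command, only `GPU` and `HF_MODEL` are overridden; `CUDA` and `REPO` keep their defaults.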
24 changes: 20 additions & 4 deletions import_model.py
@@ -1,8 +1,24 @@
 import bentoml
+import fire
 from transformers import AutoTokenizer, AutoModel

-tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')
-model = AutoModel.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')

-bentoml.transformers.save_model("all-MiniLM-L6-v2", model)
-bentoml.transformers.save_model("all-MiniLM-L6-v2-tokenizer", tokenizer)
+def hf_to_bentoml(hf: str = "sentence-transformers/all-MiniLM-L6-v2",
+                  model_name: str = None,
+                  tokenizer_name: str = None):
+    tokenizer = AutoTokenizer.from_pretrained(hf)
+    model = AutoModel.from_pretrained(hf)
+
+    if not model_name:
+        model_name = hf.split("/")[1]
+
+    if not tokenizer_name:
+        tokenizer_name = f"{model_name}-tokenizer"
+
+    bentoml.transformers.save_model(model_name, model)
+    bentoml.transformers.save_model(tokenizer_name, tokenizer)
+    print(f"{model_name}")
+
+
+if __name__ == '__main__':
+    fire.Fire(hf_to_bentoml)
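One edge case in the name derivation above: `hf.split("/")[1]` raises `IndexError` when the model ID is written without an organization prefix (for example `gpt2`). A minimal sketch of a more forgiving variant follows; `derive_names` is a hypothetical helper, not part of this PR, and it deliberately leaves out the download and `save_model` calls so that it can be checked in isolation.

```python
from typing import Optional, Tuple


def derive_names(hf: str,
                 model_name: Optional[str] = None,
                 tokenizer_name: Optional[str] = None) -> Tuple[str, str]:
    """Return (model_name, tokenizer_name) for a HuggingFace model ID."""
    if not model_name:
        # Take the last path segment, so IDs with or without an
        # "org/" prefix both work.
        model_name = hf.rstrip("/").rsplit("/", 1)[-1]
    if not tokenizer_name:
        # Same convention as the PR: tokenizer tag = model tag + "-tokenizer".
        tokenizer_name = f"{model_name}-tokenizer"
    return model_name, tokenizer_name


if __name__ == "__main__":
    print(derive_names("BAAI/bge-small-zh-v1.5"))
```

Using `Optional[str]` for the two overrides also matches how they are called, since both default to `None`.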
1 change: 1 addition & 0 deletions requirements.txt
@@ -2,3 +2,4 @@ torch
 transformers
 bentoml
 pydantic>2.0
+fire
32 changes: 32 additions & 0 deletions simple_build.sh
@@ -0,0 +1,32 @@
+#!/bin/bash
+set -ex
+
+HF_MODEL=${HF_MODEL:-"sentence-transformers/all-MiniLM-L6-v2"}
+CUDA=${CUDA:-"11.6.2"}
+GPU=${GPU:-"false"}
+REPO=${REPO:-"ghcr.io"}
+
+echo "📂 1. Loading model & tokenizer from HuggingFace into cache"
+model=$(python import_model.py --hf "$HF_MODEL")
+
+echo "🍱 2. Building Bento.."
+if [ "$GPU" == "true" ];
+then
+  VERSION="${model}-gpu"
+  cat bentofile-gpu.yaml | sed -e "s/all-MiniLM-L6-v2/$model/g" -e "s/11\.6\.2/$CUDA/g" > bentofile-copy.yaml
+else
+  VERSION="$model"
+  cat bentofile.yaml | sed -e "s/all-MiniLM-L6-v2/$model/g" > bentofile-copy.yaml
+fi
+
+cp embedding_runnable.py _embedding_runnable.py
+cat _embedding_runnable.py | sed -e "s/all-MiniLM-L6-v2/$model/g" > embedding_runnable.py
+bentoml build . -f bentofile-copy.yaml --version "$VERSION" --force
+
+echo "🐳 3. Containerizing Bento.."
+bentoml containerize \
+  "sentence-embedding-svc:$VERSION" \
+  --opt label='org.opencontainers.image.source=https://github.com/bentoml/sentence-embedding-bento' \
+  --opt label='org.opencontainers.image.description="Sentence Embedding REST API Service"' \
+  --opt label='org.opencontainers.image.licenses="Apache-2.0"' \
+  -t "$REPO/bentoml/sentence-embedding-bento:$VERSION"
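The `sed` steps in the script treat `all-MiniLM-L6-v2` and `11.6.2` as placeholders and substitute the chosen model name and CUDA version. Below is a sketch of the same substitution in Python, assuming the templates contain those two literal strings; `render_template` and the two constants are hypothetical names introduced here, not code from the PR.

```python
from typing import Optional

# Placeholder strings that the script's sed expressions search for.
PLACEHOLDER_MODEL = "all-MiniLM-L6-v2"
PLACEHOLDER_CUDA = "11.6.2"


def render_template(text: str, model: str, cuda: Optional[str] = None) -> str:
    """Substitute the model name (and optionally the CUDA version) literally."""
    # str.replace does plain-text replacement, so characters that are
    # special in a sed replacement (such as "&" or "/") need no escaping.
    rendered = text.replace(PLACEHOLDER_MODEL, model)
    if cuda is not None:
        rendered = rendered.replace(PLACEHOLDER_CUDA, cuda)
    return rendered


if __name__ == "__main__":
    template = "model: all-MiniLM-L6-v2\ntokenizer: all-MiniLM-L6-v2-tokenizer\n"
    print(render_template(template, "bge-small-zh-v1.5"))
```

Because the tokenizer tag shares the model-name prefix, one substitution rewrites both tags, which matches the `-tokenizer` naming used in `import_model.py`. One point that may be worth checking in the script itself: `embedding_runnable.py` is overwritten in place, so on a second run with a different model the placeholder appears to be already gone and the first model name would remain. Rendering from an unmodified template on every run would avoid that.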