Stable Diffusion Inference Overhaul #254

Merged: 8 commits merged into master from amercurio/sd-inference-fix on Sep 26, 2023

Conversation

@harubaru (Contributor) commented on Sep 14, 2023

As per #251, this PR introduces several fixes and improvements to the older Stable Diffusion Inference example. Most importantly, it makes the example work again.

  • Updated Tensorizer support. Current Stable Diffusion models have been added to the public s3://tensorizer bucket to allow fast loading of the SD base models with Tensorizer.
  • Dropped PVC loading. Stable Diffusion models are now loaded only from CoreWeave S3 object storage via Tensorizer. This simplifies the example significantly by removing the extra download and PVC creation steps.
  • Replaced base image with CoreWeave Torch image. This enables faster deployments by using a smaller image that already contains all of the prerequisite dependencies.
  • Dropped KServe dependency. KServe conflicts with Tensorizer, so it has been removed entirely and replaced with FastAPI.
  • Simpler serialization & S3 upload example. Since Tensorizer has built-in support for pushing to S3 storage, s3cmd is no longer required, which further simplifies the example.
  • Single Docker image to rule them all. Instead of separate Docker images for the serializer, S3 upload, downloader, and inference service, everything has been coalesced into one image, since they all share the same dependencies.

To run the inference example as-is:

  1. The inference service loads a public tensorized model by default, so it can be started simply by running kubectl apply -f 02-inference-service.yaml, as shown below.
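
A minimal sketch of that step; watching the Knative service afterward is optional, but shows when the service reports a ready URL:

```bash
# Deploy the inference service with the default public tensorized model.
kubectl apply -f 02-inference-service.yaml

# Optionally, wait for the Knative service to become Ready and report a URL.
kubectl get ksvc --watch
```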

To run the inference example with a custom serialized SD model:

  1. An S3 key must be generated through the Cloud app. Once this is done, a bucket has to be created; with s3cmd, this can be done by running s3cmd mb s3://YOURBUCKET.
  2. Next, the S3 secrets have to be installed into Kubernetes. In 00-optional-s3-secret.yaml, replace each secret's placeholder with your base64-encoded keys. This can be done by running echo -n "YOURKEYHERE" | base64 for each key, as well as for the host URL, which is the S3 endpoint. Once this is done, install the secrets by running kubectl apply -f 00-optional-s3-secret.yaml.
  3. To serialize the model, modify the command arguments in 01-optional-s3-serialize-job.yaml, replacing --dest-bucket with the bucket you are serializing to and --hf-model-id with the custom model you would like to serialize. Once this is done, run the job with kubectl apply -f 01-optional-s3-serialize-job.yaml.
  4. To run the inference service, replace the model URI in 02-inference-service.yaml with the S3 URI pointing to your custom model. After that, start the service by running kubectl apply -f 02-inference-service.yaml. The full sequence is sketched below.
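
Putting the four steps together, a minimal sketch of the full sequence. The bucket name, key values, and model ID are placeholders, and object.ord1.coreweave.com is only an assumed example of the S3 endpoint:

```bash
# 1. Create a destination bucket (assumes s3cmd is already configured
#    with the S3 key generated in the Cloud app).
s3cmd mb s3://YOURBUCKET

# 2. Base64-encode each key and the host URL (the S3 endpoint), paste the
#    output into 00-optional-s3-secret.yaml, then install the secrets.
echo -n "YOURACCESSKEY" | base64
echo -n "YOURSECRETKEY" | base64
echo -n "object.ord1.coreweave.com" | base64   # assumed example endpoint
kubectl apply -f 00-optional-s3-secret.yaml

# 3. After pointing --dest-bucket and --hf-model-id at your bucket and model,
#    run the serialization job.
kubectl apply -f 01-optional-s3-serialize-job.yaml

# 4. After replacing the model URI with your serialized model's S3 URI,
#    start the inference service.
kubectl apply -f 02-inference-service.yaml
```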

To test the inference endpoint:

  • You can run the command below, replacing the base of the URL with the link to your own ksvc, which can be found by listing the Knative services with kubectl get ksvc. A scripted variant is sketched after the command.
  • curl -X POST 'http://sd.tenant-sta-amercurio-amercurio.knative.ord1.coreweave.cloud/generate' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{"prompt": "a cat sleeping comfortably on a bed", "guidance_scale": 7, "num_inference_steps": 28, "seed": 42, "width": 768, "height": 512}' -o cat.png
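
One way to script the URL substitution, assuming the Knative service is named sd as in the example URL above:

```bash
# Pull the service URL from the ksvc status (service name "sd" is assumed).
SD_URL=$(kubectl get ksvc sd -o jsonpath='{.status.url}')

# Request an image and save it to cat.png.
curl -X POST "$SD_URL/generate" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"prompt": "a cat sleeping comfortably on a bed", "guidance_scale": 7, "num_inference_steps": 28, "seed": 42, "width": 768, "height": 512}' \
  -o cat.png
```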

@harubaru requested a review from wbrown on September 14, 2023 at 00:48
@rtalaricw (Contributor) left a comment:

LGTM

@rtalaricw (Contributor) commented:

@harubaru Tested and works with both the custom and regular models. Please merge when you can.

@harubaru merged commit f4a5946 into master on Sep 26, 2023
@harubaru deleted the amercurio/sd-inference-fix branch on September 26, 2023 at 22:55