Stable Diffusion Inference Overhaul #254
Merged
As per #251, this PR introduces a few fixes and improvements to the older Stable Diffusion Inference example and, most importantly, makes the example work again.
Pre-serialized copies are now hosted in the `s3://tensorizer` bucket in order to allow fast loading of the SD base models using Tensorizer. The `s3cmd`-based serialization path is now optional, which further simplifies the example.

To run the inference example as-is:
`kubectl apply -f 02-inference-service.yaml`

To run the inference example with a custom serialized SD model:
Create a bucket using `s3cmd`; this can be done by running `s3cmd mb s3://YOURBUCKET`.
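If `s3cmd` has not been set up yet, the sketch below shows a one-time interactive configuration followed by the bucket creation from this step. `YOURBUCKET` is the placeholder bucket name from above; the access key, secret key, and endpoint you enter must match your own Object Storage credentials.

```bash
# One-time s3cmd setup (interactive), then create and verify the bucket.
s3cmd --configure         # prompts for access key, secret key, and S3 endpoint; writes ~/.s3cfg
s3cmd mb s3://YOURBUCKET  # make the destination bucket
s3cmd ls                  # confirm the new bucket is listed
```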
In `00-optional-s3-secret.yaml`, replace each secret's placeholder with your base64-encoded keys. This can be done by running `echo -n "YOURKEYHERE" | base64` for each key and for the host URL, which is the S3 endpoint. Once this is done, you can install the secrets by running `kubectl apply -f 00-optional-s3-secret.yaml`.
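As a minimal sketch of that encoding step, with placeholder values for the access key, secret key, and host URL (the exact keys expected by the manifest are defined in `00-optional-s3-secret.yaml`):

```bash
# Generate the base64 strings to paste into 00-optional-s3-secret.yaml.
# The values below are placeholders; substitute your own credentials and endpoint.
echo -n "YOURACCESSKEY" | base64
echo -n "YOURSECRETKEY" | base64
echo -n "YOURS3HOSTURL" | base64   # the host URL, i.e. the S3 endpoint
```

The `-n` flag matters here; without it, a trailing newline would be encoded into the secret value.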
Edit `01-optional-s3-serialize-job.yaml` to replace `--dest-bucket` with the bucket you are serializing to, and `--hf-model-id` with the custom model you would like to serialize. Once this is done, you can start the job by running `kubectl apply -f 01-optional-s3-serialize-job.yaml`.
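To watch the serialization progress, standard kubectl commands are enough; note that `serialize-job` below is only a placeholder name and should be replaced with the job name defined in `01-optional-s3-serialize-job.yaml`.

```bash
# Follow the serialization job until it completes.
kubectl get jobs
kubectl logs -f job/serialize-job   # placeholder name; use the name from the manifest
```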
Update `02-inference-service.yaml` with the S3 URI pointing to your custom model. After that, you should be ready to start the service by running `kubectl apply -f 02-inference-service.yaml`.
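Before sending requests, it can be useful to confirm the service has come up; a small sketch using standard kubectl commands:

```bash
# Wait for the inference pod to start and for the Knative service to report Ready.
kubectl get pods --watch   # the predictor pod should reach Running
kubectl get ksvc           # the READY column should show True
```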
To test the inference endpoint:

You will need the URL of the `ksvc`, which can be found by listing the Knative services with `kubectl get ksvc`.
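If you prefer to grab just the URL, a jsonpath query works as well; the service name `sd` here is inferred from the hostname in the example request below and may differ in your deployment.

```bash
# Print only the ksvc URL (Knative stores it in .status.url).
kubectl get ksvc sd -o jsonpath='{.status.url}'
```

The request below then posts a prompt to that URL's `/generate` route and writes the returned image to `cat.png`.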
```bash
curl -X POST 'http://sd.tenant-sta-amercurio-amercurio.knative.ord1.coreweave.cloud/generate' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"prompt": "a cat sleeping comfortably on a bed", "guidance_scale": 7, "num_inference_steps": 28, "seed": 42, "width": 768, "height": 512}' \
  -o cat.png
```