Text2ImagePipeline heterogenous compile #1768

RyanMetcalfeInt8 · 2025-02-19T14:35:41Z

To simplify creation of a heterogeneous Stable Diffusion text-to-image pipeline, this PR adds a new API to the Text2ImagePipeline class:

/**
 * Compiles image generation pipeline for given devices for text encoding, denoising, and vae decoding.
 * @param text_encode_device A device to compile text encoder(s) with
 * @param denoise_device A device to compile denoiser (e.g. UNet, SD3 Transformer, etc.) with
 * @param vae_decode_device A device to compile VAE decoder(s) with
 * @param properties A map of properties which affect models compilation
 * @note If pipeline was compiled before, an exception is thrown.
 */
void compile(const std::string& text_encode_device,
             const std::string& denoise_device,
             const std::string& vae_decode_device,
             const ov::AnyMap& properties = {});

(I'd like some feedback here, especially on whether we technically need three sets of properties, one per device.)
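As a rough illustration of the contract in the doc comment (one device per stage, plus an exception on a second compile), here is a minimal self-contained sketch. MockText2ImagePipeline and its members are hypothetical stand-ins, not the real ov::genai class, and std::map replaces ov::AnyMap so the snippet builds without OpenVINO.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the real pipeline; it only records device choices.
class MockText2ImagePipeline {
public:
    using Properties = std::map<std::string, std::string>;

    // Mirrors the shape of the proposed three-device compile() overload.
    void compile(const std::string& text_encode_device,
                 const std::string& denoise_device,
                 const std::string& vae_decode_device,
                 const Properties& properties = {}) {
        if (m_compiled) {
            // Follows the documented @note: compiling twice is an error.
            throw std::runtime_error("pipeline is already compiled");
        }
        m_devices["text_encoder"] = text_encode_device;
        m_devices["denoiser"] = denoise_device;
        m_devices["vae_decoder"] = vae_decode_device;
        m_properties = properties;  // a single property set shared by all three stages
        m_compiled = true;
    }

    const std::string& device_of(const std::string& component) const {
        return m_devices.at(component);
    }

private:
    bool m_compiled = false;
    std::map<std::string, std::string> m_devices;
    Properties m_properties;
};

// Usage: each stage gets its own device; a second compile() throws.
inline void demonstrate_compile_once() {
    MockText2ImagePipeline pipe;
    pipe.compile("CPU", "NPU", "GPU");
    assert(pipe.device_of("denoiser") == "NPU");

    bool threw = false;
    try {
        pipe.compile("CPU", "CPU", "CPU");
    } catch (const std::runtime_error&) {
        threw = true;
    }
    assert(threw);
}
```

In this mock, a single property map is applied to every stage, which is exactly the limitation the question above is about.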

This API reduces heterogeneous pipeline setup to the following:

ov::genai::Text2ImagePipeline pipe(models_path);
pipe.reshape(1, width, height, pipe.get_generation_config().guidance_scale);
pipe.compile(text_encoder_device, unet_device, vae_decoder_device);
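For context on why reshape() takes guidance_scale, the sketch below shows one plausible way the static batch sizes could be derived. It assumes classifier-free guidance is active whenever guidance_scale > 1, which doubles the denoiser batch; StaticBatchSizes and plan_batch_sizes are hypothetical names, and the real pipeline may apply additional conditions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical summary of the static batch sizes fixed at reshape time.
struct StaticBatchSizes {
    std::size_t denoiser;
    std::size_t vae_decoder;
};

// Assumption: with guidance_scale > 1, the denoiser processes the
// unconditional and conditional inputs together, doubling its batch.
inline StaticBatchSizes plan_batch_sizes(std::size_t num_images_per_prompt,
                                         float guidance_scale) {
    assert(num_images_per_prompt > 0);
    const std::size_t multiplier = guidance_scale > 1.0f ? 2 : 1;
    return StaticBatchSizes{
        num_images_per_prompt * multiplier,  // denoiser batch
        num_images_per_prompt,               // VAE decoder sees final latents only
    };
}
```

Under that assumption, changing guidance_scale across the threshold after an explicit reshape would no longer match the compiled static shapes.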

With these changes, the heterogeneous Stable Diffusion sample can support all Stable Diffusion variants (SD1.5, LCM, XL, SD3, etc.) with the same code. With the old method (creating sub-components and assembling the pipeline object), this would have been difficult to achieve.

That said, this PR has been tested and works with the following pipelines (with the NPU running the denoiser):

SD1.5 / LCM
SDXL

TODO:

- ~~Add Python bindings for the new API~~
- ~~Update the Python heterogeneous sample~~

FUTURE WORK (outside the scope of this PR):
- Add support for SD3 (this will be a separate PR)
  - In general, this requires a fix for this issue: Text2Image, Stable Diffusion 3: Explicit Reshape + Compile produces very different output on GPU openvino#29113
  - There is also some unexplained behavior in the current reshape() path that I need to figure out.
  - For NPU, this requires a 'batch 1' implementation for Transformer2D, similar to what we did for UNet.
- Add support for FLUX (this will be a separate PR)
- Add an equivalent API for image-to-image and inpainting (separate PRs)
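
One possible reading of the 'batch 1' item above is that a model compiled with a static batch of 1 is invoked once per sample, and the results are stitched back together. The sketch below illustrates that idea only; Batch1Model and run_as_batch1 are hypothetical names, flat float vectors stand in for tensors, and this is not the actual UNet or Transformer2D implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// A batch-1 model: takes one flattened sample and returns one prediction.
using Batch1Model = std::function<std::vector<float>(const std::vector<float>&)>;

// Feeds a batch-N input through a batch-1 model one sample at a time and
// concatenates the per-sample outputs in their original order.
inline std::vector<float> run_as_batch1(const Batch1Model& model,
                                        const std::vector<float>& batched_input,
                                        std::size_t batch_size) {
    assert(batch_size > 0 && batched_input.size() % batch_size == 0);
    const std::size_t sample_size = batched_input.size() / batch_size;

    std::vector<float> batched_output;
    for (std::size_t b = 0; b < batch_size; ++b) {
        const auto first =
            batched_input.begin() + static_cast<std::ptrdiff_t>(b * sample_size);
        const std::vector<float> sample(
            first, first + static_cast<std::ptrdiff_t>(sample_size));
        const std::vector<float> out = model(sample);
        batched_output.insert(batched_output.end(), out.begin(), out.end());
    }
    return batched_output;
}
```

With classifier-free guidance, batch_size would be 2 in this sketch: one call for the unconditional input and one for the conditional input.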


…ompile flow

This makes the following flow work for Stable Diffusion XL:

ov::genai::Text2ImagePipeline pipe(models_path);
pipe.reshape(1, width, height, pipe.get_generation_config().guidance_scale);
pipe.compile(text_encoder_device, unet_device, vae_decoder_device);

This commit fixes a couple of errors that were present:

1. The original StableDiffusionXLPipeline constructor used in this case called StableDiffusionPipeline(pipeline_type, root_dir), which itself called initialize_generation_config(). I think the intent was for StableDiffusionXLPipeline's implementation to be called, but because the derived object has not been constructed yet at that point, the call resolves to StableDiffusionPipeline's implementation, which threw an exception because it doesn't support the "StableDiffusionXLPipeline" class_name. This is a deeper problem that probably requires a more in-depth refactor, but for now I resolved it by avoiding that specific construction flow.
2. StableDiffusionXLPipeline's reshape implementation originally used batch_size_multiplier (typically set to 2) to reshape the text encoders, but these should in fact be fixed at 1.
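
The first issue is standard C++ behavior: a virtual call made from a base-class constructor dispatches to the base-class version, because the derived part of the object does not exist yet. The sketch below reproduces this with hypothetical BasePipeline and XLPipeline classes; only the construction order mirrors the real code.

```cpp
#include <cassert>
#include <string>

// Hypothetical classes that mimic the construction order described above.
struct BasePipeline {
    std::string config_source;

    BasePipeline() {
        // While BasePipeline's constructor runs, the dynamic type is still
        // BasePipeline, so this resolves to BasePipeline's implementation.
        initialize_generation_config();
    }
    virtual ~BasePipeline() = default;

    virtual void initialize_generation_config() { config_source = "base"; }
};

struct XLPipeline : BasePipeline {
    XLPipeline() : BasePipeline() {}

    void initialize_generation_config() override { config_source = "xl"; }
};

// Usage: the override is skipped during construction, but used afterwards.
inline void demonstrate_constructor_dispatch() {
    XLPipeline pipe;
    assert(pipe.config_source == "base");

    pipe.initialize_generation_config();
    assert(pipe.config_source == "xl");
}
```

A common way around this is to run the initialization from the derived constructor or from a factory function after construction completes; the commit instead avoids the problematic constructor path.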


samples/cpp/image_generation/heterogeneous_stable_diffusion.cpp

samples/python/image_generation/heterogeneous_stable_diffusion.py

src/cpp/include/openvino/genai/image_generation/text2image_pipeline.hpp

src/cpp/src/image_generation/stable_diffusion_pipeline.hpp

src/cpp/src/image_generation/stable_diffusion_xl_pipeline.hpp


RyanMetcalfeInt8 · 2025-02-24T16:51:33Z

@ilya-lavrenov -- Thanks for the review! I pushed changes to address your comments.

src/cpp/include/openvino/genai/image_generation/text2image_pipeline.hpp

src/cpp/src/image_generation/stable_diffusion_xl_pipeline.hpp

src/python/py_image_generation_pipelines.cpp

samples/python/image_generation/heterogeneous_stable_diffusion.py

src/cpp/src/image_generation/stable_diffusion_xl_pipeline.hpp

ilya-lavrenov · 2025-02-25T07:49:42Z

Please also ensure that the .pyi files are up to date.

Once you have built the Python API, the .pyi files in the source tree are updated automatically, and you can commit those changes.


RyanMetcalfeInt8 · 2025-02-26T01:00:33Z

> Once you built Python API, pyi files in source tree are automatically updated and you can commit those changes.

I'm trying to figure out how this .pyi generation works. In my case, I don't see the .pyi files being generated. Looking at the CMakeLists.txt, it seems this is only enabled if OpenVINODeveloperPackage_FOUND is set.

It seems OpenVINODeveloperPackage is not found in my case (I set up the OpenVINO environment by running setupvars.bat from a nightly package). Do I need to build OpenVINO from source to get this to work? By the way, I'm on Windows.

src/cpp/src/image_generation/models/clip_text_model.cpp

src/cpp/src/image_generation/stable_diffusion_xl_pipeline.hpp

samples/python/image_generation/heterogeneous_stable_diffusion.py


ilya-lavrenov · 2025-02-26T19:49:08Z

build_jenkins

RyanMetcalfeInt8 added 4 commits February 18, 2025 10:32

Text2ImagePipeline: Add a heterogenous variant of compile() API

0c34559

Text2ImagePipeline: Move config update within StableDiffusionPipeline…

1455f2d

…::reshape to Text2ImagePipeline::reshape

Merge branch 'master' into text_gen_hetero_compile

dee9ef2

RyanMetcalfeInt8 marked this pull request as draft February 19, 2025 14:35

github-actions bot added the category: text to image (Text 2 image pipeline), category: samples (GenAI samples), and category: GenAI C++ API (Changes in GenAI C++ public headers) labels Feb 19, 2025

Text2Image: Add python bindings for hetero compile API, update hetero…

ec77fbe

…geneous sample to use it

github-actions bot added the category: Python API (Python API for GenAI) label Feb 19, 2025

StableDiffusionXLPipeline: Simplify device-centric constructor

abbb114

RyanMetcalfeInt8 marked this pull request as ready for review February 20, 2025 19:11

ilya-lavrenov assigned ilya-lavrenov and likholat Feb 21, 2025

ilya-lavrenov added this to the 2025.1 milestone Feb 21, 2025

ilya-lavrenov reviewed Feb 22, 2025


RyanMetcalfeInt8 added 7 commits February 24, 2025 05:44

heterogenous_stable_diffusion sample: Add comment about usage of devi…

7d1a25c

…ce-specific properties

text2image_pipeline: Add templated overload of hetero compile API

ac6e061

Revert "StableDiffusionXLPipeline: Simplify device-centric constructor"

0a638b0

This reverts commit abbb114.

stable_diffusion_xl_pipeline.hpp: Fix indentation

a74581b

stable_diffusion_xl_pipeline: Force m_force_zeros_for_empty_prompt to…

93212de

… false upon explicit reshape

diffusion_pipeline: Move default implementation of compile() to diffu…

165f038

…sers class

Merge branch 'master' into text_gen_hetero_compile

03efd76

Merge branch 'master' into text_gen_hetero_compile

ff61cb6

ilya-lavrenov reviewed Feb 25, 2025


RyanMetcalfeInt8 and others added 3 commits February 25, 2025 08:07

Update src/cpp/src/image_generation/stable_diffusion_xl_pipeline.hpp

c7e68ff

Co-authored-by: Ilya Lavrenov <[email protected]>

Update src/python/py_image_generation_pipelines.cpp

6454039

Co-authored-by: Ilya Lavrenov <[email protected]>

Update src/cpp/include/openvino/genai/image_generation/text2image_pip…

29b690d

…eline.hpp Co-authored-by: Ilya Lavrenov <[email protected]>

RyanMetcalfeInt8 added 7 commits February 25, 2025 06:57

py_image_generation_pipelines: vae_decode_device -> vae_device

ce601fd

py_image_generation_pipelines: vae_decode_device -> vae_device

625f5e3

Merge branch 'master' into text_gen_hetero_compile

622ed7a

stable_diffusion_xl / clip: Properly handle mismatch in batch size be…

97d986c

…tween reshape() and infer()

Merge branch 'master' into text_gen_hetero_compile

195a7d8

Merge branch 'master' into text_gen_hetero_compile

961dcfe

image generation: vae_decode_device -> vae_device

6980f45

ilya-lavrenov reviewed Feb 26, 2025


RyanMetcalfeInt8 added 5 commits February 26, 2025 07:20

heterogeneous_stable_diffusion.py: remove ;'s

cb0dc1f

clip_text_model: Get partial shape from compiled_model, not runtime_m…

daa101e

…odel

clip_text_model: Add detail to asserts

facb56e

Merge branch 'master' into text_gen_hetero_compile

4d8c19c

update py_openvino_genai.pyi

7128ec1

ilya-lavrenov approved these changes Feb 26, 2025


ilya-lavrenov enabled auto-merge February 26, 2025 17:25

ilya-lavrenov disabled auto-merge February 26, 2025 19:51

ilya-lavrenov merged commit fae9029 into openvinotoolkit:master Feb 26, 2025
62 of 63 checks passed
