Add Latent-Interposer for SD15 and SDXL upscale models mixing #3

DavideAlidosi opened this issue Jan 19, 2024 · 6 comments

@DavideAlidosi

First, congratulations on the work you have done on WebUI.
I would like to ask whether it would be possible to integrate the Latent-Interposer system into the upscale options; it is already implemented for ComfyUI and allows the latent spaces of SD15 and SDXL to be combined.
The idea would be to take advantage of the increased capabilities of SDXL and then perform a refiner upscale with an SD15 model.
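In very rough terms, the flow I have in mind is something like the sketch below (all the names here are placeholders I made up for illustration, not real webui or interposer APIs):

```python
# Illustrative sketch only: generate_sdxl, load_interposer and hires_fix_sd15
# are made-up placeholder names, not actual webui or SD-Latent-Interposer APIs.
import torch

xl_latent = generate_sdxl(prompt)            # first pass at 1024px with an SDXL model
interposer = load_interposer("xl-to-v1")     # city96's small network mapping between latent spaces
with torch.no_grad():
    sd15_latent = interposer(xl_latent)      # same content, re-expressed in SD1.5 latent space
result = hires_fix_sd15(sd15_latent, denoising_strength=0.4)  # refine/upscale with an SD15 model
```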
In the past few days I have already contacted city96, the author of the original node for ComfyUI (city96/SD-Latent-Interposer#5), but I realized that I do not know enough about A1111 and Python to do the work myself.
Since you have already taken care of various aspects pertaining to this kind of functionality for A1111, I hope you will find it as interesting as I do.
Thank you.

@light-and-ray

I think in the web UI it could only be used in the refiner. Can you imagine any other use cases? And is it better than just decoding with the SDXL VAE and re-encoding with the 1.5 VAE? The examples don't suggest the quality is very good.
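For reference, the decode/re-encode round trip I mean would look roughly like this in diffusers terms (just a sketch; webui uses its own VAE wrappers, and `vae_xl`, `vae_15` and `xl_latent` are assumed to be loaded already):

```python
import torch

with torch.no_grad():
    # SDXL latent -> pixels (undo the SDXL scaling factor before decoding)
    image = vae_xl.decode(xl_latent / vae_xl.config.scaling_factor).sample
    # pixels -> SD1.5 latent (apply the SD1.5 scaling factor after encoding)
    sd15_latent = vae_15.encode(image).latent_dist.sample() * vae_15.config.scaling_factor
```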

@DavideAlidosi
Author

Personally, and I think I am not the only one, I am really disappointed with the visual results of SDXL, especially when using Hires Fix, which consistently produces an extremely unpleasant noise floor. Of course, there remain the advantages tied to the increased number of prompt terms and the 1024px resolution, which I feel would be a shame not to take advantage of.
On the SD15 side, the visual results remain excellent even with a 4x Hires Fix, which however tends to create images inconsistent with the initial generation, perhaps because it starts at 512px.
I honestly could not point to the best way to get the best of both worlds; from what I know, the latent space conversion seems like a good method, but I do not exclude that working on the VAE encoding could also bring improvements.
Clearly these are only theoretical assumptions on my part; in any case, thank you for your attention.

@w-e-w
Owner

w-e-w commented Jan 19, 2024

I had a look at the webui code and concluded that implementing this as an extension would require replacing large portions of the hires fix pipeline with a bunch of different strange patches.
Doing so would make the implementation likely to break as webui updates, not to mention volatile when interacting with other extensions,
so it's better off implemented directly in the web UI, not as an extension.

@DavideAlidosi
Author

Thank you very much for checking this; I will try to propose the change directly on the WebUI.

@w-e-w
Owner

w-e-w commented Jan 19, 2024

Note: when I said

it would require replacing large portions of the hires fix pipeline

I was also factoring in making it work with the "decode and re-encode" conversion method, not just Latent-Interposer, so that all upscaler types are supported.
If you only wish to support Latent-Interposer, it seems "more" achievable as an extension, provided you are willing to resort to ugly patches, because a smaller section of code needs to be patched.

I think if this is going to be supported, there should be no restriction to using only Latent-Interposer,
and the part about ugly patches still holds: it's better to implement it directly in the web UI.

@w-e-w
Owner

w-e-w commented Jan 19, 2024

Wait, maybe it's actually more achievable than I thought.
If it is just for Latent-Interposer, I think we only really need to patch this:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/cb5b335acddd126d4f6c990982816c06beb0d6ae/modules/processing.py#L1317
and detect 1.5->xl or xl->1.5.
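Something along these lines (hypothetical shape only, not actual webui code; I'm assuming the two checkpoints expose an `is_sdxl`-style flag and that the interposers are callables wrapping city96's converters):

```python
def maybe_interpose(samples, firstpass_is_sdxl, hires_is_sdxl,
                    interposer_xl_to_v1, interposer_v1_to_xl):
    """Convert the first-pass latent when the hires checkpoint is a different architecture."""
    if firstpass_is_sdxl and not hires_is_sdxl:
        return interposer_xl_to_v1(samples)   # SDXL latent -> SD1.5 latent space
    if hires_is_sdxl and not firstpass_is_sdxl:
        return interposer_v1_to_xl(samples)   # SD1.5 latent -> SDXL latent space
    return samples                            # same family, nothing to convert
```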


I might have a go (when I have time and no promises)
but I still kind of don't want to do it, because if I'm going to support this, I want to support everything, not just Latent-Interposer
