
Relating to the recent paper about 'Self-guidance' method #29

Open
lindapu-1 opened this issue Jul 24, 2023 · 1 comment

Comments

@lindapu-1

Hello @bloc97,

Your work has been instrumental to my understanding of the topic, especially since I ran into difficulties when trying to run the official Prompt-to-Prompt code.

Recently, I've been engrossed in a paper titled "Diffusion Self-Guidance for Controllable Image Generation" (https://dave.ml/selfguidance/), in which the authors introduce a novel 'Self-Guidance' method. The technique edits an image by manipulating the attention maps, and I noticed its resemblance to the 'Prompt-to-Prompt' method.

As an undergraduate student eager to delve deeper into computer vision, I'm interested in implementing this 'Self-Guidance' method for my project. However, the authors have not yet released their official code, so I'm considering building the method on top of your implementation.

Given your expertise in this area, do you think it's feasible to implement the 'Self-Guidance' method based on your code? Any insights or suggestions you could provide would be immensely appreciated.

@bloc97
Owner

bloc97 commented Jul 27, 2023

Hi, after skimming through the paper I think it shouldn't be too difficult to implement the guidance operators it describes. If I understood correctly, they apply transformations to the attention maps both in intensity space (rescaling the values f(x, y)) and in coordinate space (mapping (x, y) -> (x', y'), e.g. scaling, translation, etc.). In this repo, the attention maps are already exposed by the code.

For example, in new_sliced_attention, some attention slices are masked out and replaced by others:

if self.last_attn_slice_mask is not None:
    # gather the saved attention values for the selected token indices
    new_attn_slice = torch.index_select(self.last_attn_slice, -1, self.last_attn_slice_indices)
    # blend: keep the current attention where the mask is 0,
    # substitute the saved attention where the mask is 1
    attn_slice = attn_slice * (1 - self.last_attn_slice_mask) + new_attn_slice * self.last_attn_slice_mask
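As a toy illustration of the coordinate-space transform (the function name and shapes below are my own assumptions, not this repo's API), one token's flattened attention slice can be reshaped to a 2D (H, W) map and translated:

```python
import torch

def translate_attn_map(attn_map, dx, dy):
    """Shift a single token's 2D attention map by (dx, dy) pixels.

    Toy sketch of the coordinate transform (x, y) -> (x + dx, y + dy).
    torch.roll wraps around at the borders; a real implementation
    would pad or mask instead of wrapping.
    """
    return torch.roll(attn_map, shifts=(dy, dx), dims=(0, 1))

# toy 4x4 attention map with one attended pixel at (1, 1)
m = torch.zeros(4, 4)
m[1, 1] = 1.0
shifted = translate_attn_map(m, dx=2, dy=1)  # pixel moves to (2, 3)
```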

You could probably select a specific attn_slice that corresponds to an object and scale it to make the object bigger (the exact implementation details should follow the paper for best results).
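A minimal sketch of that scaling idea, assuming you have already reshaped an object token's flattened attention slice into a 2D (H, W) map (the helper name and the center-crop strategy here are hypothetical; the paper's exact operators should be preferred):

```python
import torch
import torch.nn.functional as F

def scale_attn_map(attn_map, factor):
    """Upscale one token's 2D attention map by `factor`, then
    center-crop back to the original size, so the attended object
    occupies a larger fraction of the map."""
    h, w = attn_map.shape
    # F.interpolate expects a (N, C, H, W) tensor
    up = F.interpolate(attn_map[None, None], scale_factor=factor,
                       mode="bilinear", align_corners=False)[0, 0]
    # center-crop back to (h, w)
    top = (up.shape[0] - h) // 2
    left = (up.shape[1] - w) // 2
    return up[top:top + h, left:left + w]

# toy 8x8 map with a small 2x2 "object" near the center
m = torch.zeros(8, 8)
m[3:5, 3:5] = 1.0
big = scale_attn_map(m, factor=2.0)  # object now covers more pixels
```

The cropped map would then be blended back into the attention slice (e.g. via the masking code above) before the scores are applied to the values.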
