
FLUX.1 Tools | Fill #129

Draft · filipstrand wants to merge 1 commit into main
Conversation

@filipstrand (Owner) commented Feb 22, 2025

Uses black-forest-labs/FLUX.1-Fill-dev as the base model.

[Screenshot 2025-02-22 at 13:54:38]
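For reference only, the same base model is exposed in diffusers as FluxFillPipeline; a rough usage sketch looks like the following. This is not the mflux API added in this PR, and the file paths and prompt are placeholders:

```python
# Reference sketch of the same base model via diffusers (not this PR's API).
import torch
from diffusers import FluxFillPipeline
from diffusers.utils import load_image

pipe = FluxFillPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Fill-dev", torch_dtype=torch.bfloat16
).to("cuda")
result = pipe(
    prompt="a red sofa",                # placeholder prompt
    image=load_image("input.png"),      # source image (placeholder path)
    mask_image=load_image("mask.png"),  # white marks the region to fill
    guidance_scale=30,
    num_inference_steps=50,
).images[0]
result.save("filled.png")
```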

A bit related to #127, in the sense that the left-hand-side image is "masked out" (though in that PR there is no official concept of a mask) and will not be altered by the denoising process. Will try to investigate if/how these can be unified...

Update: It is possible to achieve "in context learning" using an appropriate mask like so:
[mask image]
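A minimal sketch of how such a mask could be built, assuming Pillow and a white-means-fill convention (the helper name is hypothetical, not part of this PR): keep the left half of the side-by-side image untouched and regenerate the right half.

```python
# Minimal sketch (white = regenerate): preserve the left reference half,
# fill in the right half of a side-by-side image.
from PIL import Image

def in_context_mask(width: int, height: int) -> Image.Image:
    mask = Image.new("L", (width, height), 0)        # 0 = keep (reference half)
    mask.paste(255, (width // 2, 0, width, height))  # 255 = regenerate
    return mask

in_context_mask(1024, 512).save("mask.png")
```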

However, this requires the dev-fill model, and initial results are worse compared to the approach in #127 (maybe because the adapters were trained with the dev weights as base...?).

Probably best to keep #127 as a separate feature for now...

@filipstrand filipstrand self-assigned this Feb 22, 2025
@filipstrand filipstrand marked this pull request as draft February 22, 2025 12:58
@anthonywu (Collaborator)

Happy to see you take this on. I've wondered how to support this without either of:

  1. asking the user to launch an external workflow to produce the mask image - preferably with a tool that already ships with macOS, i.e. Preview.app, but Preview isn't very good at letting users get into edit/draw mode with the right brush radius, or at saving out only the mask layer.
  2. adding built-in image-editor support so users can brush in the mask region, but this would require introducing a GUI framework into the generator CLI.

I think this workflow could use another "tool" where the project demos how to create the mask (maybe even forking code from diffusers) but that isn't officially packaged in the library.
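One possible shape for such a demo tool, sketched here under stated assumptions (nothing below is packaged or agreed on; names are hypothetical): the user scribbles over a copy of the image in any editor, e.g. Preview.app, and a binary mask is recovered by diffing the annotated copy against the original.

```python
# Hypothetical mask-recovery demo: diff an annotated copy against the original.
from PIL import Image, ImageChops

def mask_from_annotated_copy(original_path: str, annotated_path: str) -> Image.Image:
    original = Image.open(original_path).convert("RGB")
    annotated = Image.open(annotated_path).convert("RGB")
    diff = ImageChops.difference(original, annotated).convert("L")
    # Every pixel the user touched becomes white (255 = regenerate).
    return diff.point(lambda p: 255 if p > 10 else 0)

mask_from_annotated_copy("photo.png", "photo_scribbled.png").save("mask.png")
```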

@anthonywu (Collaborator)

My proposal for the obvious outpaint-mode follow-up, probably as a separate PR (no guarantee I have time to help, but just dropping thoughts):

mflux-generate -m dev-fill --outpaint-padding [outpaint args]

where the outpaint args can be these variants, following the CSS padding convention (a parsing sketch follows the list):

  1. e.g. 100px - pad all directions this much
  2. e.g. 20% - pad all directions this much, relative to original size
  3. e.g. 100px,200px - top/bottom 100px, right/left 200px
  4. e.g. 100px,200%,100%,400px - top/right/bottom/left (can mix both absolute and relative)
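A hypothetical sketch of how that shorthand could be parsed. None of this exists in mflux yet; the function name and return order are assumptions, and percentages are resolved against the original image dimensions:

```python
# Hypothetical parser for the proposed --outpaint-padding value, following the
# CSS padding shorthand: 1 value = all sides; 2 = top/bottom then right/left;
# 4 = top, right, bottom, left.
def parse_outpaint_padding(spec: str, width: int, height: int) -> tuple[int, int, int, int]:
    def resolve(token: str, reference: int) -> int:
        token = token.strip()
        if token.endswith("%"):
            return round(reference * float(token[:-1]) / 100)
        if token.endswith("px"):
            return int(token[:-2])
        raise ValueError(f"expected a 'px' or '%' value, got {token!r}")

    tokens = spec.split(",")
    if len(tokens) == 1:
        top = bottom = resolve(tokens[0], height)
        right = left = resolve(tokens[0], width)
    elif len(tokens) == 2:
        top = bottom = resolve(tokens[0], height)
        right = left = resolve(tokens[1], width)
    elif len(tokens) == 4:
        top = resolve(tokens[0], height)
        right = resolve(tokens[1], width)
        bottom = resolve(tokens[2], height)
        left = resolve(tokens[3], width)
    else:
        raise ValueError("expected 1, 2, or 4 comma-separated values")
    return top, right, bottom, left

# e.g. for a 1024x512 image:
# parse_outpaint_padding("100px,200%,100%,400px", 1024, 512) -> (100, 2048, 512, 400)
```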

I considered the following but don't want to scope it in:

mflux-generate -m dev-fill --outpaint-output-size 1024x512

This requires the user to do even more mental math on the size of the output image; I think any wrappers or GUIs can provide that functionality by calculating the absolute or relative padding.

Unlike inpainting, I like that outpainting does not require a separate mask image - it can be inferred from the padding numbers (see the sketch below).
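As a sketch of that inference, assuming Pillow and a white-means-fill mask convention (the helper name is hypothetical): paste the original onto an enlarged canvas and mark everything outside it for regeneration.

```python
# Hypothetical sketch: derive the outpaint canvas and mask from padding alone.
from PIL import Image

def build_outpaint_inputs(image: Image.Image, top: int, right: int, bottom: int, left: int):
    canvas = Image.new("RGB", (image.width + left + right, image.height + top + bottom))
    canvas.paste(image, (left, top))
    mask = Image.new("L", canvas.size, 255)  # 255 = regenerate everywhere...
    mask.paste(0, (left, top, left + image.width, top + image.height))  # ...except the original
    return canvas, mask
```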

@filipstrand (Owner, Author)

@anthonywu True, I was thinking along the same lines with a tool that could generate the mask... For the upcoming release I think it's fine to ship only the bare-bones version, assume the user has already obtained the mask from somewhere else, and add a helper tool later.

Regarding your suggestion for expressing the padding options, I think an interface like that would be a nice fit!

I am also debating how integrated this feature should be with the other ones, e.g. whether we should have separate "pipelines" like in diffusers (as we have done with generate.py and generate_controlnet.py) or somehow integrate it into the existing ones. I am leaning towards keeping this separate initially and then seeing how it fits with the rest of the features.

A bit longer term, there are some cool 3rd-party models/techniques that make impressive use of in-painting and the in-context ability, like catvton flux, which would also be nice to support. I have looked into this briefly, but there is still some work left to get it up and running.

@anthonywu (Collaborator) commented Mar 8, 2025

I have attempted a merge against the latest main from my fork's branch inpaint-0.7.0, which also builds on my fork's cleanup-0.6.0 (see PR #138). Maybe you can fast-forward your work by taking my branch state and moving forward as you had intended:

git remote add aw [email protected]:anthonywu/mflux.git
git fetch aw
# from your inpaint branch here
git reset --hard aw/inpaint-0.7.0

If you make progress after this, I'll rebase on your canonical work in this upstream repo.

@filipstrand (Owner, Author)

> I have attempted a merge against the latest main from my fork's branch inpaint-0.7.0, which also builds on my fork's cleanup-0.6.0 (see PR #138). Maybe you can fast-forward your work by taking my branch state and moving forward as you had intended.

Awesome, thanks for handling the conflicts and cleanup!
