proposal: automatic output file naming #80

anthonywu · 2024-10-18T19:33:27Z

Currently the pattern for automatic image naming is image.* and the subsequent images are smartly indexed as image-N.* etc.

However, I think we can make some improvements:

always add the dimensions to the image name like image-512x512.png
automatically summarize the prompt into a file name - so if you had a prompt about a dog playing in a park, the filename can be summarized by some tool as dog-play-in-park.*. The tool can be something traditional/fast like nltk or some other modern embedding/ranking tool for finding the most relevant keywords.
allow users to opt-in to automatic placeholders such as

{iso_date} ISO date yyyy-mm-dd
{unix_timestamp} via date +%s or python str fmt
{seed}
other placeholders that represent the various params: guidance, quantize, etc

I'm not trying to scope-creep what we support, just to add enough that every file name can reasonably be expected to be unique (seed and prompt summary, at the minimum).
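A minimal sketch of how the opt-in placeholders could be expanded, assuming a plain `str.format` template. The function name `expand_placeholders` and the `{dimensions}` placeholder are hypothetical and only for illustration; they are not existing mflux names.

```python
import time
from datetime import date
from typing import Optional


def expand_placeholders(
    template: str,
    *,
    seed: int,
    width: int,
    height: int,
    today: Optional[date] = None,
    now: Optional[float] = None,
) -> str:
    """Fill the opt-in placeholders in an output file name template.

    `today` and `now` can be passed explicitly to get reproducible names.
    """
    today = today or date.today()
    now = time.time() if now is None else now
    values = {
        "iso_date": today.isoformat(),      # yyyy-mm-dd
        "unix_timestamp": str(int(now)),    # whole seconds, like `date +%s`
        "seed": str(seed),
        "dimensions": f"{width}x{height}",  # hypothetical placeholder name
    }
    # An unknown placeholder in the template raises KeyError.
    return template.format(**values)
```

A template such as `image-{dimensions}-{seed}-{iso_date}.png` would then yield names that differ whenever the seed, the size, or the date differs.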

We can also make an API for an OutputFileNamer: we'll provide reasonable defaults, and any users of the library can inject their own customization as needed.
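One possible shape for such an API, as a sketch only: a namer is any callable that maps the generation parameters to a file stem, and a helper falls back to a default when none is injected. All names here (`OutputFileNamer` as a callable type, `default_namer`, `resolve_output_path`) and the `-N` suffix numbering are assumptions for illustration, not the current mflux behavior.

```python
from pathlib import Path
from typing import Callable, Optional

# Hypothetical type: a namer maps generation parameters to a file stem.
OutputFileNamer = Callable[[dict], str]


def default_namer(params: dict) -> str:
    """Default stem: image-<width>x<height>-<seed>."""
    return f"image-{params['width']}x{params['height']}-{params['seed']}"


def resolve_output_path(
    directory: Path,
    params: dict,
    namer: Optional[OutputFileNamer] = None,
    ext: str = "png",
) -> Path:
    """Build the output path, appending -N if the name is already taken."""
    stem = (namer or default_namer)(params)
    candidate = directory / f"{stem}.{ext}"
    index = 1
    while candidate.exists():
        candidate = directory / f"{stem}-{index}.{ext}"
        index += 1
    return candidate
```

A library user could then pass something like `namer=lambda p: f"seed{p['seed']}"` to override the default without touching the CLI.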


azrahello · 2024-10-19T08:09:16Z

I would like to ask if it is possible to embed the prompt as EXIF data in a PNG file, so that when the image is imported into photographic software, the prompt appears as a description. In general, could the JSON created with the --metadata option also be embedded as EXIF data?

filipstrand · 2024-10-19T13:19:07Z

@azrahello We have been exporting this information with the generated PNG file for a while now (even if you do not include the --metadata flag). For example, for an image image.png generated by mflux, if you run

exiftool image.png

then you will see this kind of information (including the prompt):

File Type                       : PNG
File Type Extension             : png
MIME Type                       : image/png
Image Width                     : 1024
Image Height                    : 1024
Bit Depth                       : 8
Color Type                      : RGB
Compression                     : Deflate/Inflate
Filter                          : Adaptive
Interlace                       : Noninterlaced
Exif Byte Order                 : Big-endian (Motorola, MM)
Warning                         : Invalid EXIF text encoding for UserComment
User Comment                    : {'mflux_version': '0.2.1', 'model': 'schnell', 'seed': '1728244816', 'steps': '6', 'guidance': 'None', 'precision': 'mlx.core.bfloat16', 'quantization': 'None', 'generation_time': '610.10 seconds', 'lora_paths': 'None', 'lora_scales': 'None', 'prompt': 'blue bird', 'controlnet_image': 'None', 'controlnet_strength': 'None'}
Image Size                      : 1024x1024
Megapixels                      : 1.0

I did not spend too much time on this feature, and there are probably better ways to structure this information so that it can be read by various image applications. E.g., it would be nice if it were shown in the info section in macOS (right now it is not shown).
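For reading the information back, here is a small sketch that parses a User Comment string in the form shown in the exiftool output above. It assumes the string has already been extracted from the file (for example with exiftool) and that it is a Python-style dict literal, as in that output; the helper name `parse_user_comment` is hypothetical.

```python
import ast


def parse_user_comment(comment: str) -> dict:
    """Parse the dict-literal text stored in the User Comment field."""
    # literal_eval only accepts Python literals, so it does not execute code.
    metadata = ast.literal_eval(comment)
    if not isinstance(metadata, dict):
        raise ValueError("expected a dict literal in the User Comment field")
    return metadata


comment = (
    "{'mflux_version': '0.2.1', 'model': 'schnell', "
    "'seed': '1728244816', 'prompt': 'blue bird'}"
)
metadata = parse_user_comment(comment)
print(metadata["prompt"])  # prints: blue bird
```

Note that every value in the output above is a string, including 'None', so a consumer would have to convert fields such as the seed itself.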

filipstrand · 2024-10-19T13:21:41Z

@anthonywu I like your suggestions here

anthonywu · 2025-01-21T22:45:25Z

The intent here is to turn a long prompt paragraph into a few keywords for purposes of generating a file name from a user prompt.

After some research, I considered using the keybert library: https://github.com/MaartenGr/KeyBERT but here I will pause to consider the complexities that would be introduced as config expectations from maintainers and users:

choose an embedding model
if choosing a model such as sentence_transformer, then choose a non-default model from its model list - if we just choose a default but allow overrides, that adds several flags to clutter up the CLI interface
choose default output file max length, which informs how many "top N" keywords or key phrases we select
installing any of these libraries with pip adds dependencies to the mflux project; for example, pip install keybert would add joblib, scikit-learn, scipy, sentence-transformers, and threadpoolctl

Feedback and ideas welcome from the community here. Add some convenience at the cost of complexity?
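For comparison, a dependency-free baseline could look like the sketch below. This is not KeyBERT and uses no embeddings: it is a plain word-frequency heuristic with a tiny illustrative stop-word list, and the function name and default limits are assumptions. It mainly shows the two knobs mentioned above (top N keywords and maximum file name length).

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real tool would use a fuller one.
STOP_WORDS = {
    "a", "an", "the", "in", "on", "at", "of", "and", "or",
    "with", "is", "are", "to", "for", "by",
}


def summarize_prompt(prompt: str, max_keywords: int = 4, max_length: int = 60) -> str:
    """Reduce a prompt to a short hyphen-joined slug for a file name."""
    words = [
        w for w in re.findall(r"[a-z0-9]+", prompt.lower())
        if w not in STOP_WORDS
    ]
    counts = Counter(words)
    first_seen = {}
    for position, word in enumerate(words):
        first_seen.setdefault(word, position)
    # Rank by frequency, breaking ties by first appearance in the prompt.
    ranked = sorted(counts, key=lambda w: (-counts[w], first_seen[w]))
    # Keep the top N, then restore the order in which they appear.
    top = sorted(ranked[:max_keywords], key=first_seen.get)
    return "-".join(top)[:max_length].rstrip("-")
```

With this heuristic, `summarize_prompt("A dog playing in the park")` gives `dog-playing-park`. It will do much worse than an embedding-based ranker on long, descriptive prompts, which is the trade-off being weighed here.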

gmorain · 2025-01-22T07:03:10Z

Hello @anthonywu

I fully understand your concern about the pulled dependencies. Let me just suggest an alternative solution that would potentially "externalise" the complexity and dependencies, since naming the files is not really part of the generation process: as all the metadata (including the prompts) are embedded in the image, could that feature just be part of a secondary tool to "batch rename" the outputs (i.e. iterating over all the files in a folder and using your solution to extract keywords from the prompt recovered from the image metadata to rename each file)? It could become either another repository like the Streamlit frontend, or an optional component you could install with pip install mflux[renamer]. A little less user-friendly of course...

Best,
Gilles

anthonywu · 2025-01-22T18:02:34Z

I think an opt-in renamer would be a good option if metadata is available.

If the user did not elect to install the optional libraries the renamer can do a no-op and print a message.
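A rough sketch of that no-op behavior, assuming the optional dependency is detected by its top-level module name. The function name `run_renamer` and the default module name `keybert` are illustrative assumptions.

```python
import importlib.util
from typing import Callable


def run_renamer(rename: Callable[[], None], module_name: str = "keybert") -> bool:
    """Run `rename` only if the optional module is installed.

    Returns True if the renamer ran, False if it was skipped.
    `module_name` should be a top-level module name.
    """
    if importlib.util.find_spec(module_name) is None:
        print(
            f"Skipping rename: optional dependency '{module_name}' is not installed."
        )
        return False
    rename()
    return True
```

The check uses `importlib.util.find_spec`, so the optional library is not actually imported unless the renamer runs.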

Taking this further, using uv to run such a tool in its own venv is something I can explore. Perhaps this can be done in a single standalone .py script even.

anthonywu · 2025-01-28T19:12:04Z

Even though #120 satisfies some of my original request, I'll keep this open because I think I can get the date/timestamp and the dimensions into the standard naming conventions, as well as allow some third-party programmatic control via output namer hooks.

anthonywu mentioned this issue Oct 19, 2024

support metadata files as CLI arg supplier #81

Merged

filipstrand added the enhancement New feature or request label Dec 27, 2024

anthonywu mentioned this issue Jan 22, 2025

Batch Image Renamer tool as an isolated uv run script #120

Merged
