llava-cli: format batch --image descriptions according to --template #8637
+27
−2
Problem. Using

    llava-cli --image 1.jpg --image 2.jpg...

in batch mode generates several image descriptions in succession. Keeping the model in memory allows for faster generation, but the output format is not very useful: the image file name is not mentioned.

Rationale. The expedient choice would have been a fixed output format, such as JSON. But in keeping with llama.cpp's versatility with command-line options, it seemed reasonable to let the user specify their own data format.
Improvisation. This pull request introduces an optional `--template` argument to format the output of bulk image descriptions. If `--template` is not supplied, output is exactly as it was before, so the change is backward compatible.

Help screen. The following line is added to the `-h` help message.

    --template STRING output template replaces [image] and [description] with generated output
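The substitution that the help text describes can be sketched in bash. This is only an illustration of the placeholder semantics, not the C++ code in this pull request: `render_template` is a hypothetical helper name, and the HTML template and sample values are made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of llava-cli): fill in a template the way
# the --template help text describes, replacing the literal placeholders
# [image] and [description].
render_template() {
    local template=$1 image=$2 description=$3
    local out
    # The brackets are escaped so they match literally instead of acting
    # as a glob character class. Quoting the replacement keeps characters
    # such as '&' literal on newer bash versions.
    out=${template//\[image\]/"$image"}
    out=${out//\[description\]/"$description"}
    printf '%s\n' "$out"
}

render_template '<img src="[image]" alt="[description]">' 'cat.jpg' 'A cat on a sofa.'
# prints: <img src="cat.jpg" alt="A cat on a sofa.">
```

Because the template is an ordinary string, the same mechanism could emit CSV rows, JSON fragments, or HTML, which is the flexibility the rationale above argues for.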
Prerequisites. For this example, we create a shell script, `describe.sh`, to launch any particular llava model and options (yours will be different).

Next, we `cd` to a directory containing a few images and demonstrate the new `--template` option!

The `printf %q` outputs file names with spaces and special characters properly escaped. We could have used `find` instead. The `nullglob` option to `shopt` is necessary to prevent `bash` from causing errors: by default, if no images match a pattern, bash passes the literal glob pattern through as if it were one of the images. Setting `nullglob` turns that behavior off.

Photos are processed one by one, formatting the output according to the template we provided. We now have a `data` file that looks like this.

We could have cleaned it up to make a proper HTML page. But tools like HTML `tidy` already exist for that.

    tidy -i -o album.html data
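To make the globbing point above concrete, here is a minimal sketch of collecting `--image` arguments safely. It assumes bash and swaps in an array for the `printf %q` approach, since an array keeps each file name as a single argument without any re-parsing. `collect_image_args` and `IMAGE_ARGS` are hypothetical names chosen for this sketch, and the last line only prints the command that would be run rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build "--image FILE" pairs for every file in the
# current directory that matches the given glob patterns.
# The result is stored in the global array IMAGE_ARGS.
collect_image_args() {
    local restore pattern file
    restore=$(shopt -p nullglob)   # remember the caller's nullglob setting
    shopt -s nullglob              # unmatched patterns now expand to nothing
    IMAGE_ARGS=()
    for pattern in "$@"; do
        # $pattern is deliberately unquoted so the shell expands the glob.
        for file in $pattern; do
            IMAGE_ARGS+=(--image "$file")
        done
    done
    eval "$restore"                # put nullglob back the way it was
}

collect_image_args '*.jpg' '*.png'

# Show the command line in escaped form; %q quotes spaces and special
# characters so the printed line could be pasted back into a shell.
printf '%q ' ./describe.sh "${IMAGE_ARGS[@]}"
echo
```

Without `nullglob`, a directory containing no `.png` files would yield a literal `*.png` entry in the argument list, which is the error described above.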
As you can see, the new `--template` option makes it much easier to build web pages from AI-generated image descriptions.