forked from ggerganov/llama.cpp
Commit
llama : add llama-simple-vision-mllama example (wip)
This commit adds a vision example that uses Llama 3.2 Vision Instruct to explore how a multi-modal cross-attention model works. The implementation is based on Ollama's multi-modal implementation, with some modifications to make it work with the new Vision API.

The motivation for this example is only to gain experience with multi-modal cross-attention and understand how it works. This is a bare-minimum approach to get something working and judge whether it is worth exploring further, but parts of it, such as the model conversion, might be useful on their own.

This is a work in progress, and there is currently an issue with the scheduler/graph computation that causes the model to behave unexpectedly.
Showing 38 changed files with 15,998 additions and 212 deletions.