
support MiniCPM-V-2.6 #8967

Merged · 74 commits merged into ggerganov:master from prepare-PR-of-minicpm-v2.6 on Aug 16, 2024
Conversation

tc-mb
Contributor

@tc-mb tc-mb commented Aug 10, 2024

Dear llama.cpp Official,

Hi, I'm writing about our new PR for integrating our model MiniCPM-V 2.6 into llama.cpp. MiniCPM-V 2.6 is the latest and most capable model in the MiniCPM-V series: it is stronger than its predecessors and supports multi-image understanding as well as video understanding.

For video understanding, I have implemented functions such as video frame extraction in my fork. However, because that work introduces ffmpeg, it may cause environment and compilation issues on other devices, so I think it is best split into multiple PRs (see the sketch after this comment):

  1. This PR submits only the model changes, and I hope it can be merged soon so that the community can start using MiniCPM-V 2.6 via GGUF right away.
  2. A later PR will add support for video input, which will give us more time to discuss how llama.cpp can best integrate video understanding.

Best regards,
MiniCPM-V Official ^_^
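For context, here is a minimal sketch of the kind of uniform frame sampling described above. It uses OpenCV as a stand-in (the fork itself uses ffmpeg), and the frame limit is an illustrative assumption rather than the value used in the fork:

```python
# Illustrative only: sample a bounded number of evenly spaced frames from a video.
# OpenCV is used here for brevity; the actual fork relies on ffmpeg.
import cv2

MAX_NUM_FRAMES = 64  # assumed limit, purely for illustration

def sample_frames(video_path: str, max_frames: int = MAX_NUM_FRAMES):
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    step = max(1, total // max_frames)  # evenly spaced frame indices
    frames = []
    for idx in range(0, total, step):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)  # each BGR frame would be passed to the model as an image
    cap.release()
    return frames
```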

@yorkane

yorkane commented Aug 12, 2024

waiting for merge

@HaishengLiang

waiting for merge

@nanowell

waiting for merge

@Vaibhavs10 Vaibhavs10 requested a review from ggerganov August 16, 2024 10:50
@ggerganov ggerganov merged commit d565bb2 into ggerganov:master Aug 16, 2024
54 checks passed
@saket424

saket424 commented Aug 17, 2024

I have opened issue #9066, where I experienced a crash after this pull request was merged. The crash is unrelated to this MiniCPM-V-2.6 model itself. I hope you can reproduce the error.

@tc-mb
Contributor Author

tc-mb commented Aug 19, 2024

I have opened issue #9066, where I experienced a crash after this pull request was merged. The crash is unrelated to this MiniCPM-V-2.6 model itself. I hope you can reproduce the error.

Hello, I saw that the issue you mentioned is a crash in llava, but my changes only touch the minicpmv code. I'm not certain about the cause of that issue, but I suspect it is not a problem with this branch.
Could you test whether the crash also occurs on this branch before it was merged? Of course, if it is indeed a problem introduced by this PR, I will be very happy to help fix it.

@saket424

@tc-mb
The crash is not directly related to your MiniCPM-V 2.6 PR, except that there was no crash before your PR and there is one after it, owing to some uninitialized variables.

Here is a PR that appears to fix the issue I reported
#9082

Sorry for the false alarm

@tc-mb
Contributor Author

tc-mb commented Aug 19, 2024

@tc-mb The crash is not directly related to your MiniCPM-V 2.6 PR, except that there was no crash before your PR and there is one after it, owing to some uninitialized variables

Here is a PR that appears to fix the issue I reported #9082

Sorry for the false alarm

I'm glad your problem was solved.

@x4080

x4080 commented Aug 19, 2024

@tc-mb Can we use MiniCPM with a context cache, so that we can upload an image once and ask multiple questions referring to the same image?

@tc-mb
Contributor Author

tc-mb commented Aug 20, 2024

@tc-mb Can we use MiniCPM with a context cache, so that we can upload an image once and ask multiple questions referring to the same image?

Yes, the context is now cached.

You can run in interactive mode to ask multiple rounds of questions.

./llama-minicpmv-cli -m ../MiniCPM-V-2_6/model/ggml-model-Q4_K_M.gguf --mmproj ../MiniCPM-V-2_6/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -i

Or you can modify the minicpmv-cli code (which is more of an example) to achieve the functionality you want.

@yizhangliu

Eagerly awaiting...

@tc-mb tc-mb deleted the prepare-PR-of-minicpm-v2.6 branch August 20, 2024 11:09
if args.text_only:
    fname_middle = "text-"
    has_vision_encoder = False
elif args.minicpmv_projector is not None:
    fname_middle = "mmproj-"
    has_text_encoder = False
    has_minicpmv_projector = True
    minicpmv_version = 3
Collaborator

Is this line necessary? It overrides the minicpmv_version value set on the command line when converting MiniCPM-V 2.5, which results in a broken mmproj-model-f16.gguf.
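If the hard-coded assignment is indeed the problem, one possible guard is sketched below: default to version 3 only when no value was passed on the command line. This is an illustration, not the fix that was actually applied, and it assumes the argparse field is named minicpmv_version with a default of None:

```python
# Sketch only: keep the user-supplied --minicpmv_version (e.g. 2 for MiniCPM-V 2.5)
# and fall back to 3 (MiniCPM-V 2.6) only when none was given. Assumes the argparse
# field is `minicpmv_version` and defaults to None.
elif args.minicpmv_projector is not None:
    fname_middle = "mmproj-"
    has_text_encoder = False
    has_minicpmv_projector = True
    minicpmv_version = 3 if args.minicpmv_version is None else args.minicpmv_version
```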

@x4080

x4080 commented Aug 20, 2024

@tc-mb Can we use MiniCPM with a context cache, so that we can upload an image once and ask multiple questions referring to the same image?

Yes, the context is now cached.

You can run in interactive mode to ask multiple rounds of questions.

./llama-minicpmv-cli -m ../MiniCPM-V-2_6/model/ggml-model-Q4_K_M.gguf --mmproj ../MiniCPM-V-2_6/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -i

Or you can modify the minicpmv-cli code (which is more of an example) to achieve the functionality you want.

Cool, that's a great feature, thanks @tc-mb

@dewarrn1

Very cool! Are GPU operations supported at this time?

@tc-mb
Contributor Author

tc-mb commented Aug 23, 2024

Very cool! Are GPU operations supported at this time?

I have tested on Ubuntu with an NVIDIA 4090; it works and the speed looks good. You can use it in the following way.

make LLAMA_CUDA=1
Then add an appropriate -ngl parameter, for example:
./llama-minicpmv-cli -m ../MiniCPM-V-2_6/model/ggml-model-Q4_K_M.gguf --mmproj ../MiniCPM-V-2_6/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -p "What is in the image?" -ngl 100

@dewarrn1

Awesome, thanks!

@saket424

@tc-mb
Can you show us how to serve MiniCPM-V 2.6 with llama-server, so that we can send it OpenAI-compatible chat completion requests with base64-encoded images?
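For illustration, this is roughly the shape of the request being asked about here: a standard OpenAI-style chat completion payload with the image embedded as a base64 data URI. The endpoint URL and model name below are placeholders, and, as the reply that follows notes, llama-server did not yet support this for MiniCPM-V 2.6 at the time:

```python
# Illustrative sketch only: an OpenAI-compatible chat completion request that
# embeds an image as a base64 data URI. Endpoint and model name are placeholders;
# llama-server did not yet support this for MiniCPM-V 2.6 when this was written.
import base64
import requests

with open("xx.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "model": "minicpm-v-2.6",  # placeholder model name
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in the image?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
}

resp = requests.post("http://localhost:8080/v1/chat/completions", json=payload)
print(resp.json()["choices"][0]["message"]["content"])
```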

@tc-mb
Contributor Author

tc-mb commented Aug 28, 2024

@tc-mb Can you show us how to serve MiniCPM-V 2.6 with llama-server, so that we can send it OpenAI-compatible chat completion requests with base64-encoded images?

Sorry, I didn't test the server path when making this update; I will support this capability in the near future.

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
* init

* rename

* add run android for termux in readme

* add android readme

* add instructions in readme

* change name in readme

* Update README.md

* fixed line

* add result in readme

* random pos_embed

* add positions index

* change for ollama

* change for ollama

* better pos_embed in clip

* support ollama

* update cmakelist

* update cmakelist

* rename wrapper

* clear code

* replace and organize code

* add link

* sync master

* fix warnings

* fix warnings

* fix bug in bicubic resize when resizing image smaller

* receive review comments and modify

* receive review comments and modify

* put all code into llava dir

* fix quality problem in pr code

* change n_layer

* add space in "-1"

* imitate reshape bug of python code

* fix bug in clip

* fix issues for merging

* fix llama-minicpmv-cli in cmake file

* change pr readme

* fix code review

* remove in line 33 directory in the /cmakelists.txt (not in example, in the main dir)

* fix cmakefile

* add warn

* fix KEY_HAS_MINICPMV_PROJ

* remove load_image_size into clip_ctx

* remove the extern "C", MINICPMV_API

* fix uhd code for review comment

* delete minicpmv-wrapper in pr

* remove uhd_image_embed

* Modify 2 notes

* support minicpmv2.6

* modify convert script of minicpmv

* modify convert

* modify convert

* add readme

* add resampler of v2.6

* modify clip

* modify readme

* fix type-check

* fix type-check

* fix type-check

* fix type-check

* modify convert script and readme

* fix convert script and readme

* fix convert

* fix num in convert

* fix type-check

---------

Co-authored-by: Hongji Zhu <[email protected]>
Co-authored-by: harvestingmoon <[email protected]>
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
@apepkuss

@tc-mb Could you please provide the templating info in README-minicpmv2.6.md, like the llava-cli templating and llava-1.6 prompting sections? For practical usage it is necessary to know how to organize the user question and the image, and also whether the image should be passed as bytes or base64. Thanks!

Labels
examples · python (python script changes) · Review Complexity : Medium (generally requires more time to grok but manageable by beginner to medium expertise level)