Enable vllm multimodal minicpm-v-2-6 (#12074)
* enable minicpm-v-2-6

* add image_url readme
hzjane authored Sep 13, 2024
1 parent a767438 commit d703e4f
Showing 2 changed files with 35 additions and 0 deletions.
29 changes: 29 additions & 0 deletions python/llm/example/GPU/vLLM-Serving/README.md
@@ -128,6 +128,35 @@ curl http://localhost:8000/v1/completions \
}' &
```

##### Image input

Image input is currently supported only for [MiniCPM-V-2_6](https://huggingface.co/openbmb/MiniCPM-V-2_6).
```bash
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "MiniCPM-V-2_6",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
          "text": "What is in the image?"
},
{
"type": "image_url",
"image_url": {
"url": "http://farm6.staticflickr.com/5268/5602445367_3504763978_z.jpg"
}
}
]
}
],
"max_tokens": 128
}'
```
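The same request can be issued from Python. The sketch below builds the identical multimodal chat payload as a plain dictionary, ready to pass to any HTTP client; `build_image_chat_request` is an illustrative helper name, not part of the project's API.

```python
import json

# Hypothetical helper (illustrative name): builds the same multimodal
# chat payload as the curl example above.
def build_image_chat_request(model, text, image_url, max_tokens=128):
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": text},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": max_tokens,
    }

payload = build_image_chat_request(
    "MiniCPM-V-2_6",
    "What is in the image?",
    "http://farm6.staticflickr.com/5268/5602445367_3504763978_z.jpg",
)
print(json.dumps(payload, indent=2))
```

Posting this payload to `http://localhost:8000/v1/chat/completions` with a `Content-Type: application/json` header reproduces the curl call above.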

#### Tensor parallel

> Note: We recommend using Docker for tensor parallel deployment.
6 changes: 6 additions & 0 deletions python/llm/src/ipex_llm/vllm/xpu/model_convert.py
@@ -102,6 +102,12 @@ def _ipex_llm_load_model(self) -> None:
modules = ["35.mlp", "36.mlp", "37.mlp", "38.mlp", "39.mlp"]
else:
modules = None
if "minicpm" in self.model_config.model.lower():
modules = ["vpm", "resampler"]
    # merge_qkv is only needed for MiniCPM-V-2_6
if "minicpm-v" in self.model_config.model.lower():
from ipex_llm.transformers.models.minicpmv import merge_qkv
self.model.vpm.apply(merge_qkv)
optimize_model(self.model, low_bit=low_bit, torch_dtype=self.model_config.dtype,
modules_to_not_convert=modules)
self.model = self.model.to(device=self.device_config.device,
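The branch added by this diff keys off a substring of the model path: any MiniCPM model keeps its vision tower (`vpm`) and `resampler` out of low-bit conversion. A minimal standalone sketch of that selection logic (function name is illustrative, not the project's API):

```python
# Sketch of the model-name check added in the diff: decide which modules
# to exclude from low-bit conversion based on the model path.
def modules_to_not_convert(model_path):
    name = model_path.lower()
    if "minicpm" in name:
        # Keep the vision encoder and resampler in full precision.
        return ["vpm", "resampler"]
    return None

print(modules_to_not_convert("openbmb/MiniCPM-V-2_6"))
print(modules_to_not_convert("meta-llama/Llama-2-7b"))
```

The returned list is what the real code passes to `optimize_model(..., modules_to_not_convert=modules)`; for MiniCPM-V models specifically, the diff additionally applies `merge_qkv` to every submodule of the vision tower via `self.model.vpm.apply(merge_qkv)`.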
