
Error while deserializing header: HeaderTooLarge #12492

Open

lvjingax opened this issue Dec 4, 2024 · 5 comments

Comments

lvjingax commented Dec 4, 2024

This error occurs when executing minicpm.py.
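
For context, the "HeaderTooLarge" error is raised by safetensors when it cannot parse a model shard, typically because the file is truncated or is an un-pulled Git LFS pointer rather than the real weights. Below is a minimal sketch for checking each shard; the model directory is a hypothetical placeholder and the safetensors package is assumed to be installed.

# Minimal sketch: try to parse every .safetensors shard in a model directory.
# The directory path below is a hypothetical placeholder.
import glob
import os

from safetensors import safe_open

model_dir = "/path/to/MiniCPM-V-2_6"  # hypothetical placeholder

for shard in sorted(glob.glob(os.path.join(model_dir, "*.safetensors"))):
    size_mb = os.path.getsize(shard) / 1e6
    try:
        with safe_open(shard, framework="pt") as f:
            n_tensors = len(f.keys())
        print(f"OK   {os.path.basename(shard)}: {size_mb:.1f} MB, {n_tensors} tensors")
    except Exception as e:
        # A shard of only a few hundred bytes is usually a Git LFS pointer stub.
        print(f"FAIL {os.path.basename(shard)}: {size_mb:.1f} MB, {e}")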

hkvision (Contributor) commented Dec 5, 2024

Can you provide more information so that we can help you diagnose the issue? e.g. the exact file/command you ran, the full error stack, and the platform/hardware you are running on. Thanks.

lvjingax (Author) commented Dec 9, 2024

Sorry, this has already been resolved; it was caused by an issue with the model files. However, I have run into a new problem: the program is suddenly killed while running the Python script. Can you help me check?

(llm) root@localhost://home/lvjingang01/MiniCPM-V# python minicpm.py
/root/miniforge3/envs/llm/lib/python3.11/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
/root/miniforge3/envs/llm/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
warn(
2024-12-06 21:02:00,642 - INFO - intel_extension_for_pytorch auto imported
2024-12-06 21:02:00,740 - INFO - vision_config is None, using default vision config
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 10.02it/s]
2024-12-06 21:02:02,143 - INFO - Converting the current model to sym_int4 format......
/root/miniforge3/envs/llm/lib/python3.11/site-packages/torch/nn/init.py:412: UserWarning: Initializing zero-element tensors is a no-op
warnings.warn("Initializing zero-element tensors is a no-op")
Killed

hkvision (Contributor) commented Dec 9, 2024

There seems to be no obvious error in the log. You still need to provide more information for us to locate the issue, e.g. the link to or the content of minicpm.py, the exact MiniCPM model you use, and the platform/hardware you run on. Thanks.

lvjingax (Author) commented Dec 9, 2024

> There seems to be no obvious error in the log. You still need to provide more information for us to locate the issue, e.g. the link to or the content of minicpm.py, the exact MiniCPM model you use, and the platform/hardware you run on. Thanks.

This is the source code of minicpm.py:
import os
import time

import torch
from PIL import Image
from ipex_llm.transformers import AutoModel
# from transformers import AutoModel
from transformers import AutoTokenizer

model_path = "/home/lvjingang01/git/MiniCPM-V-2_6/"
# model_path = "// MiniCPM-V-2_6/"
model = AutoModel.from_pretrained(model_path, trust_remote_code=True, load_in_low_bit="sym_int4")  # "fp8",
# optimize_model=True, modules_to_not_convert=["vpm", "resampler"]

model = model.eval()
model = model.float()

# model = model.half()  # /transformers/generation/utils.py", line 2415, in _sample
#     next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
#     RuntimeError: probability tensor contains either inf, nan or element < 0
# model = model.bfloat16()  # RuntimeError: unsupported dtype, only fp32 and fp16 are supported
model = model.to('xpu')

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)


def run_minicpm(image_path, question):
    # Run one chat turn on a single image and print latency/throughput stats.
    image = Image.open(image_path).convert('RGB')
    msgs = [{'role': 'user', 'content': question}]
    torch.xpu.synchronize()
    timeStart = time.time()

    res = model.chat(
        image=image,
        msgs=msgs,
        tokenizer=tokenizer,
        sampling=True,
        stream=True,
        temperature=0.7,
    )

    timeFirstRecord = False

    generated_text = ""
    for new_text in res:
        if not timeFirstRecord:
            torch.xpu.synchronize()
            timeFirst = time.time() - timeStart
            timeFirstRecord = True
        generated_text += new_text
        # print(new_text, flush=True, end='')

    torch.xpu.synchronize()
    timeCost = time.time() - timeStart
    token_count_input = len(tokenizer.tokenize(question))
    token_count_output = len(tokenizer.tokenize(generated_text))

    ms_first_token = timeFirst  # * 1000
    ms_rest_token = (timeCost - timeFirst) / (token_count_output - 1 + 1e-8) * 1000
    print("\ninput: ", question)
    print("output: ", generated_text)
    print("token count input: ", token_count_input)
    print("token count output: ", token_count_output)
    print("time cost(s): ", timeCost)
    print("First token latency(s): ", ms_first_token)
    print("After token latency(ms/token)", ms_rest_token)
    print("output token/s: ", token_count_output / timeCost)
    print("output char/s", len(generated_text) / timeCost)
    print("******** image path = ", image_path)
    print("_______________")

    print(res)


print("Start predict")
# run_minicpm('./test_image/guo.png', 'What are in the image?')
# run_minicpm('./cat.JPG', '这是什么品种的猫')
# run_minicpm('./dog.JPG', '这是什么?')
# run_minicpm('./green.JPG', '图片内是什么植物')
# run_minicpm('./umbrella.JPG', '图内的文字是什么意思?')
# run_minicpm('./road.JPG', '图片内是什么内容')
# run_minicpm('./tree.JPG', '图片内是什么植物')

# Directory containing the .jpg files
directory = 'Picture'

# Prompt used for each image
prompt = '请描述这张图片,危险吗?50字。'

for filename in os.listdir(directory):
    if filename.endswith('.jpg'):
        image_path = os.path.join(directory, filename)
        print(f"Found file: {filename}")
        run_minicpm(image_path, prompt)

hkvision (Contributor) commented Dec 9, 2024

We suspect the application gets killed due to OOM (out of memory).
Please follow the example we provide to run MiniCPM-V-2_6: https://github.com/intel-analytics/ipex-llm/blob/main/python/llm/example/GPU/HuggingFace/Multimodal/MiniCPM-V-2_6/chat.py
More specifically, use model.half() for the fp16 model and add modules_to_not_convert=["vpm", "resampler"] when loading the model. Please do not comment out this key code from our example. Thanks.
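
For reference, below is a minimal sketch of the loading pattern described above, reusing the model path from the user's script; the chat.py example linked above is the authoritative version.

# Minimal sketch of the recommended loading pattern (see the linked chat.py
# example for the authoritative version); the model path is taken from the script above.
from ipex_llm.transformers import AutoModel
from transformers import AutoTokenizer

model_path = "/home/lvjingang01/git/MiniCPM-V-2_6/"

# Keep the vision tower and resampler out of low-bit conversion, as advised above.
model = AutoModel.from_pretrained(model_path,
                                  trust_remote_code=True,
                                  load_in_low_bit="sym_int4",
                                  optimize_model=True,
                                  modules_to_not_convert=["vpm", "resampler"])
model = model.half()            # fp16 for the parts not converted to sym_int4
model = model.eval().to('xpu')  # move to the Intel GPU

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)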
