Question for "llava-hf/llava-1.5-7b-hf" #1

Open
sev777 opened this issue Jan 6, 2025 · 3 comments

sev777 commented Jan 6, 2025

Hi, I am trying to run `contribution_visual_reps.py`, but I cannot get past `inpt = vllm.get_llm_input_embeds([prompt], [img])` because of the following error:
```python
# 5. Fill the embeddings corresponding to the images. Anything that is still zeros needs filling
image_to_overwrite = torch.all(final_embedding == 0, dim=-1)
image_to_overwrite &= image_to_overwrite.cumsum(-1) - 1 >= nb_image_pad[:, None].to(target_device)

if image_to_overwrite.sum() != image_features.shape[:-1].numel():
    raise ValueError(
        f"The input provided to the model are wrong. The number of image tokens is {torch.sum(special_image_token_mask)} while"
        f" the number of image given to the model is {num_images}. This prevents correct indexing and breaks batch generation."
    )
```

And when I inspect the values: `image_to_overwrite.sum()` gives `tensor(331776, device='cuda:0')`, while `image_features.shape[:-1].numel()` is `576`.

Did you run into the same issue? Also, which version of transformers are you using? Mine is 4.46.2.
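
For reference, here is a small diagnostic sketch (the prompt string and the dummy image are only illustrative) that prints the installed transformers version and counts how many `<image>` placeholder tokens the processor actually emits. LLaVA-1.5 produces 24 × 24 = 576 patch features per image, and 331776 = 576 × 576, so the mismatch above looks like the placeholder being expanded a second time somewhere in the pipeline, a behaviour that changed between transformers releases.

```python
# Diagnostic sketch: check the transformers version and how many <image>
# tokens the processor emits for a single image. The prompt text and the
# dummy 336x336 image are placeholders for illustration only.
import transformers
from PIL import Image
from transformers import AutoProcessor

print("transformers:", transformers.__version__)  # 4.43.0 vs 4.46.2 behave differently here

processor = AutoProcessor.from_pretrained("llava-hf/llava-1.5-7b-hf")
img = Image.new("RGB", (336, 336))
inputs = processor(
    text="USER: <image>\nDescribe the image. ASSISTANT:",
    images=img,
    return_tensors="pt",
)

image_token_id = processor.tokenizer.convert_tokens_to_ids("<image>")
n_image_tokens = int((inputs.input_ids == image_token_id).sum())
# Older processors keep a single <image> token and let the model expand it to
# 576 patch positions; newer ones expand it to 576 tokens up front. If both
# expansions happen, the counts no longer match image_features (576 per image).
print("image tokens in input_ids:", n_image_tokens)
```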

qizhou000 (Owner) commented

Hello! My Version: 4.43.0

sev777 commented Jan 6, 2025

> Hello! My Version: 4.43.0

Thanks. And what is the shape of `image_to_overwrite`? I get a shape of 1×331776, which looks incorrect.

sev777 closed this as completed Jan 6, 2025
sev777 reopened this Jan 6, 2025
qizhou000 (Owner) commented

I wrap all VLLMs in a wrapper class based on `BaseVLLMForEdit` to share the various functions required for VLLM editing. The `get_llm_input_embeds` function needs to be implemented specifically for each VLLM, taking `[prompt]` and `[img]` as inputs and outputting the corresponding embeddings, which can then be fed directly into the language transformer of the VLLM. In `contribution_visual_reps.py`, I used LLaVA as an example. Therefore, I suggest you debug the `get_llm_input_embeds` function I wrote for LLaVA, which is located in `editor/vllms_for_edit/llava/llava.py`.
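
For anyone debugging this, below is a rough, self-contained sketch of what a `get_llm_input_embeds` for `llava-hf/llava-1.5-7b-hf` can look like. It is not the actual code in `editor/vllms_for_edit/llava/llava.py`: the class name `LlavaSketch`, the return values, and the merging strategy (a plain masked scatter of projected patch features into the `<image>` positions) are assumptions for illustration, and whether the `<image>` placeholder arrives pre-expanded depends on the installed transformers version.

```python
# Illustrative sketch only; names and return format are assumptions, not the
# repo's BaseVLLMForEdit interface.
import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration


class LlavaSketch:
    def __init__(self, name="llava-hf/llava-1.5-7b-hf", device="cuda"):
        self.device = device
        self.processor = AutoProcessor.from_pretrained(name)
        self.model = LlavaForConditionalGeneration.from_pretrained(
            name, torch_dtype=torch.float16
        ).to(device)

    @torch.no_grad()
    def get_llm_input_embeds(self, prompts, imgs):
        cfg = self.model.config
        inputs = self.processor(text=prompts, images=imgs, return_tensors="pt").to(self.device)

        # Text token embeddings from the language model's embedding table.
        inputs_embeds = self.model.get_input_embeddings()(inputs.input_ids)

        # Vision features: CLIP hidden states -> select layer, drop CLS ->
        # project into the LLM's hidden size (576 patch embeddings per image
        # for LLaVA-1.5 at 336x336 resolution with 14x14 patches).
        vision_out = self.model.vision_tower(inputs.pixel_values, output_hidden_states=True)
        feats = vision_out.hidden_states[cfg.vision_feature_layer]
        if cfg.vision_feature_select_strategy == "default":
            feats = feats[:, 1:]
        image_features = self.model.multi_modal_projector(feats)

        # Overwrite the <image> placeholder positions with the projected
        # visual features. This assumes the processor has already expanded
        # each <image> placeholder into one token per patch (newer processors
        # do; older ones keep a single token and expand it inside the model).
        image_mask = (inputs.input_ids == cfg.image_token_index).unsqueeze(-1)
        inputs_embeds = inputs_embeds.masked_scatter(
            image_mask, image_features.to(inputs_embeds.dtype)
        )
        return inputs_embeds, inputs.attention_mask
```

The version-sensitive part is exactly who expands the placeholder: if both the processor and the model's legacy merge path try to do it, the number of image-token positions no longer matches the 576 patch features per image, which is consistent with the `ValueError` quoted above.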
