Hi, I am trying to run "contribution_visual_reps.py", but the call 'inpt = vllm.get_llm_input_embeds([prompt], [img])' fails in the following check:
`
# 5. Fill the embeddings corresponding to the images. Anything that is still zeros needs filling
image_to_overwrite = torch.all(final_embedding == 0, dim=-1)
image_to_overwrite &= image_to_overwrite.cumsum(-1) - 1 >= nb_image_pad[:, None].to(target_device)
if image_to_overwrite.sum() != image_features.shape[:-1].numel():
    raise ValueError(
        f"The input provided to the model are wrong. The number of image tokens is {torch.sum(special_image_token_mask)} while"
        f" the number of image given to the model is {num_images}. This prevents correct indexing and breaks batch generation."
    )
`
The two values being compared are:
image_to_overwrite.sum() Out[2]: tensor(331776, device='cuda:0')
image_features.shape[:-1].numel() Out[3]: 576
Have you run into the same problem? And which version of transformers do you use? Mine is 4.46.2.
I wrap all VLLMs in a shell based on BaseVLLMForEdit to share various functions required for VLLM editing. The get_llm_input_embeds function needs to be specifically implemented for each VLLM, taking [prompt] and [img] as inputs and outputting the corresponding embeddings, which can then be directly fed into the language transformer in the VLLM. In contribution_visual_reps.py, I used LLaVA as an example. Therefore, I suggest you debug the get_llm_input_embeds function I wrote for LLaVA, which is located in editor/vllms_for_edit/llava/llava.py.
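As a starting point for that debugging, here is a minimal, dependency-free sketch of a sanity check on the image placeholder tokens. It assumes a ViT-L/14 vision tower at 336 px (which would give the 576 feature vectors reported above) and an image token id of 32000. Both are assumptions to verify against your own checkpoint's config, and the helper names are hypothetical, not part of this repository or of transformers. One thing worth checking is whether the processor has already expanded each image placeholder into one token per patch while the merge step still expects a single placeholder per image. This is only a possible cause, not a confirmed diagnosis.

```python
def expected_patch_count(image_size: int, patch_size: int) -> int:
    """Number of patch embeddings a ViT yields per image (CLS token excluded)."""
    return (image_size // patch_size) ** 2


def count_image_tokens(input_ids, image_token_id):
    """Count image placeholder tokens in a flat list of token ids."""
    return sum(1 for tok in input_ids if tok == image_token_id)


def diagnose(input_ids, image_token_id, num_images, patches_per_image):
    """Classify how the prompt encodes its images (hypothetical helper)."""
    n_tokens = count_image_tokens(input_ids, image_token_id)
    if n_tokens == num_images:
        return "one placeholder per image"
    if n_tokens == num_images * patches_per_image:
        return "placeholders already expanded"
    return "unexpected placeholder count"


IMAGE_TOKEN_ID = 32000  # assumption: check image_token_index in your model config
patches = expected_patch_count(336, 14)  # assumption: ViT-L/14 at 336 px

# Toy prompts: one with a single placeholder, one with an expanded run.
single = [1, IMAGE_TOKEN_ID, 5, 6]
expanded = [1] + [IMAGE_TOKEN_ID] * patches + [5, 6]

print(patches)                                          # 576
print(patches * patches)                                # 331776
print(diagnose(single, IMAGE_TOKEN_ID, 1, patches))     # one placeholder per image
print(diagnose(expanded, IMAGE_TOKEN_ID, 1, patches))   # placeholders already expanded
```

Note that the reported 331776 equals 576 × 576, so comparing the placeholder count in your actual `input_ids` against the number of image feature vectors may help narrow down where the two sides of the check diverge.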