I wanted to express my gratitude to you again! Your work immensely inspired me.
I was wondering if you could kindly explain the relationship between the variables offset, batch_size, and len(feat)? What does offset do, and why does batch_size == len(offset) - 1? Does len(feat) equal batch_size?
Also, from your code I understand that BEiT3 can process a batch of image-text inputs, but SAM 2 does not seem to support batch processing (you used a for-loop). For example, can SAM process in parallel:
one image input with a batch of N prompts that correspond to N different objects, or
For an explanation of offset, take a look at #23 .
Based on that explanation, you can see why we use a for-loop for SAM inference: the items of feat have different shapes, so they cannot be concatenated into a single tensor for batch inference.
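To illustrate the relationship in the question (this is a hedged sketch of the common offset convention, not the repository's actual code, and `split_by_offset` is a hypothetical helper name): `offset` typically stores the boundary indices of each sample's slice inside one flattened batch. It starts at 0 and ends at the total item count, so it has batch_size + 1 entries, which is exactly why batch_size == len(offset) - 1, and splitting by it yields len(feat) == batch_size per-sample chunks:

```python
def split_by_offset(flat, offset):
    """Recover per-sample chunks from a flattened batch.

    flat   : sequence holding all samples' items back to back
    offset : boundary indices, starting at 0 and ending at len(flat),
             so len(offset) == batch_size + 1
    """
    return [flat[offset[i]:offset[i + 1]] for i in range(len(offset) - 1)]

# Example: a batch of 3 samples with 2, 1, and 3 items respectively.
flat = [10, 11, 20, 30, 31, 32]
offset = [0, 2, 3, 6]                 # batch_size + 1 = 4 boundaries
feat = split_by_offset(flat, offset)  # [[10, 11], [20], [30, 31, 32]]

assert len(feat) == len(offset) - 1   # len(feat) == batch_size == 3
```

Because the resulting chunks have different lengths, they cannot be stacked back into one fixed-shape tensor, which is the same reason the SAM inference loops over `feat` item by item.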