Hi, nice work! I have a question about your 'object class' dimension. My understanding is that GRiT predicts a set of detected objects with corresponding captions, and these captions might not exactly match the wording of the original prompt. For example, if the original prompt is 'a cat', the captions of the detected objects could end up being 'an orange cat', 'grass', 'bench', .... How do I check that the object ('a cat') was successfully detected (as 'an orange cat')? Does the check use an LLM or CLIP similarity?
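To make the question concrete, here is a minimal sketch of the simplest check I can imagine: a whole-word match between the prompt's object name and each detected caption. This is purely my own assumption for illustration, not something taken from this repo; the function name `object_detected` and the small stop-word list are hypothetical.

```python
import re

# Hypothetical helper (not from the repo): naive whole-word matching
# between the prompt's object name and the detector's captions.
STOP_WORDS = {"a", "an", "the"}


def object_detected(target: str, captions: list[str]) -> bool:
    """Return True if all content words of `target` occur in a single caption."""
    # Lowercase, split into alphabetic words, and drop articles.
    words = [w for w in re.findall(r"[a-z]+", target.lower()) if w not in STOP_WORDS]
    if not words:
        return False
    for caption in captions:
        tokens = set(re.findall(r"[a-z]+", caption.lower()))
        # Compare whole words so that 'cat' does not match 'caterpillar'.
        if all(w in tokens for w in words):
            return True
    return False


captions = ["an orange cat", "grass", "bench"]
print(object_detected("a cat", captions))
print(object_detected("a dog", captions))
```

A check like this would miss synonyms (for example 'kitten' for 'cat'), which is why I am wondering whether an LLM or CLIP similarity is used instead.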