Evaluation code for benchmark #16
Comments
@iisxuwei, thanks for your interest! We will release the evaluation code for the benchmark very soon, during this holiday season. Stay tuned! It is indeed unfair to compare with other methods that were evaluated on the full validation set, but unfortunately we did not have enough quota to call GPT-4V. We did evaluate some methods on our samples and noted this in our table. We are looking into how to set up a better evaluation pipeline for all methods. Thanks!
Hi, I'm wondering about some of the metrics in the evaluation benchmark, like mIoU and [email protected]. On the benchmark page, the prompts for REC and RES are the same, and GPT's return is a mark number or a range of mark numbers. How do you compute the metrics from GPT's return?
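For reference, here is a minimal sketch (not the authors' released evaluation code) of one common way to score mark-based GPT-4V outputs for REC/RES, assuming each mark index returned by GPT maps to a candidate box/mask produced by the marking stage; the field names (`pred_marks`, `mark_boxes`, `mark_masks`, `gt_box`, `gt_mask`) are hypothetical.

```python
# Hypothetical sketch: accuracy at IoU 0.5 for REC and mIoU for RES,
# given GPT's returned mark indices and per-mark candidate boxes/masks.
import numpy as np

def box_iou(box_a, box_b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def mask_iou(mask_a, mask_b):
    """IoU of two boolean segmentation masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / (union + 1e-9)

def evaluate(samples, iou_thresh=0.5):
    """samples: list of dicts with keys
       'pred_marks'  - mark indices parsed from GPT's reply,
       'mark_boxes'  - per-mark candidate boxes from the marking stage,
       'mark_masks'  - per-mark candidate masks from the marking stage,
       'gt_box' / 'gt_mask' - ground-truth annotation."""
    rec_hits, res_ious = [], []
    for s in samples:
        # REC: use the first predicted mark's box; a hit if IoU exceeds the threshold.
        pred_box = s["mark_boxes"][s["pred_marks"][0]]
        rec_hits.append(box_iou(pred_box, s["gt_box"]) > iou_thresh)
        # RES: union the masks of all predicted marks, accumulate IoU for mIoU.
        pred_mask = np.any([s["mark_masks"][m] for m in s["pred_marks"]], axis=0)
        res_ious.append(mask_iou(pred_mask, s["gt_mask"]))
    return {"acc@0.5": float(np.mean(rec_hits)), "mIoU": float(np.mean(res_ious))}
```

Whether the official evaluation takes the first mark, the union of marks, or something else when GPT returns a range of mark numbers is exactly what the question above asks; the sketch only illustrates the general scoring pipeline.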
Hi! P.S. I am very confused about the RefCOCOg results in the experimental results section (Table 2).
I hope you can reply as soon as possible. I am very interested in your work and would like to cite it. Thank you!
Hi, I'm very interested in your work and would like to know whether the evaluation code for the benchmark will be released. Also, isn't selecting only 100 images for evaluation too few and potentially unfair? Although it seems there is no alternative, given the API limitations of GPT-4V.
Looking forward to your reply.