[Visual Grounding] Using InternVL2.5-8B did not work well on the Ref-L4 dataset #868

Open

Liareee opened this issue Jan 17, 2025 · 1 comment

Liareee commented Jan 17, 2025

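Hello authors,

Thank you for contributing such an excellent open-source model!
I am currently testing the InternVL2.5 series on additional visual grounding datasets.
I used the prompt suggested in #359 and a similar eval implementation (with dynamic image preprocessing).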

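I have deployed InternVL2.5-8B and queried it for grounding. It does return bounding boxes, but they are inaccurate and the overall IoU is very low.
However, running the evaluation script you provide under eval/refcoco directly gives a result close to the one reported in the paper: 0.87.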
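For anyone reproducing this, a low IoU alongside plausible-looking boxes is often worth checking against the coordinate convention first. The following is only a minimal sketch, not the code from eval/refcoco. It assumes the model replies with a box written as `[[x1, y1, x2, y2]]` on a 0-1000 normalized scale, and that the ground-truth box is in pixel `(x1, y1, x2, y2)` form; both are assumptions to verify against the reference script and the dataset. `parse_box` and `iou` are hypothetical helper names.

```python
import re

# Assumption: the reply contains a box such as "[[x1, y1, x2, y2]]" with
# integer coordinates on a 0-1000 scale. Verify against the reference script.
BOX_PATTERN = re.compile(
    r"\[\[\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\]\]"
)


def parse_box(response, width, height, scale=1000.0):
    """Extract the first box from a reply and rescale it to pixel units."""
    match = BOX_PATTERN.search(response)
    if match is None:
        return None  # no box found in the reply
    x1, y1, x2, y2 = (float(v) for v in match.groups())
    # x values scale with image width, y values with image height.
    return (x1 * width / scale, y1 * height / scale,
            x2 * width / scale, y2 * height / scale)


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes in the same units."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

If the dataset stores ground truth as `[x, y, w, h]`, it would need converting to corner form before calling `iou`; comparing boxes in different formats or units yields a low IoU even when the prediction is good.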

Has anyone else run into a similar problem?
How might this be resolved? Thanks.

Collaborator

yuecao0119 commented Jan 20, 2025

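Hello,

Could you share your eval code? If the evaluation script under eval/refcoco reproduces similar results, the gap in your own run may come from hyperparameter settings, such as temperature or max-num.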
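One way to act on this is to list the settings of both runs side by side and look only at what differs. The sketch below is illustrative: `diff_settings` is a hypothetical helper, and the values are placeholders, not the actual defaults of the eval/refcoco script.

```python
def diff_settings(reference, custom):
    """Return {key: (reference_value, custom_value)} for settings that differ."""
    keys = sorted(set(reference) | set(custom))
    return {
        key: (reference.get(key), custom.get(key))
        for key in keys
        if reference.get(key) != custom.get(key)
    }


# Placeholder values for illustration only; read the real ones from the
# reference script's arguments and from your own deployment.
reference_run = {"temperature": 0.0, "max_num": 6, "dynamic": True}
custom_run = {"temperature": 0.8, "max_num": 12, "dynamic": True}

for name, (ref_value, own_value) in diff_settings(reference_run, custom_run).items():
    print(f"{name}: reference={ref_value} custom={own_value}")
```

A setting missing from one side shows up as `None`, which helps catch options that a serving framework sets implicitly.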
