How can I get the agent to click this checkbox and tick it? #56

Open
herist opened this issue Sep 9, 2024 · 2 comments

Comments

@herist

herist commented Sep 9, 2024

image

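I tried phrasing the instruction like this:
"Click the checkbox to the left of [我已阅读并同意] ("I have read and agree") to tick it,"
but OCR recognition seems to be centered on "points", so the click never lands on the checkbox.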

image

@junyangwang0410
Collaborator

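You can add a description like this to add_info: You need to tick the checkbox. To do this, click the position that has the same vertical coordinate (y) as "立即注册" ("Register now") and the same horizontal coordinate (x) as "《用户服务协议》" ("User Service Agreement").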
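As a rough illustration of the suggestion above, the hint could be written as follows. This is only a sketch: it assumes `add_info` is a plain string that the run script passes into the agent's prompt, and the exact wording of the hint is a guess that may need tuning.

```python
# Assumption: add_info is a plain string variable read by the run script
# and appended to the agent's prompt as extra guidance.
add_info = (
    "You need to tick the checkbox. To do this, tap the position that has "
    "the same vertical coordinate (y) as \"立即注册\" and the same horizontal "
    "coordinate (x) as \"《用户服务协议》\"."
)
```

The on-screen labels are kept in their original language so that they match the text that OCR detects on the page.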

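Because this box has no distinctive features, it is hard to detect and hard to describe in a recognizable way, so it is a fairly intractable case. You can still try the "reference object" approach: find elements that are easy to locate, then let the model infer the coordinates of the click target from how it relates to them.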
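The geometry behind the "reference object" idea can be sketched in a few lines. This is an illustration only, not the agent's internal code: the helper names and the bounding boxes are hypothetical, and boxes are assumed to be `(x1, y1, x2, y2)` in screen pixels.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels

def center(box: Box) -> Tuple[int, int]:
    """Return the integer center point of a bounding box."""
    x1, y1, x2, y2 = box
    return (x1 + x2) // 2, (y1 + y2) // 2

def point_from_references(x_ref: Box, y_ref: Box) -> Tuple[int, int]:
    """Take x from the center of x_ref and y from the center of y_ref."""
    x, _ = center(x_ref)
    _, y = center(y_ref)
    return x, y

# Made-up OCR boxes for the two reference texts (hypothetical values).
agreement_box = (300, 1500, 620, 1550)  # "《用户服务协议》"
register_box = (100, 1700, 980, 1800)   # "立即注册"

tap_point = point_from_references(x_ref=agreement_box, y_ref=register_box)
print(tap_point)  # (460, 1750)
```

In the thread's approach the model does this inference itself from the prompt; the code only shows what the combined coordinate means.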

@dl-robert

I see the same behavior in my tests. With reflection enabled, it manages to click the checkbox after several retries.
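The retry behavior described here can be pictured as an act-then-verify loop. The sketch below is a generic stand-in, not the agent's actual reflection mechanism: `tap` and `is_checked` are hypothetical callbacks that a caller would have to supply.

```python
from typing import Callable

def tap_until_checked(
    tap: Callable[[int], None],
    is_checked: Callable[[], bool],
    max_attempts: int = 5,
) -> int:
    """Tap, then verify; return the attempt number that succeeded, or -1."""
    for attempt in range(1, max_attempts + 1):
        tap(attempt)       # perform the action (hypothetical callback)
        if is_checked():   # verify the result before moving on
            return attempt
    return -1

# Simulated environment: the checkbox only registers on the third tap.
state = {"checked": False}

def fake_tap(attempt: int) -> None:
    if attempt >= 3:
        state["checked"] = True

attempts_used = tap_until_checked(fake_tap, lambda: state["checked"])
print(attempts_used)  # 3
```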
