How to improve the execution speed of OCR, grounding-dino, and chatgpt-4o models to transition mobile-agent from laboratory research to engineering use?
#50 · Open · fredajiang opened this issue on Sep 4, 2024 · 1 comment
I replaced the original grounding-dino model with a GPU-supported version, reducing inference time from about 7 seconds to roughly 0.2 seconds. For details on the GPU version of grounding-dino, see: https://github.com/IDEA-Research/GroundingDINO
For the OCR model, is there a similarly faster GPU-supported version? Currently, each OCR operation takes approximately 3 seconds.
For calling chatgpt-4o, do you have any suggestions for improving its execution speed? At present, each call to chatgpt-4o takes approximately 6-7 seconds.
Looking forward to your response.
Hello. As you said, both the OCR model and GroundingDINO can be loaded onto the GPU. For the OCR model, you need to install the corresponding version of tensorflow-gpu.

There is currently no good way to reduce the call latency of the VLM itself. However, when deploying projects, we often use a parallel-agent approach. For example, the reflection agent can run concurrently with the planning agent for the next step: if the reflection agent concludes that the previous operation was correct, the planning call's latency is effectively hidden. And if you can accept some decline in model performance, gpt-4o-mini is a good choice.
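The parallel-agent idea above can be sketched with Python's standard `concurrent.futures`: run the reflection check for the last step concurrently with speculative planning for the next step, and only replan if reflection reports a failure. The `reflection_agent` and `planning_agent` functions here are hypothetical stand-ins (with `time.sleep` simulating model calls), not Mobile-Agent's actual API.

```python
import concurrent.futures
import time

def reflection_agent(observation):
    # Hypothetical stand-in: verify the previous action succeeded.
    time.sleep(0.1)  # simulates a VLM call
    return observation == "ok"

def planning_agent(state):
    # Hypothetical stand-in: decide the next action for the given state.
    time.sleep(0.1)  # simulates a VLM call
    return f"next-action-for:{state}"

def step(observation, state):
    """Run reflection and speculative planning concurrently.

    If reflection passes (the common case), the speculative plan is
    already ready, so per-step latency is ~max of the two calls
    instead of their sum. If reflection fails, we pay for a replan.
    """
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        reflect_fut = pool.submit(reflection_agent, observation)
        plan_fut = pool.submit(planning_agent, state)
        if reflect_fut.result():
            return plan_fut.result()       # speculative plan is valid
        return planning_agent("replan")    # reflection failed: replan
```

With two sequential 6–7 s model calls per step, this kind of overlap roughly halves the per-step latency whenever reflection passes, which is why it helps in deployment even though each individual call is no faster.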