Nice work! I have a question related to the above one. When dealing with a large-scale corpus, the KV cache will likely be stored in remote storage, which incurs network-communication costs. In that case, do you still see a clear win in inference speed-up?
It's not just the network cost: the more data you have, the slower it will be.
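One common mitigation for the remote-storage cost raised above is a small local read-through LRU layer, so hot KV entries avoid repeated network round-trips. This is only a sketch, not code from this repository; `ReadThroughCache` and the `fetch_remote` callable are hypothetical names standing in for whatever remote lookup the system actually uses.

```python
from collections import OrderedDict

class ReadThroughCache:
    """In-memory LRU layer in front of a (hypothetical) remote KV store.

    Reads are served locally when possible; the network cost of
    `fetch_remote` is paid only on a cache miss.
    """

    def __init__(self, fetch_remote, capacity=1024):
        self.fetch_remote = fetch_remote  # callable: key -> value (network call)
        self.capacity = capacity
        self.entries = OrderedDict()      # insertion order doubles as LRU order
        self.remote_fetches = 0           # counts actual remote round-trips

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        value = self.fetch_remote(key)     # miss: pay the network cost once
        self.remote_fetches += 1
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return value
```

Sizing `capacity` to the working set of hot documents keeps most lookups off the network; repeated reads of the same key hit the local layer and never reach remote storage.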
Xiaohua Yan ***@***.***> wrote on Monday, December 16, 2024 at 12:37:
When the number of files is large, KV cache lookups slow down considerably. Are there any ways to optimize this?