When using the NPU to run inference with the MiniCPM_2_6 model, I found that it splits Python into four parallel processes, which consumes a lot of memory.
GPU inference, by contrast, runs in a single process.
I could not find any setting in the demo code to force single-process inference on the NPU. How can this be configured?
For MiniCPM-V_2_6, we currently only support running on the NPU with multiple processes. We will post an update here if we add support for running this model in a single process :)