-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vulkan: multithread pipeline creation #963
Conversation
Thank you! It works, shortens startup greatly. Though, does it suppose to use CPU and not GPU? |
* ggerganov/ggml#963 * If both `ggml-vulkan-shaders.cpp` and `ggml-vulkan-shaders.hpp` exist, `shaders-gen` will be skipped * Latest commits from llama.cpp
All compiles occur on the CPU, this just uses multiple threads to do them in parallel. |
Ping for review. |
cc @0cc4m |
src/ggml-vulkan.cpp
Outdated
@@ -1,3 +1,6 @@ | |||
// SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved. | |||
// SPDX-License-Identifier: MIT | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this copyright note is appropriate, as mentioned on #961.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
d07ccb8
to
6f374a5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works well on Linux. I can't test it on Windows, but I assume it should be fine?
@MaggotHATE you tested it on Windows, right?
I do most of my development on Windows, it works well there. |
Yes, I did, although I use w64devkit, if that's important. I don't have MSVS installed to test there. |
src/ggml-vulkan.cpp
Outdated
std::mutex compile_count_mutex; | ||
std::condition_variable compile_count_cond; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be also static
?
Also, do these have to be global and not per-device?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since they define a thread amount limit based on the CPU, I think it's correct that they are global.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed them to static and left them global.
6f374a5
to
cbb2a28
Compare
No description provided.