Consistency changes #408
Conversation
PTAL @bmahabirbu. I didn't test the CUDA changes FWIW, I don't have the hardware.
I tested it and it crashed before loading the model with a segmentation fault. I originally thought it had something to do with the linker not picking up the correct libggml.so file, but it's actually due to the second cmake --install line for whisper. It overwrites some llama.cpp files despite the mv commands, causing issues.
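A minimal sketch of the clash, with the source layout and install prefix assumed rather than taken from the Containerfile: both projects vendor their own ggml, so whichever install runs second wins.

# Both llama.cpp and whisper.cpp bundle their own libggml.so; installing
# each in turn into the same prefix means the second install silently
# overwrites the first, and the two projects' ggml ABIs no longer match.
$ cmake --install llama.cpp/build --prefix /usr    # installs llama.cpp's libggml.so
$ cmake --install whisper.cpp/build --prefix /usr  # replaces it with whisper.cpp's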
container-images/cuda/Containerfile
Outdated
mv build/bin/main /usr/bin/whisper-main && mv build/bin/server /usr/bin/whisper-server && \
if [ -f build/lib/libwhisper.so ]; then mv build/lib/libwhisper.so /usr/lib/; fi && \
cmake --install build && \
mv build/bin/main ${INSTALL_PREFIX}/bin/whisper-main && \
Line 34
cmake --install build && \
Comment this out and we're good to go
This line installs the libwhisper stuff:
$ ls /usr/lib64/libw*
/usr/lib64/libwhisper.so /usr/lib64/libwhisper.so.1 /usr/lib64/libwhisper.so.1.7.1
Force-pushed from b1a89f7 to db9501e
container-images/cuda/Containerfile
Outdated
@@ -46,16 +45,14 @@ ARG HUGGINGFACE_HUB_VERSION=0.26.2
ARG OMLMD_VERSION=0.1.6

# Install minimal runtime dependencies
RUN dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm && \
    dnf install -y python3 python3-pip && dnf clean all && rm -rf /var/cache/*dnf*
RUN dnf install -y python3 python3-pip nvidia-driver-cuda-libs && \
I had to add this (nvidia-driver-cuda-libs) for this lib:
/lib64/libcuda.so.1
$ ldd /usr/bin/llama-cli
linux-vdso.so.1 (0x00007f219dc64000)
libllama.so => /lib64/libllama.so (0x00007f219dae4000)
libggml.so => /lib64/libggml.so (0x00007f2188000000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f2187c00000)
libm.so.6 => /lib64/libm.so.6 (0x00007f219da09000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f219d9ee000)
libc.so.6 => /lib64/libc.so.6 (0x00007f2187800000)
/lib64/ld-linux-x86-64.so.2 (0x00007f219dc66000)
libcudart.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libcudart.so.12 (0x00007f2187400000)
libcublas.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libcublas.so.12 (0x00007f2180a00000)
libcublasLt.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libcublasLt.so.12 (0x00007f215f000000)
libcuda.so.1 => /lib64/libcuda.so.1 (0x00007f215c000000)
libgomp.so.1 => /lib64/libgomp.so.1 (0x00007f219d9a5000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f219d99e000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f219d999000)
librt.so.1 => /lib64/librt.so.1 (0x00007f219d994000)
Which makes me wonder: how was this working before without explicitly installing that?
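A quick way to check inside the container (a sketch; the binary path matches the ldd output above):

$ ldd /usr/bin/llama-cli | grep "not found"   # any hit means a runtime lib is missing
$ ldconfig -p | grep libcuda                  # shows where libcuda.so.1 resolves from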
It's due to the CUDA container toolkit passing the library from the NVIDIA driver on the host into the container. If you install the driver inside the container you'll get an error like this:
Error: OCI runtime error: error executing hook `/usr/bin/nvidia-container-toolkit`
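For reference, a hedged example of that flow; the image tag ramalama-cuda is hypothetical, and the CDI spec is assumed to have been generated on the host first:

# The toolkit's OCI hook bind-mounts libcuda.so.1 from the host driver,
# which is why the image itself should not ship nvidia-driver-cuda-libs.
$ sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
$ podman run --rm --device nvidia.com/gpu=all ramalama-cuda llama-cli --version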
Should we remove nvidia-driver-cuda-libs? Is it breaking nvidia-container-toolkit?
I wonder whether we can CI some of this stuff even without NVIDIA hardware in the CI runners.
Yes, I see what you mean, we have two different versions of /usr/lib64/libggml.so clashing.
We must somehow align them to the same version...
We have at least three options to solve this that I can think of: 1. split llama.cpp and whisper.cpp into two Containerfiles. Or...
Or...
Force-pushed from db9501e to 85ccafe
I fixed it by building whisper with -DBUILD_SHARED_LIBS=NO, so using static linking for whisper.cpp (but not llama.cpp) seems like an OK option for now.
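Roughly what that build looks like for whisper.cpp (a sketch: -DBUILD_SHARED_LIBS=NO is the flag mentioned above, the CUDA flag and binary path are assumed to match the image's existing build):

# Static libs mean libggml is baked into whisper's binaries instead of
# installing a shared libggml.so that clashes with llama.cpp's.
$ cmake -B build -DBUILD_SHARED_LIBS=NO -DGGML_CUDA=1
$ cmake --build build --config Release -j
$ mv build/bin/main /usr/bin/whisper-main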
Nice, I'll give it a test run as well.
Confirmed working! (Had to remove the dnf install of nvidia-driver-cuda-libs, but that's just due to me setting my environment up with the CUDA container toolkit.)
Well, let's remove it if that's the case. Makes me wonder if somebody could run this without the nvidia-container-toolkit... But anyway, I don't want to break a working setup.
Force-pushed from 85ccafe to 6a48e33
Sounds good to me! I'll investigate some more to see if there is a way to run it without the container toolkit.
Force-pushed from f897510 to e74c28e
Ensure llama.cpp version is the same across containers. Removing some duplicate actions, etc. Signed-off-by: Eric Curtin <[email protected]>
Force-pushed from e74c28e to 7d657f8
PTAL @rhatdan, this is ready. A lot of this PR is bringing consistency to the build and install approaches in the various container images. One thing this fixes: llama.cpp and whisper.cpp use different versions of libggml, and we were sometimes getting mismatched ABIs when a function call existed in one version but not the other, because the second installed version of libggml would overwrite the first. We fixed this by statically linking libggml into the two whisper.cpp binaries, so llama.cpp can use a different version of libggml, as upstream llama.cpp/whisper.cpp intends.
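A quick sanity check of the fix (a hedged sketch; binary names taken from the Containerfile hunks above):

# whisper's binaries should no longer depend on a shared libggml,
# while llama-cli still resolves its own /lib64/libggml.so.
$ ldd /usr/bin/whisper-main | grep libggml || echo "libggml statically linked"
$ ldd /usr/bin/llama-cli | grep libggml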