
Consistency changes #408

Merged · 1 commit merged into main from container-consistency on Nov 5, 2024

Conversation

@ericcurtin (Collaborator)

Ensure the llama.cpp version is the same across containers. Remove some duplicate actions, etc.

@ericcurtin (Collaborator, Author)

PTAL @bmahabirbu. FWIW, I didn't test the CUDA changes; I don't have the hardware.

@bmahabirbu (Collaborator) commented Nov 4, 2024

I tested it and it crashed before loading the model with a segmentation fault. I originally thought it had something to do with the linker not picking up the correct libggml.so file, but it's actually due to the second `cmake --install` line for whisper. It overwrites some llama files despite the mv commands, which causes the issues:

```
mv build/bin/main /usr/bin/whisper-main && mv build/bin/server /usr/bin/whisper-server && \
if [ -f build/lib/libwhisper.so ]; then mv build/lib/libwhisper.so /usr/lib/; fi && \
cmake --install build && \
mv build/bin/main ${INSTALL_PREFIX}/bin/whisper-main && \
```
@bmahabirbu (Collaborator) commented Nov 4, 2024

Line 34:

```
cmake --install build && \
```

Comment this out and we're good to go.

@ericcurtin (Collaborator, Author)

This line installs the libwhisper stuff:

```
$ ls /usr/lib64/libw*
/usr/lib64/libwhisper.so  /usr/lib64/libwhisper.so.1  /usr/lib64/libwhisper.so.1.7.1
```

@ericcurtin force-pushed the container-consistency branch from b1a89f7 to db9501e on November 4, 2024 at 21:43
```diff
@@ -46,16 +45,14 @@ ARG HUGGINGFACE_HUB_VERSION=0.26.2
 ARG OMLMD_VERSION=0.1.6

 # Install minimal runtime dependencies
-RUN dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm && \
-    dnf install -y python3 python3-pip && dnf clean all && rm -rf /var/cache/*dnf*
+RUN dnf install -y python3 python3-pip nvidia-driver-cuda-libs && \
```
@ericcurtin (Collaborator, Author) commented Nov 4, 2024

I had to add this (nvidia-driver-cuda-libs) for /lib64/libcuda.so.1:

```
$ ldd /usr/bin/llama-cli
	linux-vdso.so.1 (0x00007f219dc64000)
	libllama.so => /lib64/libllama.so (0x00007f219dae4000)
	libggml.so => /lib64/libggml.so (0x00007f2188000000)
	libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f2187c00000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f219da09000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f219d9ee000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f2187800000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f219dc66000)
	libcudart.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libcudart.so.12 (0x00007f2187400000)
	libcublas.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libcublas.so.12 (0x00007f2180a00000)
	libcublasLt.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libcublasLt.so.12 (0x00007f215f000000)
	libcuda.so.1 => /lib64/libcuda.so.1 (0x00007f215c000000)
	libgomp.so.1 => /lib64/libgomp.so.1 (0x00007f219d9a5000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f219d99e000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f219d999000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f219d994000)
```

Which makes me wonder: how was this working before, without explicitly installing that?

@bmahabirbu (Collaborator) commented Nov 4, 2024

It's due to the CUDA container toolkit passing the library through from the NVIDIA driver on the host to the container. If you install the driver inside the container, you'll get an error like this:

```
Error: OCI runtime error: error executing hook `/usr/bin/nvidia-container-toolkit`
```
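For illustration, a minimal sketch of how the injected library can be checked from inside a container; the image name is a placeholder, and this assumes podman with the nvidia-container-toolkit CDI spec generated:

```
# Hypothetical check: the nvidia-container-toolkit hook bind-mounts the host
# driver's libcuda.so.1 into the container, so the file shows up even though
# no driver package is installed in the image.
podman run --rm --device nvidia.com/gpu=all <cuda-image> ls -l /lib64/libcuda.so.1
```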

@ericcurtin (Collaborator, Author)

Should we remove nvidia-driver-cuda-libs? Is it breaking nvidia-container-toolkit?

@ericcurtin (Collaborator, Author)

I wonder whether we can CI some of this stuff even without NVIDIA hardware in the CI runners.
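A GPU-less runner could at least build the CUDA image, since compiling against the CUDA toolkit needs no GPU; only running the binaries does. A minimal sketch, where the Containerfile path and tag are assumptions:

```
# Build-only smoke test: no GPU required on the runner.
podman build -t cuda-smoke-test -f container-images/cuda/Containerfile .
```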

@ericcurtin (Collaborator, Author)

> I tested it and it crashed before loading the model with a segmentation fault. I originally thought it had something to do with the linker not picking up the correct libggml.so file, but it's actually due to the second `cmake --install` line for whisper. It overwrites some llama files despite the mv commands, which causes the issues.

Yes, I see what you mean; we have two different versions of /usr/lib64/libggml.so clashing.

@ericcurtin (Collaborator, Author)

We must somehow align them to the same version...

@ericcurtin (Collaborator, Author) commented Nov 4, 2024

I can think of at least three options to solve this:

1. Split llama.cpp and whisper.cpp into two Containerfiles.
2. Ensure at build time of the container images that the llama.cpp and whisper.cpp versions are ABI compatible.
3. Statically link libggml.so into the whisper binaries.

@ericcurtin force-pushed the container-consistency branch from db9501e to 85ccafe on November 4, 2024 at 22:28
@ericcurtin (Collaborator, Author) commented Nov 4, 2024

I fixed it by building whisper with -DBUILD_SHARED_LIBS=NO, so using static linking for whisper.cpp (but not llama.cpp) seems like an OK option for now.
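For reference, a minimal sketch of what that whisper.cpp configure/build step might look like; everything apart from -DBUILD_SHARED_LIBS=NO is an assumption:

```
# Build whisper.cpp with static libraries so libggml is linked into the
# whisper binaries instead of installing a second, conflicting libggml.so.
cmake -B build -DBUILD_SHARED_LIBS=NO -DCMAKE_INSTALL_PREFIX=/usr
cmake --build build --config Release -j "$(nproc)"
```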

@bmahabirbu (Collaborator)

> I fixed it by building whisper with -DBUILD_SHARED_LIBS=NO, so using static linking for whisper.cpp (but not llama.cpp) seems like an OK option for now.

Nice, I'll give it a test run as well

@bmahabirbu (Collaborator)

> > I fixed it by building whisper with -DBUILD_SHARED_LIBS=NO, so using static linking for whisper.cpp (but not llama.cpp) seems like an OK option for now.
>
> Nice, I'll give it a test run as well

Confirmed working! (Had to remove the dnf install of nvidia-driver-cuda-libs, but that's just because I set my environment up with the CUDA container toolkit.)

@ericcurtin (Collaborator, Author)

> > > I fixed it by building whisper with -DBUILD_SHARED_LIBS=NO, so using static linking for whisper.cpp (but not llama.cpp) seems like an OK option for now.
> >
> > Nice, I'll give it a test run as well
>
> Confirmed working! (Had to remove the dnf install of nvidia-driver-cuda-libs, but that's just because I set my environment up with the CUDA container toolkit.)

Well, let's remove it if that's the case. Makes me wonder whether somebody could run this without nvidia-container-toolkit... but anyway, I don't want to break a working setup.

@ericcurtin force-pushed the container-consistency branch from 85ccafe to 6a48e33 on November 4, 2024 at 22:49
@bmahabirbu (Collaborator)

> > > > I fixed it by building whisper with -DBUILD_SHARED_LIBS=NO, so using static linking for whisper.cpp (but not llama.cpp) seems like an OK option for now.
> > >
> > > Nice, I'll give it a test run as well
> >
> > Confirmed working! (Had to remove the dnf install of nvidia-driver-cuda-libs, but that's just because I set my environment up with the CUDA container toolkit.)
>
> Well, let's remove it if that's the case. Makes me wonder whether somebody could run this without nvidia-container-toolkit... but anyway, I don't want to break a working setup.

Sounds good to me! I'll investigate some more to see if there is a way to run it without the container toolkit.

@ericcurtin force-pushed the container-consistency branch 7 times, most recently from f897510 to e74c28e, on November 5, 2024 at 01:36
Ensure llama.cpp version is the same across containers. Removing some duplicate actions, etc.

Signed-off-by: Eric Curtin <[email protected]>
@ericcurtin force-pushed the container-consistency branch from e74c28e to 7d657f8 on November 5, 2024 at 03:48
@ericcurtin (Collaborator, Author) commented Nov 5, 2024

PTAL @rhatdan, this is ready. A lot of this PR brings consistency to the build and install approaches in the various container images.

One thing this fixes: llama.cpp and whisper.cpp use different versions of libggml, and we were sometimes getting mismatched ABIs when a function call existed in one version but not the other. The second installed version of libggml would overwrite the first.

We fix this by statically linking libggml into the two whisper.cpp binaries, so llama.cpp can use a different version of libggml, as upstream llama.cpp/whisper.cpp intends.
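A quick, illustrative way to sanity-check the result from inside the image (binary names follow the Containerfile snippets above):

```
# With libggml statically linked, the whisper binaries should show no runtime
# dependency on libggml.so, while llama-cli still links it dynamically.
ldd /usr/bin/whisper-server | grep libggml   # expect no output
ldd /usr/bin/llama-cli | grep libggml        # expect libggml.so => /lib64/libggml.so
```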

@rhatdan merged commit 0048ee3 into main on Nov 5, 2024
12 checks passed
@ericcurtin deleted the container-consistency branch on November 5, 2024 at 14:31