Compilade/faster session sizes #284

Nexesenex · 2024-08-08T03:00:45Z

No description provided.


When using CMake to build with Vulkan support, compiling vulkan-shaders-gen fails due to missing a CMakeLists.txt specification to link vulkan-shaders-gen with the threading library, resulting in the following error.

    [5/172] Linking CXX executable bin/vulkan-shaders-gen
    FAILED: bin/vulkan-shaders-gen
    : && /usr/bin/c++ ggml/src/vulkan-shaders/CMakeFiles/vulkan-shaders-gen.dir/vulkan-shaders-gen.cpp.o -o bin/vulkan-shaders-gen && :
    ld: error: undefined symbol: pthread_create
    >>> referenced by vulkan-shaders-gen.cpp
    >>> ggml/src/vulkan-shaders/CMakeFiles/vulkan-shaders-gen.dir/vulkan-shaders-gen.cpp.o:(std::__1::__libcpp_thread_create[abi:se180100](pthread**,
    >>> void* (*)(void*), void*))
    c++: error: linker command failed with exit code 1 (use -v to see invocation)
    [6/172] Generating build details from Git
    -- Found Git: /usr/local/bin/git (found version "2.45.2")
    ninja: build stopped: subcommand failed.

Add the CMakeLists.txt specification to link vulkan-shaders-gen with the threading library and fix the above error.

Fixes #8834


ggerganov and others added 15 commits August 6, 2024 11:48

contributing : add note about write access

0bf16de

[Vulkan] Fix compilation of vulkan-shaders-gen on w64devkit after `e31a4f6` (#8880)

efda90c

* Fix compilation issue in `vulkan-shaders-gen`. e31a4f6 broke compilation on w64devkit. Including `algorithm` seems to fix that.
* Guard it under `#ifdef _WIN32`
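The guard described in this commit can be illustrated with a minimal sketch. The helper `to_native_path` is hypothetical and is not taken from `vulkan-shaders-gen`. It only shows the pattern of including `<algorithm>` on Windows and keeping the code that depends on it behind the same guard.

```cpp
#include <string>

// <algorithm> is included only on Windows, mirroring the guard named in the commit.
#ifdef _WIN32
#include <algorithm>
#endif

// Hypothetical helper (not from the real tool): convert a path to native separators.
std::string to_native_path(std::string path) {
#ifdef _WIN32
    // std::replace is declared in <algorithm>, so this call stays behind the same guard.
    std::replace(path.begin(), path.end(), '/', '\\');
#endif
    return path;
}
```

Keeping the include and its only user under one condition means non-Windows builds are unaffected by the change.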

simple : update name of executable to llama-simple (#8885)

5f4dcb1

This commit updates the name of the executable in README.md from `simple` to `llama-simple`.

CUDA: fix padding logic for FP16/FP32 (#8884)

641f5dd

server : add lora hotswap endpoint (WIP) (#8857)

1e6f655

* server : add lora hotswap endpoint
* handle lora_no_apply
* fix build
* update docs
* clean up struct def
* fix build
* add LoRA test
* fix style

typo correction (#8891)

3195854

quantize : update usage comment in quantize.cpp (#8889)

725e3d9

This commit updates the usage comment in quantize.cpp to reflect the new name of the executable, which is llama-quantize.

llama-bench : add support for getting cpu info on Windows (#8824)

506122d

* Add support for getting cpu info on Windows for llama_bench
* refactor

Co-authored-by: slaren <[email protected]>

CUDA/HIP: fix tests/test-backend-ops (#8896)

a8dbc6f

[SYCL] Updated SYCL device filtering (#8901)

0478174

* Updated device filter to depend on default_selector (fixes non-intel device issues)
* Small related update to example/sycl Readme

ggml-backend : fix async copy from CPU (#8897)

be55695

* ggml-backend : fix async copy from CPU
* cuda : more reliable async copy, fix stream used when the devices are the same

make : use C compiler to build metal embed object (#8899)

15fa07a

* make : use C compiler to build metal embed object
* use rm + rmdir to avoid -r flag in rm

llama : avoid useless copies in dummy session writer

dca7ad8
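The commit title suggests that a size-only ("dummy") writer does not need to touch the data it is asked to write. A minimal sketch of that idea follows. All names (`session_writer`, `counting_writer`, `write_session`, `session_size`) are hypothetical and are not llama.cpp's actual API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical writer interface; illustrative only, not llama.cpp's API.
struct session_writer {
    virtual void write(const void * src, size_t size) = 0;
    virtual size_t bytes_written() const = 0;
    virtual ~session_writer() = default;
};

// Size-only writer: advances a counter and never reads or copies `src`.
struct counting_writer : session_writer {
    size_t total = 0;
    void write(const void * /*src*/, size_t size) override { total += size; }
    size_t bytes_written() const override { return total; }
};

// Hypothetical serializer: an element count followed by the raw float data.
void write_session(session_writer & w, const std::vector<float> & state) {
    const uint64_t n = state.size();
    w.write(&n, sizeof(n));
    w.write(state.data(), n * sizeof(float));
}

// Computes the serialized size by running the serializer against the counting writer.
size_t session_size(const std::vector<float> & state) {
    counting_writer w;
    write_session(w, state);
    return w.bytes_written();
}
```

Because the same serializer drives both the real and the counting writer, the reported size should stay consistent with what an actual save would produce.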

llama : avoid double tensor copy when saving session to buffer

9329953
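One way to read this commit title: instead of reading tensor data into a temporary buffer and then copying it into the output, the writer reads straight into the destination. The sketch below assumes that reading and uses hypothetical types (`fake_tensor`, `buffer_writer`). The `read` method stands in for whatever backend read call the real code uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a tensor whose data lives in backend memory.
struct fake_tensor {
    std::vector<uint8_t> storage;

    // Copies `size` bytes starting at `offset` into `dst` (stand-in for a backend read).
    void read(void * dst, size_t offset, size_t size) const {
        if (offset > storage.size() || size > storage.size() - offset) {
            throw std::out_of_range("tensor read out of range");
        }
        std::memcpy(dst, storage.data() + offset, size);
    }
};

// Hypothetical writer over a caller-provided output buffer.
struct buffer_writer {
    uint8_t * ptr;
    size_t remaining;
    size_t written = 0;

    buffer_writer(uint8_t * dst, size_t capacity) : ptr(dst), remaining(capacity) {}

    // Reads tensor bytes directly into the output buffer: one copy, no temporary.
    void write_tensor_data(const fake_tensor & t, size_t offset, size_t size) {
        if (size > remaining) {
            throw std::runtime_error("output buffer too small");
        }
        t.read(ptr, offset, size);
        ptr += size;
        remaining -= size;
        written += size;
    }
};
```

The two-copy alternative would first fill a `std::vector<uint8_t>` from the tensor and then `memcpy` it into the output, so the direct read saves one pass over the data per tensor.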

Nexesenex merged commit 06bff76 into Nexesenex:lcpp_pr_faster_session_size Aug 8, 2024
6 of 9 checks passed

github-actions bot added examples python server ggml SYCL labels Aug 8, 2024
