forked from LostRuins/koboldcpp
Cuda iq opt 3 #196

Merged: Nexesenex merged 31 commits into Nexesenex:MMVQ_refactot from JohannesGaessler:cuda-iq-opt-3 on Jun 30, 2024.
Conversation
Nexesenex (Owner) commented on Jun 30, 2024:

- I have read the contributing guidelines
- Self-reported review complexity:
  - Low
  - Medium
  - High
* scripts : update sync [no ci]
* files : relocate [no ci]
* ci : disable kompute build [no ci]
* cmake : fixes [no ci]
* server : fix mingw build ggml-ci
* cmake : minor [no ci]
* cmake : link math library [no ci]
* cmake : build normal ggml library (not object library) [no ci]
* cmake : fix kompute build ggml-ci
* make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE ggml-ci
* move public backend headers to the public include directory (ggerganov#8122)
  * move public backend headers to the public include directory
  * nix test
  * spm : fix metal header
  Co-authored-by: Georgi Gerganov <[email protected]>
* scripts : fix sync paths [no ci]
* scripts : sync ggml-blas.h [no ci]

Co-authored-by: slaren <[email protected]>
* clip : suppress unused variable warnings
  This commit suppresses unused variable warnings for the variable `e` in the catch blocks. The motivation for this change is to suppress the warnings that are generated on Windows when using the MSVC compiler. The warnings are not displayed when using GCC, because GCC marks all catch parameters as used.
  Signed-off-by: Daniel Bevenius <[email protected]>
* squash! clip : suppress unused variable warnings
  Remove `e` (`/*e*/`) instead of using GGML_UNUSED.
  Signed-off-by: Daniel Bevenius <[email protected]>
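The technique is simply to leave the catch parameter unnamed; a minimal sketch (the `parse_int_or_default` helper is made up for illustration and is not the actual clip.cpp code):

```cpp
#include <exception>
#include <string>

// Hypothetical helper, used only to illustrate the unnamed-catch-parameter pattern.
static int parse_int_or_default(const std::string & s, int fallback) {
    try {
        return std::stoi(s);
    } catch (const std::exception & /*e*/) {
        // The parameter name is commented out, so MSVC no longer emits an
        // unused-variable warning and no GGML_UNUSED(e) macro call is needed.
        return fallback;
    }
}

int main() {
    return parse_int_or_default("not-a-number", 0);
}
```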
…nov#8145)
- The path seems to be wrong for the common.h header file in llama-android.cpp. Fixing the path so the Android build doesn't fail with the error "There is no file common/common.h".
* account for space prefix character
* use find instead
Co-authored-by: kustaaya <[email protected]>
* Add Qwen2MoE 57B-A14B
* Add Qwen2MoE 57B-A14B
* Delete examples/llama.android/llama/CMakeLists.txt (ggerganov#8145 (comment))
  This file is not being used for building on Android. `llama.cpp/examples/llama.android/llama/src/main/cpp/CMakeLists.txt` is being used instead.
* Update CMakeLists.txt
  Pick local llama.cpp files instead of fetching content from git.
* Fixed leak in llama_control_vector_load_one() and allow llama_control_vector_load() to grow
* refactored `llama_control_vector_load_one()`
* allow multiple directions for the same layer in the same file
* llama_control_vector_load_one() and llama_control_vector_load() now break on error
* removed unnecessary ggml_free() call
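The break-on-error idea above follows a common cleanup pattern; here is a minimal, generic sketch (the `Entry` struct, the text file format, and the `load_all` helper are invented for illustration and are not the actual control-vector loading code):

```cpp
#include <cstdio>
#include <vector>

// Hypothetical stand-ins for this sketch only.
struct Entry { int layer; float direction; };

static bool load_all(const char * path, std::vector<Entry> & out) {
    std::FILE * f = std::fopen(path, "r");
    if (!f) {
        return false;
    }
    bool ok = true;
    Entry e;
    while (std::fscanf(f, "%d %f", &e.layer, &e.direction) == 2) {
        out.push_back(e);  // the result is allowed to grow; multiple directions per layer are kept
    }
    if (!std::feof(f)) {
        ok = false;        // malformed input: stop reading instead of pushing past the error
    }
    std::fclose(f);        // single cleanup point on both success and failure paths, so nothing leaks
    return ok;
}

int main() {
    std::vector<Entry> entries;
    const bool ok = load_all("directions.txt", entries);  // hypothetical input file
    std::printf("loaded %zu entries, ok=%d\n", entries.size(), ok);
    return ok ? 0 : 1;
}
```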
Flake lock file updates:
* Updated input 'nixpkgs':
  'github:NixOS/nixpkgs/e9ee548d90ff586a6471b4ae80ae9cfcbceb3420?narHash=sha256-4Zu0RYRcAY/VWuu6awwq4opuiD//ahpc2aFHg2CWqFY%3D' (2024-06-13)
  → 'github:NixOS/nixpkgs/d603719ec6e294f034936c0d0dc06f689d91b6c3?narHash=sha256-k3JqJrkdoYwE3fHE6xGDY676AYmyh4U2Zw%2B0Bwe5DLU%3D' (2024-06-20)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Philip Taron <[email protected]>
* add chatml fallback for cpp `llama_chat_apply_template`
* remove redundant code
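The fallback amounts to: try the model's own chat template first and, if it is not recognized, format the conversation as ChatML rather than failing. A minimal sketch under assumed names (`chat_message`, `apply_template`, and `format_chat` are invented here and do not reproduce the real llama.cpp API):

```cpp
#include <string>
#include <vector>

// Hypothetical message type for this sketch only.
struct chat_message { std::string role; std::string content; };

// ChatML formatting, used both when requested and as the fallback.
static std::string format_chatml(const std::vector<chat_message> & msgs) {
    std::string out;
    for (const auto & m : msgs) {
        out += "<|im_start|>" + m.role + "\n" + m.content + "<|im_end|>\n";
    }
    out += "<|im_start|>assistant\n";
    return out;
}

// Returns false when the template string is not recognized.
static bool apply_template(const std::string & tmpl, const std::vector<chat_message> & msgs, std::string & out) {
    if (tmpl.find("<|im_start|>") != std::string::npos) {
        out = format_chatml(msgs);
        return true;
    }
    return false;  // unknown template family
}

static std::string format_chat(const std::string & model_template, const std::vector<chat_message> & msgs) {
    std::string out;
    if (apply_template(model_template, msgs, out)) {
        return out;
    }
    return format_chatml(msgs);  // fallback: unknown template, use ChatML instead of erroring out
}

int main() {
    const std::vector<chat_message> msgs = { {"user", "Hello"} };
    // An unrecognized template string triggers the ChatML fallback.
    return format_chat("{{ some unknown jinja template }}", msgs).empty() ? 1 : 0;
}
```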
* cmake : fix deprecated option names not working
* remove LLAMA_OPENMP
* CI: fix release build (Ubuntu)
  PR ggerganov#8006 changes defaults to build shared libs. However, CI for releases expects static builds.
* CI: fix release build (Mac)

Co-authored-by: loonerin <[email protected]>
…perties (ggerganov#8132)
* json: update grammars/README
* mention broken prefixItems
* add mention to llama-gbnf-validator
* json: explicit type: object for nested items object in cli example
* Inference support for Gemma 2 model family
* Update convert-hf-to-gguf.py, constants, and tensor mappings
* cleanup
* format fix
* Fix special token vocab bug
* Don't add space prefix
* fix deleted lines
* Update src/llama.cpp
  Co-authored-by: slaren <[email protected]>
* Add model type names
* Add control vector
* Fix model type identification

Co-authored-by: Andrei Betlen <[email protected]>
Co-authored-by: slaren <[email protected]>
…rn escapes (ggerganov#8180)
* json: expand ESCAPED_IN_REGEXPS_BUT_NOT_IN_LITERALS charset
* json: revert default of additionalProperties to false
* Update README.md
* add --spm-infill option
* support --spm-infill
* support --spm-infill
…emplate_internal` (ggerganov#8172)
* tmp_contains
* minicpm chat template
* add DeepSeek Lite template
* change deepseek-lite to deepseek2
* correct code comment
* correct code from master branch
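The `tmp_contains` idea boils down to classifying the raw Jinja template string by substring matching; a minimal sketch (the enum, function names, and marker strings below are illustrative rather than copies of the actual detection code):

```cpp
#include <cassert>
#include <string>

enum class chat_template_kind { CHATML, DEEPSEEK2, MINICPM, UNKNOWN };

// Classify a raw Jinja chat template by looking for characteristic substrings.
static chat_template_kind detect_template(const std::string & tmpl) {
    const auto tmpl_contains = [&](const char * needle) {
        return tmpl.find(needle) != std::string::npos;
    };
    if (tmpl_contains("<|im_start|>")) {
        return chat_template_kind::CHATML;
    }
    if (tmpl_contains("User: ") && tmpl_contains("Assistant: ")) {
        return chat_template_kind::DEEPSEEK2;  // illustrative markers, not the real ones
    }
    if (tmpl_contains("<用户>")) {
        return chat_template_kind::MINICPM;    // illustrative marker, not necessarily the real one
    }
    return chat_template_kind::UNKNOWN;
}

int main() {
    assert(detect_template("{% ... %}<|im_start|>user") == chat_template_kind::CHATML);
    assert(detect_template("something else entirely")   == chat_template_kind::UNKNOWN);
    return 0;
}
```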
github-actions bot added the following labels on Jun 30, 2024: documentation (Improvements or additions to documentation), Nvidia GPU, testing, examples, python, server, ggml, devops, SYCL, Vulkan, build, android, Kompute, script, Apple Metal, nix.