b3267 #200

Merged

merged 23 commits into Nexesenex:spacestream on Jun 30, 2024
Conversation

Nexesenex
Owner

danbev and others added 23 commits June 27, 2024 01:50
* clip : suppress unused variable warnings

This commit suppresses unused variable warnings for the variables e in
the catch blocks.

The motivation for this change is to suppress the warnings that are
generated on Windows when using the MSVC compiler. The warnings are
not displayed when using GCC because GCC will mark all catch parameters
as used.

Signed-off-by: Daniel Bevenius <[email protected]>

* squash! clip : suppress unused variable warnings

Remove e (/*e*/) instead of using GGML_UNUSED.

---------

Signed-off-by: Daniel Bevenius <[email protected]>
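As a minimal sketch of the pattern this commit describes (the helper below is hypothetical, not the actual clip.cpp code), commenting out the catch parameter name avoids MSVC's unused-variable warning without needing GGML_UNUSED:

```cpp
#include <stdexcept>

// Illustrative helper: the exception is caught by type only, with the name
// commented out, so MSVC does not warn about an unused local while the catch
// block still runs.
static bool try_parse(const char * text) {
    try {
        if (text == nullptr) {
            throw std::invalid_argument("text is null");
        }
        return true;
    } catch (const std::exception & /*e*/) {
        // intentionally unnamed; nothing here needs the exception object
        return false;
    }
}
```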
- The path to the common.h header in llama-android.cpp seems to be wrong. Fix the path so the Android build doesn't fail with the error "There is no file common/common.h"
* account for space prefix character

* use find instead
* Add Qwen2MoE 57B-A14B

* Add Qwen2MoE 57B-A14B
* Delete examples/llama.android/llama/CMakeLists.txt

#8145 (comment)

This file is not being used for building on Android. `llama.cpp/examples/llama.android/llama/src/main/cpp/CMakeLists.txt` is being used instead.

* Update CMakeLists.txt

Pick local llama.cpp files instead of fetching content from git
* Fixed leak in llama_control_vector_load_one() and allow llama_control_vector_load() to grow

* refactored `llama_control_vector_load_one()`

* allow multiple directions for same layer in same file

* llama_control_vector_load_one() and llama_control_vector_load() now break on error

* removed unnecessary ggml_free() call
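A rough sketch of the break-on-error shape described above; the struct and helper names are illustrative placeholders, not the real llama.cpp control-vector loaders:

```cpp
#include <string>
#include <vector>

// Illustrative types only; the real code works with gguf/ggml contexts.
struct cvec_data {
    int n_embd = -1;              // -1 signals "nothing loaded / error"
    std::vector<float> data;
};

// Hypothetical per-file loader, standing in for llama_control_vector_load_one().
cvec_data cvec_load_one(const std::string & path);

// Merge several files, stopping at the first failure instead of continuing
// with a partially built result.
static cvec_data cvec_load_all(const std::vector<std::string> & paths) {
    cvec_data result;
    for (const auto & path : paths) {
        cvec_data cur = cvec_load_one(path);
        if (cur.n_embd == -1) {
            result.n_embd = -1;   // propagate the error to the caller
            break;                // break on error, as in the commit above
        }
        // ... grow/accumulate `cur` into `result` here (omitted)
    }
    return result;
}
```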
Flake lock file updates:

• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/e9ee548d90ff586a6471b4ae80ae9cfcbceb3420?narHash=sha256-4Zu0RYRcAY/VWuu6awwq4opuiD//ahpc2aFHg2CWqFY%3D' (2024-06-13)
  → 'github:NixOS/nixpkgs/d603719ec6e294f034936c0d0dc06f689d91b6c3?narHash=sha256-k3JqJrkdoYwE3fHE6xGDY676AYmyh4U2Zw%2B0Bwe5DLU%3D' (2024-06-20)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Philip Taron <[email protected]>
* add chatml fallback for cpp `llama_chat_apply_template`

* remove redundant code
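A hedged sketch of the chatml fallback idea, assuming the llama_chat_apply_template signature of this period (model, template string, messages, count, add_assistant, buffer, length); the wrapper name and buffer handling are illustrative rather than the exact common.cpp code:

```cpp
#include <algorithm>
#include <string>
#include <vector>

#include "llama.h"

// If the model's embedded template is not recognized (negative return value),
// retry with the built-in "chatml" template so callers always get a prompt.
static std::string apply_template_with_chatml_fallback(
        const llama_model * model,
        const std::vector<llama_chat_message> & msgs) {
    std::vector<char> buf(8192);
    // try the model's own template first (tmpl == nullptr)
    int32_t res = llama_chat_apply_template(model, nullptr, msgs.data(), msgs.size(),
                                            /*add_ass=*/true, buf.data(), (int32_t) buf.size());
    if (res < 0) {
        // unknown/unsupported template -> fall back to chatml
        res = llama_chat_apply_template(model, "chatml", msgs.data(), msgs.size(),
                                        /*add_ass=*/true, buf.data(), (int32_t) buf.size());
    }
    if (res < 0) {
        return "";  // still unsupported; give up
    }
    // a full implementation would resize `buf` and re-apply if res exceeds it
    return std::string(buf.data(), std::min<size_t>((size_t) res, buf.size()));
}
```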
* cmake : fix deprecated option names not working

* remove LLAMA_OPENMP
* CI: fix release build (Ubuntu)

PR #8006 changes defaults to build shared libs. However, CI for releases
expects static builds.

* CI: fix release build (Mac)

---------

Co-authored-by: loonerin <[email protected]>
…perties (#8132)

* json: update grammars/README

* mention broken prefixItems

* add mention to llama-gbnf-validator

* json: explicit type: object for nested items object in cli example
* Inference support for Gemma 2 model family

* Update convert-hf-to-gguf.py, constants, and tensor mappings

* cleanup

* format fix

* Fix special token vocab bug

* Don't add space prefix

* fix deleted lines

* Update src/llama.cpp

Co-authored-by: slaren <[email protected]>

* Add model type names

* Add control vector

* Fix model type identification

---------

Co-authored-by: Andrei Betlen <[email protected]>
Co-authored-by: slaren <[email protected]>
…rn escapes (#8180)

* json: expand ESCAPED_IN_REGEXPS_BUT_NOT_IN_LITERALS charset

* json: revert default of additionalProperties to false

* Update README.md
* add --spm-infill option

* support --spm-infill

* support --spm-infill
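As an illustration of what an SPM-style infill layout changes relative to the default prefix-first layout (a sketch only; the helper and token variables are placeholders, not the actual infill/server code, and the exact ordering should be checked against the commit):

```cpp
#include <vector>

// Build an infill prompt in either the default PSM order or the SPM order.
static std::vector<int> build_infill_prompt(
        int tok_pre, int tok_suf, int tok_mid,
        const std::vector<int> & prefix, const std::vector<int> & suffix,
        bool spm_infill) {
    std::vector<int> out;
    if (spm_infill) {
        // SPM order: <SUF> suffix <PRE> prefix <MID>
        out.push_back(tok_suf); out.insert(out.end(), suffix.begin(), suffix.end());
        out.push_back(tok_pre); out.insert(out.end(), prefix.begin(), prefix.end());
    } else {
        // default PSM order: <PRE> prefix <SUF> suffix <MID>
        out.push_back(tok_pre); out.insert(out.end(), prefix.begin(), prefix.end());
        out.push_back(tok_suf); out.insert(out.end(), suffix.begin(), suffix.end());
    }
    out.push_back(tok_mid);
    return out;
}
```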
…emplate_internal` (#8172)

* tmp_contains

* minicpm chat template

* add DeepSeek Lite template

* change deepseek-lite to deepseek2

* correct code comment

* correct code from master branch
…tor to Gemma2 (#8197)

* Add attention and final logit softcapping.

* fix

* Add custom add_ functions

* Disable flash attention for Gemma2

* Update src/llama.cpp

Co-authored-by: slaren <[email protected]>

* Add default value for attention and final logit softcap value

* Add custom kq scaling from Gemma2Attention

* Remove custom pre attention scaling and use computed value instead.

---------

Co-authored-by: slaren <[email protected]>
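For reference, a scalar sketch of the soft-capping math this commit wires into the Gemma 2 graph (in the real graph it is expressed with ggml scale/tanh ops; the cap values below are the commonly cited Gemma 2 defaults and should be read as assumptions, not verified against this exact commit):

```cpp
#include <cmath>

// Soft-capping squashes a logit into (-cap, cap) smoothly instead of clipping:
//   capped = cap * tanh(x / cap)
static float softcap(float x, float cap) {
    return cap * std::tanh(x / cap);
}

// Gemma 2 uses separate caps for attention logits and final output logits.
static float softcap_attn(float x)  { return softcap(x, 50.0f); }
static float softcap_final(float x) { return softcap(x, 30.0f); }
```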
…x/suffix is set (#8203)

* preserve new line llama_chat_format_single

* disable chat template if in-prefix/suffix is set

* remove redundant change
Nexesenex merged commit 69ee8e2 into Nexesenex:spacestream on Jun 30, 2024
48 of 56 checks passed