Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

b3284 #204

Merged
merged 25 commits into from
Jul 2, 2024
Merged

b3284 #204

merged 25 commits into from
Jul 2, 2024

Conversation

Nexesenex
Copy link
Owner

ochafik and others added 25 commits June 28, 2024 09:26
…rn escapes (#8180)

* json: expand ESCAPED_IN_REGEXPS_BUT_NOT_IN_LITERALS charset

* json: revert default of additionalProperties to false

* Update README.md
* add --spm-infill option

* support --spm-infill

* support --spm-infill
…emplate_internal` (#8172)

* tmp_contains

* minicpm chat template

* add DeepSeek Lite template

* change deepseek-lite to deepseek2

* correct code comment

* correct code from master branch
…tor to Gemma2 (#8197)

* Add attention and final logit softcapping.

* fix

* Add custom add_ functions

* Disable flash attention for Gemma2

* Update src/llama.cpp

Co-authored-by: slaren <[email protected]>

* Add default value for attention and final logit softcap value

* Add custom kq scaling from Gemma2Attention

* Remove custom pre attention scaling and use computed value instead.

---------

Co-authored-by: slaren <[email protected]>
…x/suffix is set (#8203)

* preserve new line llama_chat_format_single

* disable chat template if in-prefix/suffix is set

* remove redundant change
* align with rope.cu and move sycl-op to a single file
* Update README.md

document BERT support

* Update README.md
* nix : remove OpenCL remnants

* minor : remove parentheses
Co-authored-by: Georgi Gerganov <[email protected]>
* Added gppm to Tool list in README

* Update README.md

---------

Co-authored-by: Georgi Gerganov <[email protected]>
* gemma2: add sliding window mask

* fix data_swa uninitialized

* better naming

* add co-author

Co-authored-by: Arlo Phoenix <[email protected]>

* replace list with single tensor

* update

* llama : minor styling

* convert : add sanity check for query_pre_attn_scalar

* fix small typo in README

---------

Co-authored-by: Arlo Phoenix <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>
* CUDA: refactor and optimize IQ MMVQ

* uint -> uint32_t

* __dp4a -> ggml_cuda_dp4a

* remove MIN_CC_DP4A checks

* change default

* try CI fix
* fix gemma2 tokenizer convert

* remove scores

* improve code, fix new line issue
* use warp_size macro for all sycl kernels

* fix mask of permute_sub_group_by_xor

* fix rms_norm with correct warp number

* fix rms_norm_f32/group_norm_f32

* move norm to norm.cpp file

* fix quantize bug

* fix mmvq's batch size
* fix win build conflict of math library

* fix the condition: !(win32 & SYCL)

* revert warp_size=16
* convert-hf : print output file name when completed

This commit adds the output file name to the log message when the
conversion is completed.

The motivation for this change is that when `--outfile` option is not
specified it migth not be obvious where the output file is written.

With this change the output of running the script will be something like
the following:
```console
INFO:hf-to-gguf:Model successfully exported to models/gemma-2-9b-it.gguf.
```

Signed-off-by: Daniel Bevenius <[email protected]>

* squash! convert-hf : print output file name when completed

Updates the output of to support printing the directory if the output is
split into multiple files. Also the output file name is now retrieved
from the model_instance object.

Signed-off-by: Daniel Bevenius <[email protected]>

* squash! convert-hf : print output file name when completed

Use parent attribute of Path object and string interpolation.

Signed-off-by: Daniel Bevenius <[email protected]>

* squash! convert-hf : print output file name when completed

Use os.sep instead of hardcoding the path separator.

Signed-off-by: Daniel Bevenius <[email protected]>

---------

Signed-off-by: Daniel Bevenius <[email protected]>
* Add `JAIS` model(s)

* cleanup

* address review comments

* remove hack

* un-hardcode max-alibi-bias

* minor tweaks

---------

Co-authored-by: fmz <[email protected]>
@Nexesenex Nexesenex merged commit 8385690 into Nexesenex:marstream Jul 2, 2024
12 of 15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.