Chameleon #246

Nexesenex · 2024-07-18T13:09:34Z

No description provided.

* lora: load to device buft
* add patch tensor function
* correct tensor patch
* llama_lora_adapter_apply
* correct ggml_backend_tensor_copy
* add llm_build_mm
* fix auto merge
* update based on review comments
* add convert script
* no more transpose A
* add f16 convert
* add metadata check
* add sanity check
* fix ftype
* add requirements
* fix requirements
* fix outfile
* conversion: only allow selected models
* fix types
* cuda : do not use dmmv if the tensor does not have enough cols
* llama : lora fixes
* do not disable mmap with lora
  Co-authored-by: slaren <[email protected]>
* llm_build_lora_mm_id
* convert_lora : MoE LoRA conversion support
* convert_lora : prefer safetensors, similarly to convert_hf
* convert_hf : simplify modify_tensors for InternLM2
* convert_lora : lazy conversion
* llama : load and use alpha from LoRA adapters
* llama : use llm_build_lora_mm in most model graphs
* auto scale
* Revert "auto scale"
  This reverts commit 42415a4.
* remove redundant params
* Apply suggestions from code review
  Co-authored-by: slaren <[email protected]>
* change kv metadata
* move add_type to __init__
* convert_hf : move add_type to main()
* convert_lora : use the GGUFWriter from Model instead of overwriting it

---------
Co-authored-by: slaren <[email protected]>
Co-authored-by: Francis Couture-Harpin <[email protected]>

* convert_hf : faster lazy safetensors
  This makes '--dry-run' much, much faster.
* convert_hf : fix memory leak in lazy MoE conversion
  The '_lazy' queue was sometimes self-referential, which caused reference cycles of objects old enough to avoid garbage collection until potential memory exhaustion.
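The kind of leak described in the lazy MoE conversion fix can be illustrated with a small Python sketch. This is not the convert_hf code: the `LazyNode` class, its `queue` attribute, and the helper functions are hypothetical names chosen for illustration. The sketch only shows that an object whose queue refers back to itself is not freed by reference counting alone and stays alive until the cyclic garbage collector runs.

```python
import gc
import weakref
from collections import deque


class LazyNode:
    """Hypothetical stand-in for a lazily evaluated tensor wrapper."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()  # pending work; may end up referring to the node itself


def make_cyclic():
    # The node's queue holds the node itself: node -> queue -> node.
    node = LazyNode("cyclic")
    node.queue.append(node)
    return weakref.ref(node)


def make_acyclic():
    # No self-reference: the reference count drops to zero on return.
    node = LazyNode("acyclic")
    return weakref.ref(node)


def demo():
    """Return (cyclic alive before collect, acyclic alive, cyclic alive after collect)."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collections from interfering with the observation
    try:
        cyclic = make_cyclic()
        acyclic = make_acyclic()
        cyclic_alive_before = cyclic() is not None
        acyclic_alive = acyclic() is not None
        gc.collect()  # only the cycle collector can reclaim the self-referential node
        cyclic_alive_after = cyclic() is not None
    finally:
        if was_enabled:
            gc.enable()
    return cyclic_alive_before, acyclic_alive, cyclic_alive_after


if __name__ == "__main__":
    print(demo())
```

Under CPython's reference counting, the self-referential node should survive until `gc.collect()` runs, while the acyclic one is released as soon as its last reference goes away. Avoiding the self-reference, which is what the commit message describes, lets large objects be freed immediately instead of waiting for a collection of an older generation.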

The --help option on export-lora isn't accepted as valid. The help still gets displayed by default, but the script exits with an error message and nonzero status.

…nov#8491)

* Update clib.json to point to Cyan4973 original xxhash
  Convinced Cyan4973 to add clib.json directly to his repo, so the clib package can now point directly to it. It previously pointed to my fork, which carried the clib.json package metadata. Cyan4973/xxHash#954
* gguf-hash: readme update to point to Cyan4973 xxHash repo [no ci]

* [CANN] Add Ascend NPU backend
  Ascend is a full-stack AI computing infrastructure for industry applications and services based on Huawei Ascend processors and software. CANN (Compute Architecture of Neural Networks), developed by Huawei, is a heterogeneous computing architecture for AI.
  Co-authored-by: wangshuai09 <[email protected]>
* delete trailing whitespaces
* Modify the code based on review comment
* Rename LLAMA_CANN to GGML_CANN
* Make ggml-common.h private
* add ggml_cann prefix for acl funcs
* Add logging for CANN backend
* Delete Trailing whitespace

---------
Co-authored-by: wangshuai09 <[email protected]>

nopperl and others added 28 commits July 15, 2024 12:12

convert chameleon hf to gguf

385c1a8

add chameleon tokenizer tests

568110a

fix lint

fc09437

implement chameleon graph

0453f7d

fix ci (ggerganov#8494)

4db8f60

add swin norm param

654b1b3

llama : valign + remove unused ftype (ggerganov#8502)

0efec57

export-lora : handle help argument (ggerganov#8497)

37b12f9

The --help option on export-lora isn't accepted as valid. The help still gets displayed by default, but the script exits with an error message and nonzero status.

make/cmake: add missing force MMQ/cuBLAS for HIP (ggerganov#8515)

5e116e8

llama : disable context-shift for DeepSeek v2 (ggerganov#8501)

d65a836

batched: fix n_predict parameter (ggerganov#8527)

da3913d

return qk norm weights and biases to original format

c460d5c

implement swin norm

3d3523e

suppress image token output

758612a

CONTRIBUTING.md : remove mention of noci (ggerganov#8541)

30f80ca

rem tabs

90766e1

build : Fix docker build warnings (ggerganov#8535) (ggerganov#8537)

b328344

lookup: fibonacci hashing, fix crashes (ggerganov#8548)

e02b597

Merge branch 'master' into chameleon

b16c09a

add comment to conversion

126201d

fix ci

da5e356

check for k norm separately

fa568f6

adapt to new lora implementation

f40cd20

fix layer input for swin norm

15260c5

Nexesenex merged commit b568ee8 into Nexesenex:lcpp_pr_chameleon Jul 18, 2024
8 of 11 checks passed

github-actions bot added the testing label Jul 18, 2024

github-actions bot added examples python server ggml devops build labels Jul 18, 2024
