Add cvector-generator example #7514
Merged: ngxson merged 58 commits into ggerganov:master from ngxson:xsn/control-vector-generator, Jun 15, 2024
Commits:
0a46d73 add control-vector-generator (ngxson)
c31c118 calc diff (ngxson)
b30bea3 add comments (ngxson)
73747fe proof-of-concept stdlib implementation (christianazinn)
f58f6af param parsing, refactor, comments (christianazinn)
dc46264 example template completions (christianazinn)
447023f add multi prompts, multi-thread for PCA (ngxson)
287da25 fix mem error (ngxson)
d446c6d add debugs (ngxson)
31f153f fix matrix transpose multiplication (christianazinn)
fa85ba6 preliminary template/multiprompt support (christianazinn)
4d88cd1 fix zero output & param parsing, functional templating (christianazinn)
4d7d71b fix square_diff matmul index range and CRLF->LF line endings (christianazinn)
6256036 add command-line args for num threads, num completions file lines, al… (christianazinn)
db3ba10 code aestheticization (christianazinn)
86842b2 fix compiler warnings (christianazinn)
5442688 in-series multithreading for prompt embedding? (christianazinn)
3090c48 remove unnecessary multithreading (christianazinn)
df623ff interim fix memory leak (christianazinn)
0e1f973 translated everything but PCA (I think) (christianazinn)
b67ea65 tentatively translate the rest (christianazinn)
a23c72e fix ggml errors and make new ones (christianazinn)
15d5c25 fix cb_eval (ngxson)
07dba13 temporary commit while I move dev environments (christianazinn)
23fd1b5 update debug statements (christianazinn)
3815a0c pre-tokenize so we can allocate correct memory to ctx_diffs_wrapped (christianazinn)
a42e783 update comments (christianazinn)
a710df7 (wip) refactor (ngxson)
c241b50 clean up PCA ggml implementation (ngxson)
6a5adf3 fix shape of v_diff_original (ngxson)
9e39571 add n_batch for pca (ngxson)
1a088fb working version (ngxson)
1639168 remember to copy back the last_eigenvector (ngxson)
446da90 fix n_completions (christianazinn)
d41c719 bring back n_completions (ngxson)
3223133 default n_pca_batch to 20 (ngxson)
da6babd fix macos build (ngxson)
85db22d Merge branch 'master' into xsn/control-vector-generator (ngxson)
54f77e2 add to makefile all targets (ngxson)
04c91d2 use ggml_format_name (ngxson)
5ffba9e add readme (ngxson)
e9cb3b3 fix .editorconfig (ngxson)
7297817 use ggml_backend_tensor_copy (ngxson)
e683b9a attemp to fix compile problem on mac (ngxson)
8ee0c96 fix compile warn (ngxson)
f54cb8e reuse allocr (ngxson)
679f513 move param parser to common (ngxson)
a2a5f1b better error handling (ngxson)
b22c845 clean up a bit (ngxson)
c59bfa6 add print_usage (ngxson)
334dbae shorten help msg (ngxson)
25fb0a6 beautify help msg (ngxson)
ca86d4f escape prompt by default (ngxson)
2f05558 Merge branch 'master' into xsn/control-vector-generator (ngxson)
64cad20 change compile target to llama-cvector-generator (ngxson)
91f7dbf typo (ngxson)
f99be2c disable GPU for PCA (ngxson)
6d2464a code style (ngxson)
@@ -0,0 +1,5 @@
set(TARGET llama-cvector-generator)
add_executable(${TARGET} cvector-generator.cpp pca.hpp)
install(TARGETS ${TARGET} RUNTIME)
target_link_libraries(${TARGET} PRIVATE common llama ${CMAKE_THREAD_LIBS_INIT})
target_compile_features(${TARGET} PRIVATE cxx_std_11)
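
The target above builds `cvector-generator.cpp` together with `pca.hpp`, and the commit history refers to PCA iterations and a `last_eigenvector`. As a rough illustration of that idea only, here is a minimal sketch of power iteration that finds the dominant direction of a set of difference vectors. It is not the PR's ggml implementation: the function name `dominant_direction`, the plain `std::vector` layout, the fixed start vector, and the absence of mean-centering are all assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not taken from the PR): estimate the dominant
// eigenvector of D^T D by power iteration, where each row of `diffs`
// is one difference vector. D^T D is never formed explicitly.
std::vector<float> dominant_direction(const std::vector<std::vector<float>> & diffs, int n_iter) {
    if (diffs.empty() || diffs[0].empty()) {
        return {};
    }
    const std::size_t n_embd = diffs[0].size();

    // Deterministic unit-length start vector (an assumption of this sketch).
    std::vector<float> v(n_embd, 1.0f / std::sqrt(static_cast<float>(n_embd)));

    for (int it = 0; it < n_iter; ++it) {
        std::vector<float> next(n_embd, 0.0f);

        // next = D^T (D v), accumulated row by row.
        for (const auto & row : diffs) {
            assert(row.size() == n_embd); // all rows must have the same length
            float proj = 0.0f;
            for (std::size_t i = 0; i < n_embd; ++i) {
                proj += row[i] * v[i];
            }
            for (std::size_t i = 0; i < n_embd; ++i) {
                next[i] += proj * row[i];
            }
        }

        // Normalize; stop if the product vanished (all-zero data, or a
        // start vector orthogonal to every row).
        float norm = 0.0f;
        for (float x : next) {
            norm += x * x;
        }
        norm = std::sqrt(norm);
        if (norm == 0.0f) {
            break;
        }
        for (float & x : next) {
            x /= norm;
        }
        v = next;
    }
    return v;
}
```

The sign of the returned vector is arbitrary, and the number of iterations needed depends on how well separated the largest eigenvalue is, which is presumably why the example exposes an iteration-count option.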
@@ -0,0 +1,34 @@
# cvector-generator

This example demonstrates how to generate a control vector using gguf models.

Related PRs:
- [Add support for control vectors](https://github.com/ggerganov/llama.cpp/pull/5970)
- (Issue) [Generate control vector using llama.cpp](https://github.com/ggerganov/llama.cpp/issues/6880)
- [Add cvector-generator example](https://github.com/ggerganov/llama.cpp/pull/7514)

## Examples

```sh
# CPU only
./cvector-generator -m ./dolphin-2.0-mistral-7b.Q4_K_M.gguf

# With GPU
./cvector-generator -m ./dolphin-2.0-mistral-7b.Q4_K_M.gguf -ngl 99

# With advanced options
./cvector-generator -m ./dolphin-2.0-mistral-7b.Q4_K_M.gguf -ngl 99 --completions 128 --pca-iter 2000 --batch-pca 100

# To see help message
./cvector-generator -h
# Then, have a look at "cvector" section
```

## Tips and tricks

If you have multiple lines per prompt, you can escape the newline character (change it to `\n`). For example:

```
<|im_start|>system\nAct like a person who is extremely happy.<|im_end|>
<|im_start|>system\nYou are in a very good mood today<|im_end|>
```
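
The newline-escaping tip in the README above can be sketched as follows. This is an illustrative example, not the code from this PR: the helper names `unescape_newlines` and `escape_newlines` are hypothetical, and only the `\n` sequence is handled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not taken from the PR): turn the two-character
// sequence backslash + 'n' into a real newline, leaving all other
// characters untouched.
std::string unescape_newlines(const std::string & line) {
    std::string out;
    out.reserve(line.size());
    for (std::size_t i = 0; i < line.size(); ++i) {
        if (line[i] == '\\' && i + 1 < line.size() && line[i + 1] == 'n') {
            out += '\n';
            ++i; // skip the 'n' that was just consumed
        } else {
            out += line[i];
        }
    }
    return out;
}

// Inverse direction: write a multi-line prompt on a single line so that
// it fits a one-prompt-per-line file.
std::string escape_newlines(const std::string & text) {
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        if (c == '\n') {
            out += "\\n";
        } else {
            out += c;
        }
    }
    return out;
}
```

With these helpers, a prompt file can keep one prompt per line while each prompt still expands to several lines before tokenization. The round trip is not lossless if a prompt already contains a literal backslash followed by `n`.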
@calvin-laurenson I ended up enabling newline escaping by default, which should be more convenient for most users.