llama.cpp Discussions #23
Replies: 3 comments 2 replies
-
Just FYI, llama.cpp received a big bug fix in the last hour that significantly improves tokens-per-second performance at large context lengths 🔥
-
What version of llama.cpp is this Python binding using? On my machine, the model loads in 3 seconds using llama.cpp directly, but 40 seconds using this Python binding.
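For comparison, here is a quick sketch that times only the model load through the binding (the `Llama` class name and the model path below are assumptions about the high-level API and may differ by version):

```python
import time

from llama_cpp import Llama  # assumed high-level class of this binding

start = time.perf_counter()
# Hypothetical model path -- substitute your own ggml model file.
llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin")
print(f"Model load took {time.perf_counter() - start:.1f}s")
```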
-
Any way to bring "-n N, --n_predict N: number of tokens to predict" to the Python interface? Without it, generation stops after a fixed maximum number of tokens rather than returning the full content. It defaults to 128, but it should be possible to set it to -1 to get the full output.
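A minimal sketch of what this could look like through the binding's high-level API, assuming a `max_tokens` argument that plays the role of `--n_predict` (the exact parameter name and the "-1 means unlimited" behavior are assumptions, not confirmed against the current API):

```python
from llama_cpp import Llama

# Assumed constructor/call signature; details may differ between versions.
llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin")

output = llm(
    "Q: Name the planets in the solar system. A:",
    max_tokens=-1,  # assumption: -1 = no fixed limit, mirroring --n_predict -1
    stop=["Q:"],
)
print(output["choices"][0]["text"])
```

With the default limit (128 at the time of this discussion), long answers get truncated; lifting the limit lets the model run until it emits an end-of-sequence token or hits the context window.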
-
I created this as a place to discuss llama.cpp-specific info that doesn't directly require action for this repo, but is still related :)

Performance-wise, there's some cool stuff being investigated related to generation speed as the context grows. It looks like there's a significant drop in tokens/s that was introduced in the last couple of weeks. If they can find the cause, generation should be much faster :D
ggerganov/llama.cpp#603
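For anyone who wants to check whether they're hit by this from the Python side, a rough sketch that times generation while the context grows (the model path, `n_ctx`, and the OpenAI-style `usage` field are assumptions about the binding's high-level API and may differ by version):

```python
import time

from llama_cpp import Llama  # assumed high-level API

# Hypothetical model path and context size, purely for illustration.
llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin", n_ctx=2048)

prompt = "The quick brown fox"
for step in range(4):
    start = time.perf_counter()
    out = llm(prompt, max_tokens=64)
    elapsed = time.perf_counter() - start

    completion = out["choices"][0]["text"]
    generated = out["usage"]["completion_tokens"]  # assumes OpenAI-style usage field
    prompt += completion  # grow the context for the next round

    # Rough throughput: includes prompt processing, so it understates pure generation speed.
    print(f"step {step}: prompt ~{len(llm.tokenize(prompt.encode()))} tokens, "
          f"{generated / elapsed:.1f} tokens/s")
```

If the upstream regression is present, the tokens/s figure should drop noticeably as the prompt grows.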