b3579 #290

Merged
Nexesenex merged 10 commits into Nexesenex:spacestream on Aug 12, 2024

Nexesenex
Owner

No description provided.

compilade and others added 10 commits August 11, 2024 14:45
* gguf-py : Numpy dequantization for most types

* gguf-py : Numpy dequantization for grid-based i-quants
* py : fix requirements check '==' -> '~='

* cont : fix the fix

* ci : run on all requirements.txt
* readme: introduce gpustack

GPUStack is an open-source GPU cluster manager for running large
language models, which uses llama.cpp as the backend.

Signed-off-by: thxCode <[email protected]>

* readme: introduce gguf-parser

GGUF Parser is a tool to review/check the GGUF file and estimate the
memory usage without downloading the whole model.

Signed-off-by: thxCode <[email protected]>

---------

Signed-off-by: thxCode <[email protected]>
* llama : model-based max number of graph nodes calculation

* Update src/llama.cpp

---------

Co-authored-by: slaren <[email protected]>
@Nexesenex Nexesenex merged commit 8408090 into Nexesenex:spacestream Aug 12, 2024
23 of 31 checks passed
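
One of the commits listed above changes the requirements check from '==' to '~='. In Python's version specifier scheme, '==' pins a single exact version, while '~=' (the "compatible release" operator) accepts any later release that shares the same leading segments. The sketch below illustrates that difference only; it is not the script touched by the commit. The helper names and the version numbers are hypothetical, and it handles plain dotted numeric versions only (no pre-releases, epochs, or local versions).

```python
from typing import Tuple


def _parse(version: str) -> Tuple[int, ...]:
    """Turn '1.24.4' into (1, 24, 4). Numeric segments only (assumption)."""
    return tuple(int(part) for part in version.split("."))


def _pad(release: Tuple[int, ...], length: int) -> Tuple[int, ...]:
    """Zero-pad a release tuple so that 1.4 compares equal to 1.4.0."""
    return release + (0,) * (length - len(release))


def matches_exact(candidate: str, pinned: str) -> bool:
    """'==' semantics: the candidate must be the pinned version."""
    cand, base = _parse(candidate), _parse(pinned)
    length = max(len(cand), len(base))
    return _pad(cand, length) == _pad(base, length)


def matches_compatible(candidate: str, pinned: str) -> bool:
    """'~=' semantics: '~= X.Y.Z' means '>= X.Y.Z' and '== X.Y.*'."""
    base = _parse(pinned)
    if len(base) < 2:
        raise ValueError("'~=' needs at least two release segments")
    cand = _parse(candidate)
    length = max(len(cand), len(base))
    cand_padded, base_padded = _pad(cand, length), _pad(base, length)
    prefix = base[:-1]  # every segment except the last must match
    return cand_padded >= base_padded and cand_padded[: len(prefix)] == prefix


if __name__ == "__main__":
    # Hypothetical pin, used only to show the two behaviours side by side.
    for installed in ("1.24.3", "1.24.4", "1.24.9", "1.25.0"):
        print(
            installed,
            "== 1.24.4:", matches_exact(installed, "1.24.4"),
            "| ~= 1.24.4:", matches_compatible(installed, "1.24.4"),
        )
```

With '~=', a patch-level update of a dependency still satisfies the requirement, whereas '==' would reject anything but the exact pinned release.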
9 participants