llama : model-based max number of graph nodes calculation (ggerganov#8970)

* llama : model-based max number of graph nodes calculation

* Update src/llama.cpp

---------

Co-authored-by: slaren <[email protected]>
nicoboss and slaren authored Aug 12, 2024
1 parent 84eb2f4 commit 0fd93cd
Showing 1 changed file with 2 additions and 7 deletions.
9 changes: 2 additions & 7 deletions src/llama.cpp
@@ -3575,13 +3575,8 @@ namespace GGUFMeta {
 
 using llama_buf_map = std::unordered_map<uint32_t, ggml_backend_buffer_t>;
 
-// TODO: update when needed or think of some clever automatic way to do this
-static size_t llama_model_max_nodes(const llama_model & /*model*/) {
-    //if (model.arch == LLM_ARCH_LLAMA && model.hparams.n_layer > ??) { // llama-3 405B
-    //    return 32768;
-    //}
-
-    return 8192;
+static size_t llama_model_max_nodes(const llama_model & model) {
+    return std::max<size_t>(8192, model.tensors_by_name.size()*5);
 }
 
 struct llama_model_loader {
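For context, the change replaces the fixed graph-node limit with a heuristic that scales with the model's tensor count while keeping the previous value of 8192 as a floor. Below is a minimal, self-contained sketch of that behaviour; fake_model and the tensor counts used in main() are hypothetical stand-ins for illustration and are not part of the commit.

// Sketch of the heuristic introduced by this commit: the graph-node budget
// grows as 5 nodes per tensor, but never drops below the old fixed 8192.
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for llama_model: only the field the heuristic reads.
struct fake_model {
    std::vector<std::pair<std::string, void *>> tensors_by_name;
};

static size_t llama_model_max_nodes(const fake_model & model) {
    // 5 is the commit's per-tensor scaling factor; 8192 remains the floor,
    // so small models keep the previous budget.
    return std::max<size_t>(8192, model.tensors_by_name.size()*5);
}

int main() {
    fake_model small;
    small.tensors_by_name.resize(291);   // illustrative small-model tensor count
    fake_model large;
    large.tensors_by_name.resize(3000);  // illustrative very-large-model tensor count

    std::printf("small model: %zu graph nodes\n", llama_model_max_nodes(small)); // 8192 (floor wins)
    std::printf("large model: %zu graph nodes\n", llama_model_max_nodes(large)); // 15000 (5 * 3000)
    return 0;
}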
