llama : use reserve/emplace_back in sampler_sample (ggml-org#9534)
This commit updates the llama_sampler_sample function to use reserve and
emplace_back for the vector of llama_token_data structs.

The motivation for this change is to avoid creating n_vocab
default-constructed llama_token_data structs that are then
immediately overwritten.
danbev authored and arthw committed Nov 15, 2024
1 parent 7440115 commit 25a2012
Showing 1 changed file with 3 additions and 2 deletions.
src/llama-sampling.cpp (3 additions, 2 deletions)
@@ -236,9 +236,10 @@ llama_token llama_sampler_sample(struct llama_sampler * smpl, struct llama_conte
     const int n_vocab = llama_n_vocab(llama_get_model(ctx));
 
     // TODO: do not allocate each time
-    std::vector<llama_token_data> cur(n_vocab);
+    std::vector<llama_token_data> cur;
+    cur.reserve(n_vocab);
     for (llama_token token_id = 0; token_id < n_vocab; token_id++) {
-        cur[token_id] = llama_token_data{token_id, logits[token_id], 0.0f};
+        cur.emplace_back(llama_token_data{token_id, logits[token_id], 0.0f});
     }
 
     llama_token_data_array cur_p = {
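For context, here is a minimal standalone sketch of the same pattern. This is not the llama.cpp code: token_data, n_vocab, and the logits array below are toy stand-ins for llama_token_data and the real model state.

    // Toy demonstration of reserve() + emplace_back() versus constructing a
    // sized vector: reserve() allocates capacity up front without
    // default-constructing any elements, so each element is built exactly once.
    #include <cstdio>
    #include <vector>

    struct token_data {   // stand-in for llama_token_data
        int   id;
        float logit;
        float p;
    };

    int main() {
        const float logits[] = {0.1f, 2.5f, -1.0f, 0.7f};
        const int   n_vocab  = 4; // toy vocabulary size

        std::vector<token_data> cur;
        cur.reserve(n_vocab); // one allocation, no default-constructed elements
        for (int token_id = 0; token_id < n_vocab; token_id++) {
            cur.emplace_back(token_data{token_id, logits[token_id], 0.0f});
        }

        for (const auto & td : cur) {
            printf("token %d: logit %.2f\n", td.id, td.logit);
        }
        return 0;
    }

Note that because an already-constructed temporary is passed to it, emplace_back here behaves like push_back; the saving comes from reserve() replacing the n_vocab default constructions that std::vector<llama_token_data> cur(n_vocab) performed.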
