enable parallel prefill again
Differential Revision: D61751873

Pull Request resolved: pytorch#4893
kimishpatel authored Aug 27, 2024
1 parent f92139f commit 395d3f5
Showing 2 changed files with 1 addition and 2 deletions.
2 changes: 1 addition & 1 deletion examples/models/llama2/runner/runner.cpp
@@ -126,7 +126,7 @@ Error Runner::load() {
       tokenizer_.get(),
       text_decoder_runner_.get(),
       metadata_.at(kUseKVCache),
-      enable_parallel_prefill_);
+      metadata_.at(kEnableDynamicShape));
 
   text_token_generator_ = std::make_unique<TextTokenGenerator>(
       tokenizer_.get(),
1 change: 0 additions & 1 deletion examples/models/llama2/runner/runner.h
@@ -45,7 +45,6 @@ class Runner {
 
  private:
   float temperature_;
-  bool enable_parallel_prefill_;
   bool shouldStop_{false};
 
   // model

0 comments on commit 395d3f5