Cache: new Cache format in decoder-only models #31421

Merged: zucchini-nlp merged 44 commits from zucchini-nlp:dynamic_cache_decoder_only into huggingface:main on Aug 7, 2024.
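Per the title and commit log, this PR moves a set of decoder-only models (codegen, falcon, git, gpt_neo, gpt_neox, gptj, idefics) to the new `Cache` classes, with the legacy tuple format deprecated until v4.45. Below is a minimal sketch of what this looks like from the user side; the checkpoint name is an assumption, chosen only because GPT-NeoX is among the migrated architectures.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

# Any decoder-only model touched by this PR; the checkpoint here is only illustrative.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m")  # GPT-NeoX architecture

inputs = tokenizer("The new cache format", return_tensors="pt")

# Pass an explicit Cache object; migrated models update and return it in place of
# the legacy tuple of (key, value) tensors.
cache = DynamicCache()
with torch.no_grad():
    outputs = model(**inputs, use_cache=True, past_key_values=cache)
print(type(outputs.past_key_values))  # expected: DynamicCache

# Legacy tuples are still accepted during the deprecation window (until v4.45 per the
# commit log) and can be converted explicitly.
legacy = outputs.past_key_values.to_legacy_cache()
roundtrip = DynamicCache.from_legacy_cache(legacy)
```

Passing the old tuple format still works during the deprecation window but, per the "deprecate until v4.45 and warn if not training" commit, is expected to emit a warning outside of training.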
Changes shown are from 20 of the 44 commits.

Commits (44, all by zucchini-nlp):
183cd66  draft bart with new cache
4578bca  add cache for decoder-only models
9505ca4  revert utils
2ab28f3  modify docstring
5fe4e9e  revert bart
09413c3  minor fixes
3c27604  fix copies (not related)
350acc5  revert tests
c0adf10  remove enc-dec related code
c18b177  remove bloom
582f289  remove opt (enc-dec)
3141a71  Merge remote-tracking branch 'upstream/main' into dynamic_cache_decod…
33d54b4  update docstring
dd05e6b  git, codegen, gpt_neo, gpt_neox, gpj
cb878d5  clean up
0588791  copied from statements
a27b47c  revert
1abcf30  tmp
00ed88c  update warning msg
6c3b3aa  forgot git
fd5eeab  add more flags
e233f29  run-slow git,codegen,gpt_neo,gpt_neox,gpj
356d578  add cache flag to VLMs
c906670  remove files
08d9e6f  Merge branch 'main' into dynamic_cache_decoder_only
56c05b2  style
8510810  video LLMs also need a flag
cebb55d  style
8fd9dd1  llava will go in another PR
4b9ced1  Merge branch 'main' into dynamic_cache_decoder_only
aea219b  style
4991863  [run-slow] codegen, falcon, git, gpt_neo, gpt_neox, gptj, idefics
ec306a2  Update src/transformers/models/gpt_neo/modeling_gpt_neo.py
cf793b7  copy from
c92409c  deprecate until v4.45 and warn if not training
c2b97e4  nit
35b60de  fix test
d2fca9a  test static cache
0933350  Merge branch 'main' into dynamic_cache_decoder_only
42349d4  add more tests and fix models
45c3a1b  fix copies
5f22616  return sliding window mask
f5af6a2  run slow tests & fix + codestyle
21b45c5  one more falcon fix for alibi
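The "add more flags", "add cache flag to VLMs", and "test static cache" commits suggest the migrated models are also exercised with the pre-allocated static cache through `generate`. A rough sketch of that kind of check, assuming static-cache support landed for the model used (checkpoint and prompt are placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Checkpoint chosen only for illustration; any migrated decoder-only model would do.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m")

inputs = tokenizer("Static cache test:", return_tensors="pt")

# Greedy decoding with the default (dynamic) cache vs. the pre-allocated static cache;
# the generated tokens should match.
dynamic_out = model.generate(**inputs, do_sample=False, max_new_tokens=20)
static_out = model.generate(
    **inputs, do_sample=False, max_new_tokens=20, cache_implementation="static"
)
assert (dynamic_out == static_out).all()
print(tokenizer.decode(static_out[0], skip_special_tokens=True))
```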
New file added in the diff (+16 lines):

```python
from transformers import AutoTokenizer, BartForConditionalGeneration

model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn").to("cuda:0")
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")

ARTICLE_TO_SUMMARIZE = (
    "PG&E stated it scheduled the blackouts in response to forecasts for high winds "
    "amid dry conditions. The aim is to reduce the risk of wildfires. Nearly 800 thousand customers were "
    "scheduled to be affected by the shutoffs which were expected to last through at least midday tomorrow."
)
inputs = tokenizer(ARTICLE_TO_SUMMARIZE, return_tensors="pt").to("cuda:0")

# Generate Summary
summary_ids = model.generate(**inputs, num_beams=1, do_sample=False, max_new_tokens=30, use_cache=False)
out = tokenizer.batch_decode(summary_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
print(out)
```
src/transformers/models/gpt_neox/modeling_gpt_neox.py: 371 changes (252 additions, 119 deletions); large diff not rendered by default.
These kinds of changes come from fix-copies and are not related to this PR at all, but let's leave them in, since they're about code consistency in the library anyway.