SqueezeAILab / SqueezeLLM Public

Fork 43
Star 651

Issues: SqueezeAILab/SqueezeLLM

15 Open 12 Closed

Issues list

When I use SqueezeLLM to quantize the LLaMA2-13B model and test it, the speed is extremely slow.

#71 opened Jul 3, 2024 by zhangfzR

Can you update the version that can quant OPT family?

#70 opened Jun 26, 2024 by Chen-1031

how can I get the models of 0.45% sparsity by myself?

#69 opened Jun 17, 2024 by LiMa-cas

Why do LLaMA-2-7B have s0 quantized models, but no s5 and s45 sparsity quantized models?

#68 opened May 26, 2024 by Evane5cence

Further speeding up the quantization process

#67 opened May 5, 2024 by SyphonArch

Installation instructions did not lead to the local transformers version being selected, giving errors

#66 opened Apr 9, 2024 by RDouglasSharp

Support JAIS models

#65 opened Mar 24, 2024 by 7ossam81

Dense-only quantization bit precision

#63 opened Mar 5, 2024 by akarkim

D+S packing in vLLM seems buggy

#62 opened Feb 27, 2024 by MingLin-home

sample_weight is negative when running kmeans clustering

#61 opened Feb 23, 2024 by MingLin-home

A question about LLaMA-2-7B and Mistral models only provide Dense-only (0%) quantized models

#56 opened Feb 4, 2024 by WeiMa01

On A100 card, speed-up effect does not show up.

#51 opened Nov 30, 2023 by leocnj

Future plan for this project

#45 opened Nov 1, 2023 by tjtanaa

Vicuna-1.5?

#44 opened Oct 29, 2023 by mlinmg

Minor bug for --include_sparse

#39 opened Aug 11, 2023 by vuiseng9
