Pinned
- vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs
- casper-hansen/AutoAWQ: AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
- huggingface/transformers: 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.