Pull requests: vllm-project/vllm

- [Bugfix] Fix VLLM_USE_MODELSCOPE issue (#13384, opened Feb 17, 2025 by r4ntix)
- [VLM] Check required fields before initializing field config in DictEmbeddingItems (#13380, opened Feb 17, 2025 by DarkLight1337) [labels: documentation, ready]
- [MISC] tiny fixes (#13378, opened Feb 17, 2025 by MengqingCao) [labels: ready]
- [V1] Support bad_words in sampler (#13376, opened Feb 17, 2025 by 22quinn, draft) [labels: v1]
- set chunked_prefill off when use mla (#13374, opened Feb 17, 2025 by DragonFive)
- [Bugfix] fix xpu communicator (#13368, opened Feb 17, 2025 by yma11)
- [Quant] Arctic SupportsQuant (#13366, opened Feb 17, 2025 by kylesayrs)
- Make log statistics interval configurable (#13356, opened Feb 16, 2025 by Sakalya) [labels: v1]
- [Benchmark] Add LongBench to benchmark_serving (#13350, opened Feb 16, 2025 by YuhanLiu11)
- [V1] Get input tokens from scheduler (#13339, opened Feb 15, 2025 by WoosukKwon) [labels: ready, v1]
- [Quant] Molmo SupportsQuant (#13336, opened Feb 15, 2025 by kylesayrs)
- [Core] Faster logit_bias_logits_processor frontend (#13334, opened Feb 15, 2025 by xu-song)
- [Kernel] moe wna16 cuda kernel (#13321, opened Feb 15, 2025 by jinzhen-lin) [labels: ci/build]
- [Model] Add support for GraniteMoeShared models (#13313, opened Feb 15, 2025 by tjohnson31415)