vllm-project / vllm Public

Pull requests: vllm-project/vllm

472 Open 5,371 Closed

Pull requests list

[Build/CI] Fix libcuda.so linkage (labels: ci/build, ready)

Label "ready": ONLY add when PR is ready to merge/full CI is needed

#12424 opened Jan 25, 2025 by tlrmchlsmth

[ROCm][AMD][Model] llama 3.2 support upstreaming

#12421 opened Jan 24, 2025 by maleksan85

Fix the pydantic logging validator (labels: frontend)

#12420 opened Jan 24, 2025 by maxdebayser

[Misc] Add offline test for disaggregated prefill

#12418 opened Jan 24, 2025 by Shaoting-Feng

[Bugfix] Disable w16a16 2of4 sparse CompressedTensors24 (labels: ready)

#12417 opened Jan 24, 2025 by tlrmchlsmth

[V1][Metrics] Add initial Prometheus logger (labels: ready)

#12416 opened Jan 24, 2025 by markmc • Draft

[V1] Revert uncache_blocks and support recaching full blocks (labels: ready)

#12415 opened Jan 24, 2025 by comaniac

[Usage] Add pipeline parallelism for usage stats

#12414 opened Jan 24, 2025 by simon-mo

[Frontend] Support override generation config in args (labels: ready)

#12409 opened Jan 24, 2025 by liuyanyi

[Bugfix] Fix benchmark script bug: inaccurate stats for vllm backend when max_model_len < input_len + output_len

#12407 opened Jan 24, 2025 by WangErXiao

[ci/build] detect and auto use cxx abi (labels: ci/build)

#12403 opened Jan 24, 2025 by youkaichao

[MISC] add arg pad_for_invariant_seq_len

#12397 opened Jan 24, 2025 by MengqingCao

[Bugfix] Fix output_tokens is 0 if using tgi backend

#12394 opened Jan 24, 2025 by sywangyi

[torch.compile] PyTorch 2.6 and nightly compatibility

#12393 opened Jan 24, 2025 by youkaichao

[Hardware][Intel GPU] add XPU bf16 support (labels: documentation)

Label "documentation": Improvements or additions to documentation

#12392 opened Jan 24, 2025 by jikunshang

[V1][Core] Structured decoding on scheduler-level

#12388 opened Jan 24, 2025 by aarnphm • Draft

[Misc] Add BNB quantization for Whisper

#12381 opened Jan 24, 2025 by jeejeelee • Draft

[Frontend] Rerank API (Jina- and Cohere-compatible API) (labels: documentation, frontend)

#12376 opened Jan 24, 2025 by K-Mistele

[Core] add and implement VLLM_LOGITS_PROCESSOR_THREADS

#12368 opened Jan 23, 2025 by akeshet

[Hardware][Intel-Gaudi] Enable FusedSDPA support for Intel Gaudi (HPU)

#12359 opened Jan 23, 2025 by SanjuCSudhakaran • Draft

[ROCm] Faster Custom Paged Attention kernels (labels: ci/build, rocm)

#12348 opened Jan 23, 2025 by tjtanaa • Draft

[Bugfix] handle alignment of arguments in convert_sparse_cross_attention_mask_to_dense

#12347 opened Jan 23, 2025 by tjohnson31415

FLOP counting for vLLM inference

#12341 opened Jan 23, 2025 by dianastea • Draft

[Build] Only build 9.0a for scaled_mm and sparse kernels (labels: ci/build, ready)

#12339 opened Jan 23, 2025 by LucasWilkinson

[Frontend] Generate valid tool call IDs when using tokenizer-mode=mistral (labels: frontend)

#12332 opened Jan 22, 2025 by rafvasq
