Pull requests: vllm-project/vllm
[Build/CI] Fix libcuda.so linkage
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#12424
opened Jan 25, 2025 by
tlrmchlsmth
[Misc] Add offline test for disaggregated prefill
#12418
opened Jan 24, 2025 by
Shaoting-Feng
[Bugfix] Disable w16a16 2of4 sparse CompressedTensors24
ready
#12417
opened Jan 24, 2025 by
tlrmchlsmth
[V1][Metrics] Add initial Prometheus logger
ready
[V1] Revert uncache_blocks and support recaching full blocks
ready
#12415
opened Jan 24, 2025 by
comaniac
[Frontend] Support override generation config in args
ready
#12409
opened Jan 24, 2025 by
liuyanyi
[Bugfix] Fix benchmark script bug: inaccurate stats for vllm backend when max_model_len < input_len + output_len
#12407
opened Jan 24, 2025 by
WangErXiao
[Bugfix] Fix output_tokens is 0 if using tgi backend
#12394
opened Jan 24, 2025 by
sywangyi
[torch.compile] PyTorch 2.6 and nightly compatibility
#12393
opened Jan 24, 2025 by
youkaichao
[Hardware][Intel GPU] add XPU bf16 support
documentation
Improvements or additions to documentation
#12392
opened Jan 24, 2025 by
jikunshang
[Frontend] Rerank API (Jina- and Cohere-compatible API)
documentation
frontend
#12376
opened Jan 24, 2025 by
K-Mistele
[Core] add and implement VLLM_LOGITS_PROCESSOR_THREADS
#12368
opened Jan 23, 2025 by
akeshet
[Hardware][Intel-Gaudi] Enable FusedSDPA support for Intel Gaudi (HPU)
#12359
opened Jan 23, 2025 by
SanjuCSudhakaran
Draft
[Bugfix] handle alignment of arguments in convert_sparse_cross_attention_mask_to_dense
#12347
opened Jan 23, 2025 by
tjohnson31415
[Build] Only build 9.0a for scaled_mm and sparse kernels
ci/build
ready
#12339
opened Jan 23, 2025 by
LucasWilkinson
[Frontend] Generate valid tool call IDs when using tokenizer-mode=mistral
frontend
#12332
opened Jan 22, 2025 by
rafvasq