Pull requests: vllm-project/vllm

Pull requests list

[Perf] Slight improvement of ITL with multiple GPUs
#31826 opened Jan 6, 2026 by access2rohit
2 of 5 tasks
[Model] Enable LoRA support for tower and connector in DotsOCR (labels: documentation)
#31825 opened Jan 6, 2026 by ShaanveerS
[CI] Add CUDA 13 nightly containers (labels: ci/build, nvidia)
#31822 opened Jan 6, 2026 by csahithi
5 tasks
[Bugfix] Handle mistral tokenizer in get_hf_processor (labels: multi-modality, ready, v1)
#31817 opened Jan 6, 2026 by DarkLight1337
5 tasks
[ROCm][AITER] bugfix accuracy regression in ROCM_AITER_TRITON_MLA backend (labels: rocm, v1)
#31816 opened Jan 6, 2026 by vllmellm
5 tasks
Enable LoRA support for tower and connector in Mistral and Voxtral (labels: deepseek, documentation, frontend, qwen)
#31812 opened Jan 6, 2026 by Anexdeus
Report error log after vllm bench serve (labels: performance, ready)
#31808 opened Jan 6, 2026 by elvircrn
5 tasks
[Core][NIXL] Support HMA+NixlConnector (labels: ci/build, deepseek, documentation, frontend, kv-connector, llama, multi-modality, needs-rebase, nvidia, performance, qwen, structured-output, tool-calling, tpu, v1)
#31802 opened Jan 6, 2026 by NickLucche Draft
6 tasks
[Doc] Update release docs (labels: documentation)
#31799 opened Jan 6, 2026 by DarkLight1337
5 tasks
[Chore] Try remove init_cached_hf_modules (labels: ready-run-all-tests, tpu, v1)
#31786 opened Jan 6, 2026 by DarkLight1337
5 tasks
[Fix] Use torch.empty for output in attention+quant fusion
#31785 opened Jan 6, 2026 by elvischenv
5 tasks
[Kernel] Support bias type in grouped_topk kernel
#31781 opened Jan 6, 2026 by xyang16
5 tasks
[Refactor] GLM-ASR Modeling
#31779 opened Jan 6, 2026 by JaredforReal
5 tasks
fixing stream interval > 1 will cause tool call bug
#31778 opened Jan 6, 2026 by MrIceCreamMan
3 of 5 tasks
[Model] Enable LoRA support for LLaVA family (labels: documentation)
#31772 opened Jan 6, 2026 by ppppqp
5 tasks done
[perf] Fused operator SplitMrope used in the Qwen2.5-Omni-7B model (labels: qwen)
#31763 opened Jan 6, 2026 by fuzhihong699 Draft
5 tasks
[Frontend] Add MCP tool streaming support to Responses API (labels: frontend, gpt-oss)
#31761 opened Jan 6, 2026 by daniel-salib