Pull requests: vllm-project/vllm

Pull requests list

Make the cv2 dependency optional
Labels: performance (Performance-related issues)
#27780 opened Oct 30, 2025 by cmpute
3 of 5 tasks
[Bugfix] change FlashMLA reorder_batch_threshold
Labels: v1
#27777 opened Oct 30, 2025 by MatthewBonanni
5 tasks
[FEATURE] Upstream VIT FA RDNA3 ROCM
Labels: ci/build, qwen (Related to Qwen models), rocm (Related to AMD ROCm)
#27776 opened Oct 30, 2025 by JartX Draft
[Model] Add Gemma3 GGUF multimodal support
Labels: ci/build, v1
#27772 opened Oct 29, 2025 by lucianommartins
4 tasks done
[KV offload] Enable CPU KV offload on CUDA alike Platforms
Labels: rocm (Related to AMD ROCm), v1
#27770 opened Oct 29, 2025 by zhewenl
Reapply "Install pre-built xformers-0.0.32.post2 built with pt-2.9.0"
Labels: ci/build, ready (ONLY add when PR is ready to merge/full CI is needed)
#27768 opened Oct 29, 2025 by huydhn
[CI Test] Add Scheduled Integration Test
Labels: ci/build, ready (ONLY add when PR is ready to merge/full CI is needed)
#27765 opened Oct 29, 2025 by yewentao256
[Qwen][Multimodal] Move Qwen2_5_vl sdpa to custom op until tensor slicing supported
Labels: qwen (Related to Qwen models)
#27764 opened Oct 29, 2025 by Lucaskabela
3 of 5 tasks
[BugFix] Stopgap - Flashinfer Autotuner + GPT-OSS + DP/TP
Labels: gpt-oss (Related to GPT-OSS models)
#27762 opened Oct 29, 2025 by varun-sundar-rabindranath
[Multimodal] Make MediaConnector extensible.
Labels: frontend, multi-modality (Related to multi-modality (#4194))
#27759 opened Oct 29, 2025 by huachenheli
5 tasks
[Model] Add PaddleOCR-VL Model Support
Labels: new-model (Requests to new models)
#27758 opened Oct 29, 2025 by zhang-prog
[UX] Include NVTX in cuda.txt
Labels: ci/build
#27757 opened Oct 29, 2025 by jeejeelee
5 tasks
[BugFix] Handle unscheduled requests properly when async scheduling
Labels: bug (Something isn't working), kv-connector, ready (ONLY add when PR is ready to merge/full CI is needed), suppress-bc-linter, tpu (Related to Google TPUs), v1
#27756 opened Oct 29, 2025 by njhill
[Kernel] Enable FusedMoEModularKernel support bias
#27754 opened Oct 29, 2025 by jeejeelee
5 tasks
[Hybrid] Pass kernel block size to builders
Labels: v1
#27753 opened Oct 29, 2025 by tdoublep Draft
5 tasks
reasoning_content -> reasoning
Labels: deepseek (Related to DeepSeek models), documentation (Improvements or additions to documentation), frontend, gpt-oss (Related to GPT-OSS models), qwen (Related to Qwen models), structured-output, tool-calling, v1
#27752 opened Oct 29, 2025 by hmellor
[Refactor] Remove VLLM_DEEPEP_LOW_LATENCY_ALLOW_NVLINK
Labels: ready (ONLY add when PR is ready to merge/full CI is needed)
#27750 opened Oct 29, 2025 by yewentao256
[test/dnm] do not merge: ci-infra dummy PR
#27749 opened Oct 29, 2025 by dougbtv
[Bugfix][ROCm] Fix ViT rotary embeddings for torch.compile compatibility on ROCm
Labels: qwen (Related to Qwen models), rocm (Related to AMD ROCm)
#27748 opened Oct 29, 2025 by vllmellm Draft
5 tasks
Cleanup basic and entrypoint test organisation
Labels: ci/build, llama (Related to Llama models), tool-calling
#27747 opened Oct 29, 2025 by hmellor
[Model][Qwen3VL] Add torch.compile support for Qwen3VL
Labels: qwen (Related to Qwen models)
#27741 opened Oct 29, 2025 by lgeiger Draft
[Qwen3-Next] MOE config for A100-SXM4-80GB TP4
Labels: qwen (Related to Qwen models)
#27740 opened Oct 29, 2025 by toulzx