Pull requests: vllm-project/vllm-ascend
Pull requests list
support bailing25 quant
module:quantization
#8685
opened Apr 24, 2026 by
alex101-ops
Contributor
[CI][Cherry-pick] Relax TTFT benefits threshold from 0.4 to 0.5 to account for DP load imbalance
#8684
opened Apr 24, 2026 by
underfituu
Contributor
[CI][Main] Relax TTFT benefits threshold from 0.4 to 0.5 to account for DP load imbalance
module:tests
#8683
opened Apr 24, 2026 by
underfituu
Contributor
[CI] add nightly MiniMax-M2.5-w8a8-QuaRot
ci/build
module:tests
nightly-test
#8681
opened Apr 24, 2026 by
weixinAc
[CI] Add nightly case:GLM-5_1-W8A8
ci/build
module:tests
nightly-test
#8680
opened Apr 24, 2026 by
guxin108
Contributor
[BugFix] Fix DSV3.1 W4A8 TTFT degradation
ready
ready for review
ready-for-test
start test by label for PR
#8675
opened Apr 24, 2026 by
wangbj127
Contributor
[v0.18.0][BugFix] Fix DSV3.1 W4A8 TTFT degradation
ready
ready for review
ready-for-test
start test by label for PR
#8674
opened Apr 24, 2026 by
wangbj127
Contributor
[CI] repair ci customop for main
module:tests
#8673
opened Apr 24, 2026 by
ZT-AIA
Contributor
[Doc][0.18.0] Fix the wrong triton uninstall guide
#8672
opened Apr 24, 2026 by
Tflowers-0129
Contributor
In scenarios A2 and A3, replace npu_fusion_attention with the _npu_flash_attention_unpad operator.
module:ops
#8671
opened Apr 24, 2026 by
chenxi-hh
Collaborator
[Doc]Update Qwen3-Omni-30B-A3B-Thinking.md
#8669
opened Apr 24, 2026 by
tanhaoan333
Collaborator
[BugFix][Eagle3] Add fullgraph case and check mock function
module:tests
#8668
opened Apr 24, 2026 by
lilinsiman
Collaborator
[CI] add nightly case: Kimi-2.5
ci/build
module:tests
nightly-test
#8667
opened Apr 24, 2026 by
chen-commits
[Test]Add quantization test case
module:tests
#8666
opened Apr 24, 2026 by
kunpengW-code
Contributor
Fix formatting of gpu-memory-utilization flag
documentation
Improvements or additions to documentation
#8665
opened Apr 24, 2026 by
zkryakgul
[Ops][BugFix] Fix QwenVL models weight_scale reshape
module:quantization
#8661
opened Apr 24, 2026 by
ksiyuan
Contributor
[Doc] Fix documentation formatting and improve code examples
documentation
Improvements or additions to documentation
module:tests
#8660
opened Apr 24, 2026 by
MrZ20
Contributor
[Attention][Feature] adapt bailing_moe_linear on Ascend
module:core
module:ops
#8657
opened Apr 24, 2026 by
ghphotoframe
Contributor
[BugFix] Routing replay support multi dp & tp
module:ops
#8651
opened Apr 24, 2026 by
cocacolafan
[BugFix]Correct A5 MLAPO when using latest PTA
#8647
opened Apr 24, 2026 by
lijiahang226
Contributor
[Ops][BugFix] Fix the issue of overlapping CPU binding ranges in specific scenarios
module:core
#8645
opened Apr 24, 2026 by
Rozwel-dx
Contributor
[Doc] Translated Doc files 2026-04-24
documentation
Improvements or additions to documentation
#8644
opened Apr 24, 2026 by
vllm-ascend-ci
Collaborator
[BugFix][Sample] Preserve logprobs mode in Ascend sampler backend
module:tests
#8643
opened Apr 24, 2026 by
xunxunboy