[BugFix] Handle unscheduled requests properly when async scheduling #27756
Conversation
Signed-off-by: Nick Hill <nhill@redhat.com>
```python
threshold = self.scheduler_config.long_prefill_token_threshold
if 0 < threshold < num_new_tokens:
    num_new_tokens = threshold
```
Unrelated simplification; it hurt me to look at that formatting :)
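For readers unfamiliar with the idiom, here is a minimal standalone sketch of the chained-comparison form used above; the function wrapper and its name are ours, purely for illustration:

```python
def cap_new_tokens(num_new_tokens: int, threshold: int) -> int:
    """Cap newly scheduled tokens at a positive threshold; 0 disables the cap."""
    # Chained comparison: true only when threshold > 0 AND threshold < num_new_tokens,
    # i.e. equivalent to `threshold > 0 and num_new_tokens > threshold`.
    if 0 < threshold < num_new_tokens:
        num_new_tokens = threshold
    return num_new_tokens


assert cap_new_tokens(100, 0) == 100   # threshold disabled
assert cap_new_tokens(100, 32) == 32   # capped
assert cap_new_tokens(16, 32) == 16    # already under the threshold
```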
```diff
 # the request's block IDs. For those in the set, new_block_ids will be used as the
 # request's block IDs instead of appending to the existing block IDs.
-resumed_from_preemption: list[bool]
+resumed_req_ids: set[str]
```
Changing this to a set, since these will be rare and we are currently creating a `[None] * batch_size` list every time.
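A rough sketch of the tradeoff; all names other than `resumed_req_ids` are illustrative, not vLLM code:

```python
batch_size = 512
req_ids = [f"req-{i}" for i in range(batch_size)]

# Old shape: a dense list rebuilt every step, even though resumption is rare.
resumed_from_preemption = [False] * batch_size  # O(batch_size) allocation per step

# New shape: a set that is empty on almost every step; membership checks are
# O(1) and no per-step allocation scales with the batch size.
resumed_req_ids: set[str] = {"req-7"}  # the rare resumed request

for req_id in req_ids:
    if req_id in resumed_req_ids:
        pass  # replace the request's block IDs instead of appending to them
```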
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
This pull request has merge conflicts that must be resolved before it can be merged.
# Conflicts:
#	vllm/distributed/kv_transfer/kv_connector/v1/shared_storage_connector.py
```python
@cached_property
@deprecated("use resumed_req_ids field")
def resumed_from_preemption(self) -> list[bool]:
    return [req_id in self.resumed_req_ids for req_id in self.req_ids]

@cached_property
@deprecated("use all_token_ids field")
def resumed_req_token_ids(self) -> list[list[int] | None]:
    return [
        self.all_token_ids[req_id] if req_id in self.resumed_req_ids else None
        for req_id in self.req_ids
    ]
```
These are for backwards compatibility.
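A self-contained sketch of how these shims behave for legacy callers; the class and field layout below are illustrative stand-ins, not the actual vLLM `CachedRequestData` definition, and we assume `deprecated` comes from `typing_extensions` (PEP 702):

```python
from dataclasses import dataclass
from functools import cached_property

from typing_extensions import deprecated


@dataclass
class CachedRequestDataSketch:
    req_ids: list[str]
    resumed_req_ids: set[str]
    all_token_ids: dict[str, list[int]]

    @cached_property
    @deprecated("use resumed_req_ids field")
    def resumed_from_preemption(self) -> list[bool]:
        # Rebuild the old dense-list view on demand for legacy callers.
        return [req_id in self.resumed_req_ids for req_id in self.req_ids]


data = CachedRequestDataSketch(
    req_ids=["a", "b"],
    resumed_req_ids={"b"},
    all_token_ids={"b": [1, 2, 3]},
)
# Accessing the property emits a DeprecationWarning and returns the legacy view.
assert data.resumed_from_preemption == [False, True]
```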
LGTM, discussed online. I would like to see a test case for the new coverage; we can go ahead and merge at any point.
For follow-on:
…llm-project#27756)
Signed-off-by: Nick Hill <nhill@redhat.com>
…equests properly when async scheduling #27756 (#507)
Culprit commit: vllm-project/vllm#27756
---------
Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
Signed-off-by: Michał Kuligowski <michal.kuligowski@intel.com>
Signed-off-by: Agata Dobrzyniewicz <160237065+adobrzyn@users.noreply.github.com>
Co-authored-by: Michał Kuligowski <michal.kuligowski@intel.com>
There may be circumstances other than preemption where a running request is temporarily not scheduled in the batch for some step(s). These need to be handled similarly to the async scheduling + preemption fix made in #26385.

This PR also streamlines how resumed requests are recorded in `CachedRequestData`.
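To make the failure mode concrete, here is a purely illustrative sketch (none of these names come from vLLM) of why a request that skips a step must be treated like a preemption-resume when it reappears:

```python
# Scheduled request IDs across three consecutive steps.
step1 = {"a", "b", "c"}
step2 = {"a", "c"}       # "b" is still running but temporarily not scheduled
step3 = {"a", "b", "c"}  # "b" is scheduled again


def reentering(prev_step: set[str], cur_step: set[str], running: set[str]) -> set[str]:
    """Running requests absent last step but scheduled this step.

    With async scheduling, the worker's cached state for these requests may be
    stale, so their block IDs must be replaced wholesale (as after a
    preemption-resume) rather than appended to.
    """
    return (cur_step - prev_step) & running


running = {"a", "b", "c"}
assert reentering(step2, step3, running) == {"b"}
```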