@njhill
Member

@njhill njhill commented Oct 29, 2025

There may be circumstances other than preemption where a running request is temporarily not scheduled in the batch for some step(s). These need to be handled similarly to the async scheduling + preemption fix made in #26385.

This PR also streamlines how resumed requests are recorded in CachedRequestData.

Signed-off-by: Nick Hill <nhill@redhat.com>
@njhill njhill added the bug Something isn't working label Oct 29, 2025
@mergify mergify bot added v1 tpu Related to Google TPUs kv-connector labels Oct 29, 2025
Comment on lines +448 to +450
threshold = self.scheduler_config.long_prefill_token_threshold
if 0 < threshold < num_new_tokens:
    num_new_tokens = threshold
Member Author

Unrelated simplification; it hurt me to look at that formatting :)
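
For readers unfamiliar with Python's chained comparisons, the simplified check can be exercised in isolation. This is a minimal sketch: `cap_prefill_tokens` is a hypothetical helper written for illustration, not a function in the scheduler.

```python
def cap_prefill_tokens(num_new_tokens: int, threshold: int) -> int:
    """Hypothetical helper mirroring the chained comparison in the diff."""
    # Cap only when the threshold is positive and smaller than the request;
    # a non-positive threshold leaves the token count unchanged.
    if 0 < threshold < num_new_tokens:
        num_new_tokens = threshold
    return num_new_tokens


print(cap_prefill_tokens(1000, 256))  # threshold is positive and smaller
print(cap_prefill_tokens(100, 256))   # request is already under the threshold
print(cap_prefill_tokens(1000, 0))    # non-positive threshold, no cap
```

The single chained expression is equivalent to `threshold > 0 and threshold < num_new_tokens`, with `threshold` evaluated once.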

gemini-code-assist[bot]

This comment was marked as resolved.

# the request's block IDs. For those in the set, new_block_ids will be used as the
# request's block IDs instead of appending to the existing block IDs.
resumed_from_preemption: list[bool]
resumed_req_ids: set[str]
Member Author

Changing this to a set since these will be rare and we currently are creating a [None] * batch_size list every time.
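
As a rough illustration of how a consumer might use the set, the sketch below replaces block IDs for resumed requests and appends for the rest, as the code comment in the diff describes. The function name `apply_new_block_ids` and the flat `list[int]` block ID layout are assumptions made for this example; the real worker-side code may be structured differently.

```python
def apply_new_block_ids(
    block_ids: dict[str, list[int]],
    req_ids: list[str],
    new_block_ids: list[list[int]],
    resumed_req_ids: set[str],
) -> None:
    """Hypothetical consumer of resumed_req_ids (not the actual vLLM code)."""
    for req_id, new_ids in zip(req_ids, new_block_ids):
        if req_id in resumed_req_ids:
            # Resumed request: new_block_ids becomes the request's block IDs.
            block_ids[req_id] = list(new_ids)
        else:
            # Otherwise: append to the existing block IDs.
            block_ids.setdefault(req_id, []).extend(new_ids)


block_ids = {"a": [1, 2], "b": [3]}
apply_new_block_ids(block_ids, ["a", "b"], [[7], [8, 9]], resumed_req_ids={"b"})
print(block_ids)
```

In the common case no request is resumed, so the set is simply empty and nothing of batch size has to be allocated per step.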

chatgpt-codex-connector[bot]

This comment was marked as resolved.

njhill and others added 2 commits October 29, 2025 09:43
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
@njhill njhill added suppress-bc-linter ready ONLY add when PR is ready to merge/full CI is needed labels Oct 29, 2025
@njhill njhill requested a review from benchislett October 29, 2025 19:57
@njhill njhill mentioned this pull request Oct 29, 2025
10 tasks
@mergify

mergify bot commented Oct 29, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @njhill.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Oct 29, 2025
# Conflicts:
#	vllm/distributed/kv_transfer/kv_connector/v1/shared_storage_connector.py
Comment on lines +120 to +131
@cached_property
@deprecated("use resumed_req_ids field")
def resumed_from_preemption(self) -> list[bool]:
    return [req_id in self.resumed_req_ids for req_id in self.req_ids]

@cached_property
@deprecated("use all_token_ids field")
def resumed_req_token_ids(self) -> list[list[int] | None]:
    return [
        self.all_token_ids[req_id] if req_id in self.resumed_req_ids else None
        for req_id in self.req_ids
    ]
Member Author

These are for backwards compatibility.
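
A self-contained sketch of how such shims behave is shown below. `CachedRequestDataSketch` is a hypothetical, trimmed-down stand-in for the real class, with only the fields the two properties touch, and `warnings.warn` stands in for the `@deprecated` decorator, whose import is not shown in the diff.

```python
from __future__ import annotations

import warnings
from dataclasses import dataclass, field
from functools import cached_property


@dataclass
class CachedRequestDataSketch:
    """Hypothetical stand-in for CachedRequestData (illustration only)."""

    req_ids: list[str]
    resumed_req_ids: set[str] = field(default_factory=set)
    all_token_ids: dict[str, list[int]] = field(default_factory=dict)

    @cached_property
    def resumed_from_preemption(self) -> list[bool]:
        # Assumption: warnings.warn replaces the @deprecated decorator here.
        warnings.warn("use resumed_req_ids field", DeprecationWarning)
        return [req_id in self.resumed_req_ids for req_id in self.req_ids]

    @cached_property
    def resumed_req_token_ids(self) -> list[list[int] | None]:
        warnings.warn("use all_token_ids field", DeprecationWarning)
        return [
            self.all_token_ids[req_id] if req_id in self.resumed_req_ids else None
            for req_id in self.req_ids
        ]


data = CachedRequestDataSketch(
    req_ids=["a", "b"],
    resumed_req_ids={"b"},
    all_token_ids={"b": [1, 2, 3]},
)
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    print(data.resumed_from_preemption)
    print(data.resumed_req_token_ids)
```

Because the properties are cached, the per-batch lists are only built if legacy code actually reads them, and at most once per instance.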

Collaborator

@benchislett benchislett left a comment

LGTM, discussed online. I would like to see a test case for the new coverage, but feel free to go ahead and merge at any point.

@njhill
Member Author

njhill commented Oct 30, 2025

For follow-on:

  • Maybe rename new CachedRequestData fields
  • Unit test covering preempted/unscheduled cases

@njhill njhill merged commit 2ce5c5d into vllm-project:main Oct 30, 2025
54 checks passed
@njhill njhill deleted the handle-unscheduled branch October 30, 2025 04:05
MatthewBonanni pushed a commit to MatthewBonanni/vllm that referenced this pull request Oct 30, 2025
adobrzyn added a commit to vllm-project/vllm-gaudi that referenced this pull request Oct 31, 2025
…equests properly when async scheduling #27756 (#507)

Culprit commit: vllm-project/vllm#27756

---------

Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
Signed-off-by: Michał Kuligowski <michal.kuligowski@intel.com>
Signed-off-by: Agata Dobrzyniewicz <160237065+adobrzyn@users.noreply.github.com>
Co-authored-by: Michał Kuligowski <michal.kuligowski@intel.com>