Conversation

@ad-astra-video (Collaborator) commented Oct 28, 2025

Four small updates in this:

  1. update to vLLM 0.11.0
  2. update to CUDA 12.9.1 so the Docker image matches the vLLM Docker build
  3. ensure the tensor and pipeline parallel check does not fail to load the pipeline
  4. remove max_num_batched_tokens = max_model_len so that chunked prefill stays enabled by default:

livepeer-ai-llm  | INFO 10-29 11:01:00 [model.py:1510] Using max model len 200000
livepeer-ai-llm  | INFO 10-29 11:01:00 [scheduler.py:205] Chunked prefill is enabled with max_num_batched_tokens=2048
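The parallelism fix in item 3 can be sketched as follows. This is an illustrative stand-in, not the PR's actual code: the function name and fallback behavior are assumptions; the idea is simply that a tensor/pipeline parallel configuration check should degrade gracefully instead of aborting pipeline load.

```python
def resolve_parallelism(gpu_count: int,
                        tensor_parallel_size: int,
                        pipeline_parallel_size: int) -> tuple[int, int]:
    """Return (tp, pp) sizes that fit on the available GPUs.

    Hypothetical helper: instead of raising (which would prevent the
    pipeline from loading), fall back to a single-GPU configuration when
    the requested layout needs more GPUs than are present.
    """
    requested = tensor_parallel_size * pipeline_parallel_size
    if 0 < requested <= gpu_count:
        return tensor_parallel_size, pipeline_parallel_size
    # Fallback: degrade gracefully rather than failing to load.
    return 1, 1

# TP=4 x PP=2 needs 8 GPUs; on a 4-GPU host it falls back to (1, 1),
# while TP=2 x PP=2 fits and is kept.
print(resolve_parallelism(4, 4, 2))  # -> (1, 1)
print(resolve_parallelism(4, 2, 2))  # -> (2, 2)
```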

Copilot AI review requested due to automatic review settings October 28, 2025 10:51

Copilot AI left a comment


Pull Request Overview

This PR updates the LLM dependencies and Docker environment to support vLLM 0.11.0. The changes primarily involve upgrading vLLM from version 0.10.0 to 0.11.0 and updating the Docker base image to use CUDA 12.9.1 to match vLLM's requirements.

Key Changes:

  • Updated vLLM dependency from 0.10.0 to 0.11.0
  • Upgraded Docker base image from CUDA 12.1.1 to CUDA 12.9.1 with Ubuntu 22.04
  • Modified Python package version constraints from pinned (==) to minimum versions (>=)
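The Docker-side changes above can be made concrete with a sketch. This fragment is illustrative only: the exact base-image tag and package lines in runner/docker/Dockerfile.llm may differ; the CUDA versions and the vllm constraint come from this PR's description.

```dockerfile
# Illustrative fragment -- tag names are assumptions, not the exact
# contents of runner/docker/Dockerfile.llm.

# Before (pinned to the older CUDA):
#   FROM nvidia/cuda:12.1.1-cudnn8-runtime-ubuntu22.04
FROM nvidia/cuda:12.9.1-cudnn-runtime-ubuntu22.04

# Version constraints relaxed from exact pins (==) to minimums (>=):
#   Before: RUN pip install "vllm==0.10.0"
RUN pip install "vllm>=0.11.0"
```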

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

  • runner/requirements.llm.in — updated vLLM version constraint from 0.10.0 to 0.11.0
  • runner/requirements.llm.txt — regenerated dependency lockfile (entire file removed, to be regenerated)
  • runner/docker/Dockerfile.llm — updated CUDA base image to 12.9.1 and changed package version constraints to use minimum versions


@ad-astra-video (Collaborator, PR author) commented:

@mikezupper when the new image builds for this PR can you test it by chance?

@ad-astra-video changed the title from "LLM: update to vllm version 0.11.0, upgrade to cuda 12.8.1" to "LLM: update to vllm version 0.11.0, upgrade to cuda 12.9.1" on Oct 28, 2025
2 participants