From 753d5aa340fe72cd071bb0ab10aa07cb9caedd34 Mon Sep 17 00:00:00 2001
From: torchspec-bot <262938024+torchspec-bot@users.noreply.github.com>
Date: Sat, 7 Mar 2026 14:09:47 -0800
Subject: [PATCH] [Doc] update README

---
 README.md | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/README.md b/README.md
index 20af846..5f1362c 100644
--- a/README.md
+++ b/README.md
@@ -14,19 +14,19 @@ TorchSpec supports two inference backends:
 
 | Backend | Best For | Installation |
 |---------|----------|--------------|
-| **SGLang** | Production workloads, high throughput | `./tools/build_conda.sh 1 sglang` (default) |
 | **vLLM** | Flexibility, easier deployment | `./tools/build_conda.sh 1 vllm` |
+| **SGLang** | Production workloads, high throughput | `./tools/build_conda.sh 1 sglang` |
 | **Both** | Development, comparison testing | `./tools/build_conda.sh 1 both` |
 
 ### Quick Setup
 
 ```bash
-# Install with SGLang (default)
-./tools/build_conda.sh
+# Install with vLLM
+./tools/build_conda.sh 1 vllm
 micromamba activate torchspec
 
-# Or install with vLLM
-./tools/build_conda.sh 1 vllm
+# Or install with SGLang
+./tools/build_conda.sh
 micromamba activate torchspec
 ```
 
@@ -43,14 +43,14 @@ pip install -e ".[fa]"
 
 ### Backend-Specific Usage
 
-**SGLang (default):**
+**vLLM:**
 ```bash
-./examples/qwen3-8b-single-node/run.sh
+./examples/qwen3-8b-single-node/run.sh --config configs/vllm_qwen3_8b.yaml
 ```
 
-**vLLM:**
+**SGLang:**
 ```bash
-./examples/qwen3-8b-single-node/run.sh --config configs/vllm_qwen3_8b.yaml
+./examples/qwen3-8b-single-node/run.sh
 ```
 
 TorchSpec uses vLLM's **Worker Extension** mechanism to hook into the model's forward pass and capture hidden states directly in the worker processes. This avoids RPC serialization issues and enables reliable hidden states extraction.
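
The patch's closing paragraph describes capturing hidden states by hooking the model's forward pass inside the worker process. A minimal sketch of that capture pattern is below, with stand-in classes; `FakeModel` and `HiddenStateCapture` are illustrative names only, not TorchSpec's or vLLM's actual API, and the real extension would wrap a vLLM worker's model rather than a toy object.

```python
# Hypothetical sketch of the "capture in the worker" idea: an extension
# object wraps forward() so outputs are stored in-process, never crossing
# an RPC boundary. All names here are invented for illustration.

class FakeModel:
    """Stand-in for a transformer; returns toy per-token 'hidden states'."""
    def forward(self, token_ids):
        return [[float(t)] * 4 for t in token_ids]  # one 4-dim vector per token

class HiddenStateCapture:
    """Extension mixin: wraps model.forward and keeps its output locally."""
    def install(self, model):
        self._captured = []
        original = model.forward
        def hooked(token_ids):
            out = original(token_ids)
            self._captured.append(out)  # stays in this process's memory
            return out                  # forward behavior is unchanged
        model.forward = hooked

model = FakeModel()
ext = HiddenStateCapture()
ext.install(model)
model.forward([1, 2, 3])
print(len(ext._captured))     # batches captured
print(len(ext._captured[0]))  # tokens in the first captured batch
```

Because the captured tensors never leave the worker, nothing needs to be serialized for an RPC reply, which is the failure mode the paragraph says this design avoids.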