Convert per-block routing storage from Tensor to ndarray#2

Open
lmcafee-nvidia wants to merge 3 commits into sidsingh-nvidia:siddharth/support-nemo-rl-router-replay from lmcafee-nvidia:prefix-caching-router-record

@lmcafee-nvidia

Summary

  • Converts per-block MoE routing storage (block_routing dict, store_block_routing, get_block_routing, _reconstruct_routing_from_blocks, _store_routing_per_block) from torch.Tensor to np.ndarray, aligning with the conversion of routing_indices to np.ndarray in NVIDIA/Megatron-LM#3925 ("Dump router recording data into a ray store for Nemo-RL").
  • Updates all 7 TestPerBlockRouting tests to use numpy (np.int16 dtype, np.allclose, np.random.randint).
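
For readers without the diff at hand, here is a minimal sketch of what ndarray-backed per-block storage could look like. The class name `BlockRoutingStore`, the method signatures, and the `(tokens_in_block, num_layers, top_k)` shape are assumptions for illustration; only the names block_routing, store_block_routing, get_block_routing, and the np.int16 dtype come from this PR's description.

```python
import numpy as np


class BlockRoutingStore:
    """Hypothetical stand-in for the allocator's per-block routing storage."""

    def __init__(self):
        # Maps block_id -> np.ndarray of routing indices for that block's tokens.
        self.block_routing = {}

    def store_block_routing(self, block_id, routing):
        # np.asarray accepts lists and existing arrays; int16 mirrors the
        # dtype used by the updated tests.
        self.block_routing[block_id] = np.asarray(routing, dtype=np.int16)

    def get_block_routing(self, block_id):
        # Assumed convention: None means no routing was recorded for the block.
        return self.block_routing.get(block_id)


store = BlockRoutingStore()
# Assumed shape: (tokens_in_block, num_layers, top_k), expert ids in [0, 8).
routing = np.random.randint(0, 8, size=(4, 2, 2))
store.store_block_routing(7, routing)
assert np.allclose(store.get_block_routing(7), routing)
```

Keeping the stored values as plain ndarrays means they can be concatenated and compared without any device or framework handling, which is presumably the point of aligning with the routing_indices conversion.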

Test plan

  • All 7 TestPerBlockRouting tests pass (pytest ... ::TestPerBlockRouting -xvs)

🤖 Generated with Claude Code

lmcafee-nvidia and others added 3 commits March 19, 2026 12:24
…ility.

Move routing indices from per-request step-by-step accumulation to
per-block storage on KVBlockAllocator. At request completion, routing
is reconstructed by concatenating per-block routing in block order.
Matched (prefix-cached) blocks retain routing from the original request,
so reconstruction naturally covers all tokens including skipped prefixes.
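
The reconstruction described above can be sketched as follows. This is an illustration under stated assumptions, not the code in this commit: the function name `reconstruct_routing`, the plain dict standing in for the allocator's storage, and the return-None-on-missing-block policy are all hypothetical.

```python
import numpy as np


def reconstruct_routing(block_routing, block_ids):
    """Concatenate per-block routing along the token axis, in block order."""
    chunks = []
    for block_id in block_ids:
        chunk = block_routing.get(block_id)
        if chunk is None:
            # Assumed policy: one missing block makes the result unavailable.
            return None
        chunks.append(chunk)
    if not chunks:
        return None
    return np.concatenate(chunks, axis=0)


# Rows are tokens, columns are top-2 expert ids (one layer, for brevity).
block_routing = {
    0: np.array([[0, 1], [2, 3]], dtype=np.int16),  # prefix block shared by A and B
    1: np.array([[4, 5], [6, 7]], dtype=np.int16),  # request A's own block
    2: np.array([[1, 3]], dtype=np.int16),          # request B's partially filled block
}

routing_a = reconstruct_routing(block_routing, [0, 1])
routing_b = reconstruct_routing(block_routing, [0, 2])

# Request B never recomputed block 0, yet its routing still covers those tokens.
assert np.array_equal(routing_a[:2], routing_b[:2])
```

Because request B's block list includes the matched block 0, the routing recorded by request A is picked up without any special handling for skipped prefix tokens.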

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Aligns with PR NVIDIA#3925's conversion of routing_indices to np.ndarray.
Changes block_routing dict, store/get/reconstruct methods, and
_store_routing_per_block to use numpy throughout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
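
As a rough picture of the storing side, the sketch below slices a request's per-token routing array into block-sized chunks keyed by block id. It is only a guess at the general shape of such a step: the function name `split_routing_per_block`, its parameters, and the assumption that tokens fill blocks contiguously in block order are hypothetical and may not match what _store_routing_per_block does.

```python
import numpy as np


def split_routing_per_block(routing, block_ids, block_size):
    """Map each block id to the rows of `routing` for the tokens it holds."""
    per_block = {}
    for i, block_id in enumerate(block_ids):
        start = i * block_size
        chunk = routing[start:start + block_size]
        if chunk.shape[0] == 0:
            break  # no tokens recorded for this or any later block
        # astype copies by default, so stored chunks do not alias the input.
        per_block[block_id] = chunk.astype(np.int16)
    return per_block


# Five tokens with top-2 routing, spread over three blocks of size two.
routing = np.arange(10).reshape(5, 2)
per_block = split_routing_per_block(routing, block_ids=[3, 8, 5], block_size=2)

# Concatenating the chunks in block order recovers the original array.
rebuilt = np.concatenate([per_block[b] for b in [3, 8, 5]], axis=0)
assert np.array_equal(rebuilt, routing)
```

The last block holds a single row here, which is how a partially filled block would appear under these assumptions.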
@lmcafee-nvidia lmcafee-nvidia force-pushed the prefix-caching-router-record branch from 69a0403 to 2d90296 Compare March 19, 2026 19:26