[DBO] Compile non-dbo cudagraphs for shapes that are close to dbo_decode_token_threshold #27771

SageMoore · 2025-10-29T23:04:23Z

Purpose

When num_tokens is near dbo_decode_token_threshold, different ranks may make different microbatching decisions (some above the threshold, some below). DBO only works if all ranks agree, so in these mixed cases every rank falls back to non-DBO execution. To avoid running those steps without cudagraphs, this PR adds logic to compile cudagraphs for both microbatching modes.
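The agreement rule described above can be sketched roughly as follows. This is an illustration, not the code in this PR: the function names (`rank_wants_dbo`, `resolve_dbo`, `select_graph_key`) and the `(num_tokens, use_dbo)` graph key are hypothetical, and the `>=` comparison against the threshold is an assumption.

```python
from typing import Sequence, Tuple


def rank_wants_dbo(num_tokens: int, threshold: int) -> bool:
    # Assumption: a rank prefers DBO when its token count is at or above
    # the threshold. The real comparison may differ.
    return num_tokens >= threshold


def resolve_dbo(tokens_per_rank: Sequence[int], threshold: int) -> bool:
    # DBO is used only when every rank independently prefers it.
    # A single dissenting rank forces all ranks onto the non-DBO path.
    return all(rank_wants_dbo(n, threshold) for n in tokens_per_rank)


def select_graph_key(
    local_num_tokens: int, tokens_per_rank: Sequence[int], threshold: int
) -> Tuple[int, bool]:
    # Hypothetical lookup key: (local shape, whether DBO is active).
    # A rank above the threshold can still end up with use_dbo=False,
    # which is why a non-DBO graph is needed for those shapes too.
    return (local_num_tokens, resolve_dbo(tokens_per_rank, threshold))


if __name__ == "__main__":
    # Rank 0 has 30 tokens and would pick DBO, but rank 1 has only 25.
    print(select_graph_key(30, [30, 25], threshold=26))  # (30, False)
```

In this sketch the rank holding 30 tokens looks up a non-DBO graph for a shape that is above the threshold, which is the case the extra captured graphs are meant to cover.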

Size before
Graph capturing finished in 33 secs, took 2.46 GiB

Size after
Graph capturing finished in 35 secs, took 2.52 GiB

Test Plan

To test, I ran lm_eval on DeepSeek V2 Lite with DP=2 and dbo-decode-threshold=26. Ranks usually receive 25-30 tokens in this scenario, so a threshold of 26 ensures that some ranks land above it and some below, triggering the mixed-decision case. I added logging to the code and verified that we now run with non-DBO cudagraphs when one rank has 25 tokens. The lm_eval results are included below.
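To see why a threshold of 26 exercises the mixed case, the following sketch enumerates every per-rank token combination for DP=2 over the 25-30 range mentioned above. It only counts combinations and makes no claim about how often each occurs in practice. The `classify` helper and the `>=` comparison are assumptions made for illustration.

```python
from itertools import product

THRESHOLD = 26               # value used in the test plan
DP_SIZE = 2                  # data-parallel ranks in the test plan
TOKEN_RANGE = range(25, 31)  # 25..30 tokens per rank, inclusive


def classify(tokens_per_rank, threshold=THRESHOLD):
    # Assumption: a rank votes for DBO when it is at or above the threshold.
    votes = [n >= threshold for n in tokens_per_rank]
    if all(votes):
        return "dbo"
    if not any(votes):
        return "non-dbo"
    return "mixed"


counts = {"dbo": 0, "non-dbo": 0, "mixed": 0}
for combo in product(TOKEN_RANGE, repeat=DP_SIZE):
    counts[classify(combo)] += 1

print(counts)  # {'dbo': 25, 'non-dbo': 1, 'mixed': 10}
```

Of the 36 combinations, 10 are mixed, all of them cases where one rank has exactly 25 tokens and the other has 26 or more.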

Test Result

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.3733|±  |0.0280|
|     |       |strict-match    |     5|exact_match|↑  |0.3700|±  |0.0279|

Signed-off-by: Sage Moore <sage@neuralmagic.com>

SageMoore added 2 commits October 29, 2025 22:55

init

c4f3f12

Signed-off-by: Sage Moore <sage@neuralmagic.com>

remove log

e2a9730

Signed-off-by: Sage Moore <sage@neuralmagic.com>

mergify bot added the v1 label Oct 29, 2025

SageMoore added 3 commits October 29, 2025 23:21

refactoring

43e9551

Signed-off-by: Sage Moore <sage@neuralmagic.com>

misc fixes

ea89363

Signed-off-by: Sage Moore <sage@neuralmagic.com>

Merge branch 'main' into sage/non-dbo-cudagraphs

3b34963

SageMoore marked this pull request as ready for review October 30, 2025 18:41

Merge branch 'main' into sage/non-dbo-cudagraphs

d081bde
