
[DO NOT MERGE!!!] Introduce tilesPerWarp parameter, scale preshuffling and bypassingLDS #809

Draft
plognjen wants to merge 3 commits into shared/triton-gfx950-launch from shared/preshuffle-scales-new
@plognjen
No description provided.

int32_t kSize = shape[kDim];
auto warpsPerCTA = mfmaLayout.getWarpsPerCTA();
auto tilesPerWarp = mfmaLayout.getTilesPerWarp();
auto tilePerWarpNonK = tilesPerWarp[kDim];

This tilesPerWarp is for mfmaLayout, which does not have a "kDim". Can you be more specific here?

auto regs = mlir::triton::identityStandardND(kRegister, {1, 1}, order);
LinearLayout lanes = LinearLayout::empty();
auto kDim = dotOperandIdx == 0 ? rank - 1 : rank - 2;
auto tilePerWarpNonK = tilesPerWarp[kDim];

Here we are dealing with operands, which have both a kDim and a nonKDim. If you need nonKDim, you should use tilesPerWarp[nonKDim].
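A minimal sketch of the distinction raised in this comment. It assumes the operand is laid out as [.., M, K] for operand 0 and [.., K, N] for operand 1, which matches the `kDim` expression in the snippet under review, and that `tilesPerWarp` is indexed by the same dimensions as the operand shape. `OperandDims`, `getOperandDims`, and `tilesPerWarpNonK` are hypothetical names used only for illustration; they are not part of the PR or of Triton.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the PR): indices of the K and non-K
// dimensions of a dot operand.
struct OperandDims {
  unsigned kDim;
  unsigned nonKDim;
};

inline OperandDims getOperandDims(unsigned dotOperandIdx, unsigned rank) {
  assert(rank >= 2 && dotOperandIdx <= 1);
  // Assumption: operand 0 (A) is [.., M, K]; operand 1 (B) is [.., K, N].
  unsigned kDim = dotOperandIdx == 0 ? rank - 1 : rank - 2;
  unsigned nonKDim = dotOperandIdx == 0 ? rank - 2 : rank - 1;
  return {kDim, nonKDim};
}

// Hypothetical helper: pick the tiles-per-warp entry along the non-K
// dimension, i.e. index with nonKDim rather than kDim.
inline unsigned tilesPerWarpNonK(const std::vector<unsigned> &tilesPerWarp,
                                 unsigned dotOperandIdx) {
  unsigned rank = static_cast<unsigned>(tilesPerWarp.size());
  OperandDims dims = getOperandDims(dotOperandIdx, rank);
  return tilesPerWarp[dims.nonKDim];
}
```

With `tilesPerWarp = {2, 4}`, operand 0 would read the entry at index 0 and operand 1 the entry at index 1, whereas indexing with `kDim` would return the opposite entries.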

@antiagainst force-pushed the shared/triton-gfx950-launch branch from 77c00fa to a259f0a on May 26, 2025 17:58
@plognjen force-pushed the shared/preshuffle-scales-new branch from f259473 to 7edf97c on May 26, 2025 19:41
zhanglx13 and others added 2 commits May 27, 2025 07:53
- Only swap elems in getValuesFromDotOperandLayoutStruct if preshuffle env var is set
- Only set tilesPerWarp = [2, 2] for scaledDot with mxfp4 case
3 participants