Support alternate 2D linear checkpoint layouts #166

Open

lesj0610 wants to merge 1 commit into turboderp-org:dev from lesj0610:fix/checkpoint-layout-2d-linears
Conversation

@lesj0610 lesj0610 commented Mar 7, 2026

This is a small compatibility improvement for architecture-preserving checkpoint format variation. The goal is not to support arbitrary model-specific rewrites, but to make linear weight loading more robust when modified checkpoints keep the same module graph while changing storage layout.

Some modified checkpoints keep the same architecture but store 2D linear weights in the opposite orientation. This can happen after resaves, merges, abliterations, repacks, or other tooling that preserves the module graph but rewrites tensor storage.

The loader keeps the configured orientation by default and flips only when that orientation does not fit the expected padded shape while the transposed orientation does.
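The flip decision described above can be sketched as follows. This is a minimal, hypothetical illustration (the function name and the use of `<=` comparisons to allow for padding are assumptions, not the PR's actual code): keep the configured orientation whenever it fits the expected padded shape, and flip only when it does not fit but its transpose does.

```python
def resolve_2d_orientation(stored: tuple[int, int],
                           expected: tuple[int, int]) -> bool:
    """Hypothetical sketch of the flip decision for a 2D linear weight.

    Returns False (keep the configured orientation) when the stored shape
    fits within the expected padded shape, and True (flip) only when it
    does not fit but its transpose does. Raises if neither orientation fits.
    """
    rows, cols = expected
    if stored[0] <= rows and stored[1] <= cols:
        return False  # configured orientation fits: keep as-is
    if stored[1] <= rows and stored[0] <= cols:
        return True   # only the transpose fits: flip
    raise ValueError(f"shape {stored} fits neither orientation of {expected}")
```

Note that when both orientations would fit (e.g. a square weight), the first check wins, which preserves the existing behavior for checkpoints that already match the configured orientation.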

The change is limited to the 2D linear loads used by the existing FP16 path. Fused tensors, and checkpoints that already match the configured orientation, keep the existing behavior.

The commit adds focused tests for both the flip and no-flip cases.
