Non-record: XSA-all + mHC + Full QAT (val_bpb=1.1211)#928

Open
autocode-rayes wants to merge 1 commit into openai:main from autocode-rayes:mhc-xsa-all-fullqat

@autocode-rayes

Three changes on top of the PR #549 stack:

  • XSA on all 11 layers (was last 4)
  • Manifold-constrained hyper-connections (22 extra params)
  • Full-training QAT (LATE_QAT_THRESHOLD=1.0)
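
For readers unfamiliar with the third item, here is a minimal sketch of what "full-training QAT" could mean. It is not the code from this PR. It assumes, hypothetically, that `LATE_QAT_THRESHOLD` is compared against a learning-rate scale that starts at 1.0 and decays, so that a threshold of 1.0 turns fake quantization on from the first step. The helper names `fake_quantize` and `qat_active`, the bit width, and the gating rule are all illustrative assumptions.

```python
def fake_quantize(weights, bits=8):
    """Snap floats to a symmetric signed integer grid, then map back to floats.

    Illustrative only: `bits` is a hypothetical default, not the PR's setting.
    """
    qmax = 2 ** (bits - 1) - 1            # largest positive integer level
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)              # nothing to scale
    scale = max_abs / qmax                # one scale for the whole tensor
    return [round(w / scale) * scale for w in weights]


def qat_active(lr_scale, threshold):
    """Assumed gate: QAT turns on once the LR scale is at or below the threshold."""
    return lr_scale <= threshold


# Under this assumed rule, threshold=1.0 enables QAT at lr_scale=1.0 (step 0),
# while a smaller threshold would delay it until late in the LR decay.
full_training = qat_active(lr_scale=1.0, threshold=1.0)
late_only = qat_active(lr_scale=1.0, threshold=0.15)
```

In an actual training loop the forward pass would use the quantized weights while gradients flow to the full-precision copies (a straight-through estimator); this sketch leaves out autograd entirely.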

Seed 1337: sliding_window=1.1229, legal_ttt=1.1211
Artifact: 15.95 MB, 8xH100 SXM, 600s train + 482s eval

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>