Turbo-Muon + EngramLite + ParamBanking + GPTQ Reserve Opt — val_bpb 1.1126 (3-seed mean)#1169

Open
Bortlesboat wants to merge 1 commit into openai:main from Bortlesboat:submission/v18-turbomuon-fused-1.1126

Conversation

@Bortlesboat

Summary

  • val_bpb: 1.1126 (3-seed mean, std 0.0003)
  • Artifact: ~15.98 MB (all seeds under 16,000,000 bytes)
  • Eval time: ~120s (no TTT, sliding window stride=64; evaluation sketch below)
  • Built on PR #1089 by @mikeapedia
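
For context, a minimal sketch of the causal sliding-window scoring behind the val_bpb numbers (stride=64, every token scored exactly once, standard F.cross_entropy). The model call signature, context length, and total_bytes bookkeeping are illustrative assumptions, not the submission's actual eval code:

```python
import math
import torch
import torch.nn.functional as F

@torch.no_grad()
def sliding_window_bpb(model, tokens, total_bytes, ctx_len=1024, stride=64):
    """Score every token exactly once: each window predicts only its last
    `stride` targets, with up to ctx_len-1 tokens of causal left context.
    (ctx_len, total_bytes, and the model's call signature are assumptions.)"""
    n = tokens.numel()
    nll_sum = 0.0
    prev_end = 1                        # index of the first unscored target
    end = min(ctx_len, n)
    while True:
        begin = max(0, end - ctx_len)
        window = tokens[begin:end].unsqueeze(0)    # (1, L)
        logits = model(window[:, :-1])             # (1, L-1, vocab)
        targets = window[:, 1:]                    # (1, L-1)
        n_new = end - prev_end                     # targets not yet scored
        nll_sum += F.cross_entropy(
            logits[0, -n_new:], targets[0, -n_new:], reduction="sum"
        ).item()
        if end == n:
            break
        prev_end = end
        end = min(end + stride, n)
    # convert summed nats over tokens to bits per byte of the raw validation text
    return nll_sum / math.log(2) / total_bytes
```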

3-Seed Results

Seed   Sliding BPB   val_loss (nats)   Artifact (bytes)
1337   1.1126        1.87857           15,981,856
42     1.1123        1.87803           15,984,349
999    1.1129        1.87900           15,985,912
Mean   1.1126        1.87853           —

vs merged SOTA (PR #549, 1.89002 nats): -0.01149 nats. Note: open PRs #1089 (1.1091) and #1105 (1.1138) achieve better scores.

What's New vs PR #1089

  1. GPTQ Reserve Optimization: Reduced the calibration reserve from 14s to 9s (measured calibration ~8.4s), recovering ~55 extra training steps. A budget sketch follows the Compliance list below.
  2. Experimental fused Triton MLP kernel: Forward-only fusion via torch.library.triton_op with a standard PyTorch backward (registration pattern sketched after this list). Hard-disabled in this submission: it produces NaN on PyTorch 2.9 due to a TTIR analysis bug, so it is included only as experimental code for future work.
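
The fused MLP kernel itself stays disabled, so as illustration only, here is a minimal sketch of the registration pattern item 2 describes: a Triton forward registered through torch.library.triton_op, paired with a plain eager-PyTorch backward via torch.library.register_autograd. The op namespace, the quick-GELU activation, and the block size are placeholder assumptions, and it presumes a PyTorch version where triton_op/wrap_triton are available (>= 2.6):

```python
import torch
import triton
import triton.language as tl
from torch.library import triton_op, wrap_triton

@triton.jit
def _quick_gelu_kernel(x_ptr, out_ptr, n_elements, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offs = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offs < n_elements
    x = tl.load(x_ptr + offs, mask=mask)
    # quick-GELU: x * sigmoid(1.702 * x), used here only as a small stand-in
    tl.store(out_ptr + offs, x * tl.sigmoid(1.702 * x), mask=mask)

# Forward-only Triton op; the "nanogpt_ext" namespace is a placeholder.
@triton_op("nanogpt_ext::quick_gelu", mutates_args={})
def quick_gelu(x: torch.Tensor) -> torch.Tensor:
    x = x.contiguous()
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)
    wrap_triton(_quick_gelu_kernel)[grid](x, out, n, BLOCK=1024)
    return out

# Standard PyTorch backward (no Triton), matching the "forward-only fusion
# with standard PyTorch backward" approach described above.
def _setup_context(ctx, inputs, output):
    (x,) = inputs
    ctx.save_for_backward(x)

def _backward(ctx, grad_out):
    (x,) = ctx.saved_tensors
    s = torch.sigmoid(1.702 * x)
    return grad_out * (s + 1.702 * x * s * (1.0 - s))

torch.library.register_autograd(
    "nanogpt_ext::quick_gelu", _backward, setup_context=_setup_context
)

# usage (assumes a CUDA device with Triton available):
# y = quick_gelu(torch.randn(8, 4096, device="cuda", requires_grad=True))
```

Keeping the backward in eager PyTorch is what makes it easy to hard-disable the Triton path and fall back to the reference MLP when the TTIR bug triggers.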

Compliance

  • Standard F.cross_entropy scoring
  • No TTT, no eval-time training data access
  • Artifact < 16,000,000 bytes (all 3 seeds)
  • Training < 600s, eval < 600s
  • Causal sliding-window evaluation (stride=64)
  • 3-seed verification: -0.01149 nats vs merged SOTA (> 0.005 threshold)
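
To make the "Training < 600s" budget and the 9s GPTQ reserve from item 1 concrete, a minimal timing sketch; train_step and run_gptq_calibration are hypothetical placeholders, not the submission's functions:

```python
import time

TRAIN_BUDGET_S = 600.0   # wall-clock training budget from the rules
GPTQ_RESERVE_S = 9.0     # calibration reserve (was 14.0 in PR #1089)

def train_with_gptq_reserve(model, train_step, run_gptq_calibration):
    """Stop training early enough that GPTQ calibration (~8.4s measured)
    fits inside the reserved 9s, keeping the total under 600s."""
    start = time.perf_counter()
    cutoff = TRAIN_BUDGET_S - GPTQ_RESERVE_S   # 591s of pure training
    steps = 0
    while time.perf_counter() - start < cutoff:
        train_step(model)
        steps += 1
    run_gptq_calibration(model)
    return steps
```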

Credits

  • Built on PR #1089 by @mikeapedia.

Commit message:

3-seed results: 1.1126/1.1123/1.1129 (mean 1.1126, std 0.0003)
Built on PR openai#1089 with GPTQ reserve optimization (14s to 9s).
Includes experimental fused Triton MLP kernel (hard-disabled).