Record: 0.9623 BPB — 7-Gram Entropy Cache + XSA-all + EBLS by Robby955 · Pull Request #777 · openai/parameter-golf

Robby955 · 2026-03-25T22:02:26Z

Summary

val_bpb: 0.9623 (3-seed mean, std 0.0009)
Seeds: 1337 (0.9614), 2025 (0.9624), 2024 (0.9631)
All artifacts under 16MB (~15.87 MB)
8×H100 SXM, ~560s training + ~300s eval
Script: 1,420 lines
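
As a quick sanity check on the headline numbers, the three per-seed scores can be re-aggregated. This is a minimal sketch using only the values quoted above; the variable names are illustrative and not taken from the submission script.

```python
import statistics

# Per-seed val_bpb values as reported in the summary.
seed_bpb = {1337: 0.9614, 2025: 0.9624, 2024: 0.9631}

values = list(seed_bpb.values())
mean_bpb = statistics.mean(values)
sample_std = statistics.stdev(values)       # n - 1 denominator
population_std = statistics.pstdev(values)  # n denominator

print(f"mean={mean_bpb:.4f} sample_std={sample_std:.4f} population_std={population_std:.4f}")
```

The mean rounds to 0.9623, and the reported std of 0.0009 matches the sample (n − 1) estimator; the population estimator would round to 0.0007.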

Technique

7-gram entropy-adaptive causal cache (PPM variant) blended with neural model during sliding-window eval:

Hash-table n-gram backoff (orders 2→7, 4M buckets each)
Entropy-adaptive alpha: α = 0.05 + 0.55·σ(2·(H − 4.0)) — trust cache more when model is uncertain
Strictly backward-looking: cache updated only after each token is scored
No oracle/min(NLL) selection — single blended prediction per token
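
The alpha schedule above can be sketched in a few lines. This is an illustration under stated assumptions, not the submission's code: the entropy is computed in nats (the PR does not state the unit behind the 4.0 threshold), the blend is assumed to be a linear interpolation of probabilities, and the function names are hypothetical.

```python
import math

def entropy_nats(probs):
    """Shannon entropy of the model's next-token distribution (nats; unit is an assumption)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def adaptive_alpha(entropy):
    """alpha = 0.05 + 0.55 * sigmoid(2 * (H - 4.0)), as quoted in the PR description."""
    return 0.05 + 0.55 * sigmoid(2.0 * (entropy - 4.0))

def blend(p_model, p_cache, alpha):
    """Single blended probability for the scored token (linear mix is an assumption)."""
    return (1.0 - alpha) * p_model + alpha * p_cache

# A flat distribution (uncertain model) versus a peaked one (confident model).
flat = [1.0 / 1024] * 1024
peaked = [0.97] + [0.03 / 1023] * 1023
for name, dist in (("flat", flat), ("peaked", peaked)):
    h = entropy_nats(dist)
    print(name, round(h, 3), round(adaptive_alpha(h), 3))
```

By construction alpha stays between 0.05 and 0.60 and equals 0.325 at H = 4.0, so the cache never fully overrides the neural model.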

Training stack: EBLS layer sharing (3 shared blocks × 3 loops), LoRA rank 8, XSA-all(11), LeakyReLU(0.5)², val-calibrated GPTQ int6, LZMA compression.
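
To make the last two items of the stack concrete, here is a rough sketch of a quantize-then-compress size check. It deliberately swaps in plain round-to-nearest quantization, which is not GPTQ, and stores one code per byte rather than bit-packing; the weight shapes, the per-row scaling, and all names are assumptions for illustration only.

```python
import lzma
import random

SIZE_LIMIT_BYTES = 16_000_000  # artifact cap quoted in the compliance list

def quantize_int6(row):
    """Symmetric per-row round-to-nearest onto integer codes in [-31, 31] (not GPTQ)."""
    max_abs = max(abs(w) for w in row) or 1.0
    scale = max_abs / 31.0
    codes = [max(-31, min(31, round(w / scale))) for w in row]
    return codes, scale

def dequantize(codes, scale):
    return [c * scale for c in codes]

# Hypothetical stand-in for a weight matrix: 64 rows of 256 Gaussian weights.
random.seed(0)
weights = [[random.gauss(0.0, 0.02) for _ in range(256)] for _ in range(64)]

payload = bytearray()
for row in weights:
    codes, scale = quantize_int6(row)
    payload.extend(c + 31 for c in codes)  # shift to 0..62 so each code fits in one byte

# Per-row scales are omitted from the payload to keep the sketch short.
compressed = lzma.compress(bytes(payload), preset=9)
print(len(payload), len(compressed), len(compressed) < SIZE_LIMIT_BYTES)
```

Because the codes use only 63 of 256 byte values and cluster around zero, the entropy coder inside LZMA recovers much of the space that the unpacked one-byte-per-code layout wastes.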

Compliance

Training within 600s wall-clock (560s used)
All artifacts < 16,000,000 bytes
Script < 1,500 lines (1,420)
No TTT on validation data
No training data at eval time
No min(NLL) oracle — single prediction per token
Cache is strictly causal (backward-looking only)
GPTQ calibration within training window on val data
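
The causality and single-prediction claims can be illustrated with a score-then-update loop. This is a sketch under assumptions, not the submitted implementation: it uses Python dicts keyed by hashed bucket instead of preallocated tables, takes the longest matching order as the backoff rule, uses a fixed alpha in place of the entropy-adaptive one, and treats `model_prob` as a hypothetical callable standing in for the neural model.

```python
from collections import defaultdict

MIN_ORDER, MAX_ORDER = 2, 7
NUM_BUCKETS = 1 << 22  # 4,194,304 buckets per order, standing in for "4M"

class BackoffCache:
    def __init__(self):
        # One table per n-gram order: bucket -> {next_token: count}.
        self.tables = {n: defaultdict(dict) for n in range(MIN_ORDER, MAX_ORDER + 1)}

    @staticmethod
    def _bucket(context):
        return hash(context) % NUM_BUCKETS

    def predict(self, history, token):
        """Probability of `token` from the longest order with a seen context, else None."""
        for n in range(MAX_ORDER, MIN_ORDER - 1, -1):
            if len(history) < n - 1:
                continue
            counts = self.tables[n].get(self._bucket(tuple(history[-(n - 1):])))
            if counts:
                return counts.get(token, 0) / sum(counts.values())
        return None

    def update(self, history, token):
        """Record `token` under every order whose context fits in `history`."""
        for n in range(MIN_ORDER, MAX_ORDER + 1):
            if len(history) >= n - 1:
                counts = self.tables[n][self._bucket(tuple(history[-(n - 1):]))]
                counts[token] = counts.get(token, 0) + 1

def score_stream(tokens, model_prob, alpha=0.3):
    """Score each token with one blended probability, then update the cache."""
    cache, history, probs = BackoffCache(), [], []
    for tok in tokens:
        p_model = model_prob(history, tok)
        p_cache = cache.predict(history, tok)  # cache has seen only earlier tokens
        p = p_model if p_cache is None else (1 - alpha) * p_model + alpha * p_cache
        probs.append(p)
        cache.update(history, tok)             # update happens only after scoring
        history.append(tok)
    return probs

# Toy usage: a uniform "model" over 4 tokens and a repeating stream.
probs = score_stream([1, 2, 3, 1, 2, 3, 1, 2, 3], lambda history, tok: 0.25)
print([round(p, 3) for p in probs])
```

In the toy stream the first pass over `1, 2, 3` gets no help from the cache, while later repeats are scored higher, which is the document-local effect the cache is meant to exploit.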

Score Decomposition

Stage	BPB
Pre-quant (fp32)	~1.14
Post-quant (int6)	~1.14
+ Sliding window	~1.14
+ 7-gram entropy cache	~0.96

The ~0.18 BPB gain from the cache (~1.14 → ~0.96) comes from document-local regularities, such as repeated phrases and consistent terminology, that the neural model's fixed context window handles imperfectly.
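
For readers unfamiliar with the metric, bits per byte follows from the summed negative log-likelihood by the standard conversion below. The helper name and the example totals are hypothetical; the contest's exact byte-counting rules are not shown in this PR.

```python
import math

def bits_per_byte(total_nll_nats, total_bytes):
    """Convert summed token NLL (in nats) over a text into bits per byte of that text."""
    return total_nll_nats / (math.log(2) * total_bytes)

# Hypothetical totals: 1,000,000 bytes of text scored at 667,000 nats in total.
example_bpb = bits_per_byte(667_000.0, 1_000_000)
print(round(example_bpb, 4))

# Gap between the last two rows of the decomposition table (both values approximate).
cache_gain = 1.14 - 0.96
print(round(cache_gain, 2))
```

Because the denominator counts bytes rather than tokens, the metric stays comparable across tokenizers.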

Credits

N-gram cache: PR #715 "Record: XSA-all + LeakyReLU² + VR + GA + 7-gram cache (val_bpb=1.0337)" and PR #727 "Record: First Legal Sub-1.0 BPB — Multi-order N-gram Backoff + Entropy-Adaptive Alpha (val_bpb=0.9674, 3-seed)"
Entropy-adaptive alpha: PR #727 (title as above), suggested by valerio-oai on PR #659 "Record: 5-gram Eval Cache + LeakyReLU² + Parallel Muon val_bpb: 1.0920 (3-seed mean, std 0.0007) | ~15.9 MB | 8×H100 SXM"
XSA-all: PR #634 "Record: 11L XSA-all + Full GPTQ (Budget-Legal) + Parallel Muon + Selective Pruning (val_bpb: 1.1178, 3-seed mean)" by @raahilshah
LeakyReLU²: PR #493 "Record: 11L EMA + Int6 + XSA + LeakyReLU² + Partial RoPE (val_bpb: 1.1309)" by @parinzee
Base model: PR #414 "Record: 11L EMA + GPTQ-lite + warmdown3500 + QAT@0.15 (val_bpb=1.1233)" by @signalrush

🤖 Generated with Claude Code

Robby955 · 2026-03-27T20:56:43Z

Superseded by neural-track work.

Record Submission: 0.9623 BPB - 7-Gram Entropy-Adaptive Cache + XSA-a…

2e3d0cc

…ll + EBLS 3-seed mean: 0.9623 (std 0.0009) Seeds: 1337 (0.9614), 2025 (0.9624), 2024 (0.9631) All artifacts under 16MB. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

notapplica mentioned this pull request Mar 25, 2026

⛳ Parameter Golf Live AI Commentary ⛳ + Analysis / Ideas | every 10 minutes #140

Open

Robby955 closed this Mar 27, 2026

Robby955 wants to merge 1 commit into openai:main from Robby955:record/7gram-entropy-ebls-0.9623
