Non-record: 11L Int5 QAT + Score-First TTT — val_bpb 1.1356 (15.60 MiB) #1041

Closed
JoeProAI wants to merge 1 commit into openai:main from JoeProAI:submission/joeproai-11l-int5-ttt-1.1356-seed314

11L U-Net + Int5 QAT + Score-First Legal TTT

val_bpb: 1.13557402 | 15.60 MiB (16,361,752 bytes) | 8xH100 (~33 min) | seed 314

Same architecture and config as PR #861 (seed 42, 1.13256182). Independent seed validation run.
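The headline figures can be cross-checked with a few lines of arithmetic. The sketch below only re-derives the MiB figure from the byte count and the gap between the two seeds' reported val_bpb values; the constant and function names are illustrative and do not come from the submission's scripts.

```python
# Figures copied from the summary above; names are illustrative only.
ARTIFACT_BYTES = 16_361_752
VAL_BPB_SEED_314 = 1.13557402   # this run
VAL_BPB_SEED_42 = 1.13256182    # reported for PR #861

MIB = 2 ** 20  # bytes per mebibyte


def to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return num_bytes / MIB


artifact_mib = to_mib(ARTIFACT_BYTES)
seed_gap = VAL_BPB_SEED_314 - VAL_BPB_SEED_42

print(f"artifact size: {artifact_mib:.2f} MiB")
print(f"seed 314 minus seed 42: {seed_gap:+.5f} bpb")
```

With only two seeds the gap is a single data point, not an estimate of seed variance.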

Architecture

| Param | Value |
|---|---|
| Layers | 11 |
| Model dim | 512 |
| Heads | 8 |
| MLP hidden | 1536 |
| Bigram buckets | 4096 |
| Bigram embed dim | 128 |
| Vocab size | 256 |
| Tie embeddings | false |
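As a reading aid, the table can be written as a small config object with a few derived quantities. This is a sketch, not the submission's code: the field names are hypothetical, `head_dim` assumes the model dimension is split evenly across heads, and `bigram_table_params` assumes the bigram embedding is a plain buckets-by-dim lookup table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    # Values from the Architecture table; field names are hypothetical.
    num_layers: int = 11
    model_dim: int = 512
    num_heads: int = 8
    mlp_hidden: int = 1536
    bigram_buckets: int = 4096
    bigram_embed_dim: int = 128
    vocab_size: int = 256
    tie_embeddings: bool = False

    @property
    def head_dim(self) -> int:
        # Assumes standard multi-head attention with equal-width heads.
        if self.model_dim % self.num_heads != 0:
            raise ValueError("model_dim must be divisible by num_heads")
        return self.model_dim // self.num_heads

    @property
    def mlp_ratio(self) -> float:
        # Hidden width of the MLP relative to the model dimension.
        return self.mlp_hidden / self.model_dim

    @property
    def bigram_table_params(self) -> int:
        # Assumes one embedding row per hash bucket.
        return self.bigram_buckets * self.bigram_embed_dim


cfg = ModelConfig()
print(cfg.head_dim, cfg.mlp_ratio, cfg.bigram_table_params)
```

Under these assumptions each head is 64 wide and the MLP expands the model dimension by a factor of 3.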

Rule Compliance

  • Score-first TTT: tokens scored under inference_mode() before training on them
  • No val tokens used in artifact or training
  • No pre-eval adaptation
  • Artifact: 15.60 MiB (under 16 MiB limit)
  • Training time: ~1999s (~33 min on 8xH100); eval stays under the 600s eval budget
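The score-first rule in the first bullet amounts to a strict ordering: each chunk is scored with the model frozen, and only afterwards is the model allowed to adapt on it. The sketch below shows that ordering with a toy stand-in, a Laplace-smoothed byte-frequency model, instead of the PR's network; `ByteUnigram`, `score_first_bpb`, and `chunk_size` are hypothetical names, and in the actual submission the scoring step runs under `inference_mode()`.

```python
import math
from collections import Counter


class ByteUnigram:
    """Toy adaptive model: Laplace-smoothed byte frequencies (vocab 256)."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()
        self.total = 0

    def bits(self, byte: int) -> float:
        # Code length of one byte under the current (frozen) counts.
        p = (self.counts[byte] + 1) / (self.total + 256)
        return -math.log2(p)

    def update(self, chunk: bytes) -> None:
        # "Training" step: absorb the chunk's byte counts.
        self.counts.update(chunk)
        self.total += len(chunk)


def score_first_bpb(data: bytes, chunk_size: int = 64) -> float:
    """Bits per byte where every chunk is scored before it is trained on."""
    if not data:
        raise ValueError("data must be non-empty")
    model = ByteUnigram()
    total_bits = 0.0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # 1) Score first: the model has not seen this chunk yet.
        total_bits += sum(model.bits(b) for b in chunk)
        # 2) Adapt afterwards, so only later chunks can benefit.
        model.update(chunk)
    return total_bits / len(data)


print(score_first_bpb(b"abracadabra" * 20, chunk_size=32))
```

Swapping the two numbered steps would let the model see each chunk before scoring it, which is exactly what the rule forbids.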

Train script, submission.json, and run_training.sh included.

JoeProAI closed this Mar 28, 2026