Tandem curriculum fix — seed 123 rerun#1689

Open
tcapelle wants to merge 2 commits into noam from noam-r23/tandem-curr-fix-seed123
@tcapelle tcapelle commented Mar 20, 2026

Hypothesis

The tandem curriculum fix (PR #1674) achieved val_loss=0.8332, only +0.0006 above the baseline of 0.8326. This run adds a third seed point to help establish the distribution of outcomes for the fix: beyond the original (default-seed) run and seed 42, we need at least one more data point to distinguish signal from noise.

If 2 out of 3 seeds beat baseline, we merge. If all 3 cluster near baseline, the fix is neutral-to-slightly-positive and we still merge for correctness. If all 3 regress, we close.
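
The merge rule above can be written down as a small function. This is only a sketch: the name `merge_decision`, the `near_tol` threshold for "near baseline", and the "undecided" fallback are assumptions that the rule does not spell out, and the values in the usage lines are synthetic, not results from any run.

```python
from typing import Sequence


def merge_decision(val_losses: Sequence[float], baseline: float,
                   near_tol: float = 0.001) -> str:
    """Apply the merge rule to a set of per-seed val_loss values.

    near_tol is an assumed threshold for "clusters near baseline".
    """
    # A seed "beats" baseline when its val_loss is strictly lower.
    beats = sum(v < baseline for v in val_losses)
    all_near = all(abs(v - baseline) <= near_tol for v in val_losses)

    if beats >= 2:
        return "merge"
    if all_near:
        return "merge for correctness"
    if beats == 0:
        return "close"
    # Case not covered by the stated rule (e.g. one win, two clear losses).
    return "undecided"


# Synthetic values, for illustration only.
print(merge_decision([0.99, 0.98, 1.02], baseline=1.0))
print(merge_decision([1.0005, 1.0002, 0.9999], baseline=1.0))
print(merge_decision([1.02, 1.03, 1.01], baseline=1.0))
```

Making the tolerance explicit matters here, because whether a run counts as "near baseline" or "regressed" depends entirely on that number.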

Instructions

Apply the tandem curriculum fix to train.py:

Lines 712-714 — Replace:

if epoch < 10:
    is_tandem_curr = (x[:, :, -8:].abs().sum(dim=(1, 2)) > 0.01)
    sample_mask = (~is_tandem_curr).float()[:, None, None]

With:

if epoch < 10:
    is_tandem_curr = (x[:, 0, 21].abs() > 0.5)
    sample_mask = (~is_tandem_curr).float()[:, None, None]
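
To see how the two detection rules can disagree, here is a small sketch on a synthetic batch. The tensor shape `(3, 4, 24)` and the values written into it are made up for illustration; the real feature layout of `x` in train.py is not described here, and the helper names `old_tandem_mask` and `new_tandem_mask` are hypothetical wrappers around the two expressions.

```python
import torch


def old_tandem_mask(x: torch.Tensor) -> torch.Tensor:
    # Original rule: any non-negligible magnitude in the last 8 features,
    # summed over all nodes of the sample.
    return x[:, :, -8:].abs().sum(dim=(1, 2)) > 0.01


def new_tandem_mask(x: torch.Tensor) -> torch.Tensor:
    # Fixed rule: a single feature (index 21) read at the first node.
    return x[:, 0, 21].abs() > 0.5


# Synthetic batch: 3 samples, 4 nodes, 24 features (assumed shape).
x = torch.zeros(3, 4, 24)
x[1, :, 21] = 1.0   # sample 1: feature 21 set on every node
x[2, 2, 23] = 0.05  # sample 2: small value in a trailing feature only

old = old_tandem_mask(x)
new = new_tandem_mask(x)

# Same masking step as in the snippet above: 1.0 keeps a sample, 0.0 drops it.
sample_mask = (~new).float()[:, None, None]

print(old.tolist(), new.tolist(), sample_mask.flatten().tolist())
```

In this toy batch the old rule flags sample 2 because of a small trailing value, while the new rule does not.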

Add seed at the top of the script (after imports, before any torch calls, around line 43):

torch.manual_seed(123)
torch.cuda.manual_seed_all(123)
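
A slightly more reusable variant is to wrap the seeding in a helper. This is a sketch, not what train.py does: `seed_everything` is a hypothetical name, and seeding Python's `random` module is an extra step beyond the two calls above.

```python
import random

import torch


def seed_everything(seed: int) -> None:
    """Seed Python and PyTorch RNGs (hypothetical helper)."""
    random.seed(seed)
    torch.manual_seed(seed)
    # Safe to call without CUDA; it is ignored when no GPU is available.
    torch.cuda.manual_seed_all(seed)


# Re-seeding with the same value reproduces the same random draws.
seed_everything(123)
a = torch.rand(3)
seed_everything(123)
b = torch.rand(3)
print(torch.equal(a, b))
```

Seeding alone does not guarantee bit-identical training runs on GPU, since some CUDA kernels are nondeterministic.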

Run with --wandb_group noam-r23-tandem-curr-seed123.

Baseline

  • val_loss = 0.8326
  • in_dist surf_p = 17.94
  • ood_cond surf_p = 13.98
  • ood_re surf_p = 27.54
  • tandem surf_p = 36.73
  • Previous run (default seed): val_loss=0.8332, in_dist=17.14

Results

W&B run: 5nf58eb6 (runtime: 32.0 min; state: failed because of a pre-existing visualization crash)

| Split | loss | surf_Ux | surf_Uy | surf_p | vol_Ux | vol_Uy | vol_p |
|---|---|---|---|---|---|---|---|
| val_in_dist | 0.5952 | 6.11 | 1.75 | 18.73 | 0.98 | 0.34 | 19.79 |
| val_ood_cond | 0.7079 | 3.29 | 1.06 | 14.24 | 0.65 | 0.26 | 11.88 |
| val_ood_re | 0.5436 | 2.89 | 0.94 | 27.87 | 0.78 | 0.36 | 46.81 |
| val_tandem_transfer | 1.5866 | 5.58 | 2.28 | 38.23 | 1.72 | 0.80 | 37.07 |
| val_loss (best) | 0.8583 | | | | | | |

mean3_p = (18.73+14.24+38.23)/3 = 23.73

vs baseline: val_loss +0.0257 (+3.1%), mean3_p +0.85 (+3.7%) — worse
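
The arithmetic behind these numbers can be reproduced with a few lines of Python. All values are copied from the Baseline section and the results table; `mean3_p` averages the in_dist, ood_cond and tandem surf_p values, as in the formula above. The last check, that the best val_loss matches the unweighted mean of the four split losses, is an observation from the numbers and not something stated about how train.py computes it.

```python
from statistics import mean

baseline = {"val_loss": 0.8326, "in_dist": 17.94, "ood_cond": 13.98, "tandem": 36.73}
seed123 = {"val_loss": 0.8583, "in_dist": 18.73, "ood_cond": 14.24, "tandem": 38.23}
split_losses = [0.5952, 0.7079, 0.5436, 1.5866]  # loss column of the table


def mean3_p(run: dict) -> float:
    # surf_p averaged over in_dist, ood_cond and tandem (ood_re is left out).
    return mean(run[k] for k in ("in_dist", "ood_cond", "tandem"))


def delta(new: float, old: float) -> tuple:
    # Absolute and relative (percent) change from old to new.
    return new - old, 100.0 * (new - old) / old


loss_abs, loss_pct = delta(seed123["val_loss"], baseline["val_loss"])
p_abs, p_pct = delta(mean3_p(seed123), mean3_p(baseline))

print(f"val_loss: {loss_abs:+.4f} ({loss_pct:+.1f}%)")
print(f"mean3_p:  {p_abs:+.2f} ({p_pct:+.1f}%)")
print(f"mean of split losses: {mean(split_losses):.4f}")
```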

What happened

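Seed 123 gives a much worse result (val_loss 0.8583 vs. the 0.8326 baseline) than the default-seed run (0.8332). That is a large spread between seeds for what should be the same fix: the +0.0257 gap to baseline is roughly 40 times the +0.0006 gap of the default-seed run. The seed 123 result appears well outside the noise range, although with only two runs that range is not yet well characterized.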

Summary of tandem-curr-fix seed sweep:

  • Default seed: 0.8332 (+0.0006, neutral/marginal)
  • Seed 123: 0.8583 (+0.0257, clearly worse)

The high variance between seeds suggests the fix interacts with initialization in unpredictable ways. Seed 123 may produce an initialization that conflicts with the corrected curriculum. One possibility is that the early phase in which tandem samples are masked out (epoch < 10, i.e. epochs 0-9) drives the model, with this seed, into a suboptimal basin that is hard to escape.

Suggested follow-ups

  • Check whether a seed 42 result is available; it would complete the three-seed picture
  • Otherwise, accept that the fix is neutral-to-slightly-positive on the default seed and merge for code correctness only

@tcapelle tcapelle added status:wip Student is working on it student:askeladd Assigned to askeladd noam Noam advisor branch experiments labels Mar 20, 2026
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@tcapelle tcapelle marked this pull request as ready for review March 20, 2026 15:23
@tcapelle tcapelle added status:review Ready for advisor review and removed status:wip Student is working on it labels Mar 20, 2026